WO2020134678A1 - Disaster recovery method, device and system - Google Patents

Disaster recovery method, device and system

Info

Publication number
WO2020134678A1
Authority
WO
WIPO (PCT)
Prior art keywords
virtual machine
disaster recovery
site
production
synchronized
Prior art date
Application number
PCT/CN2019/118577
Other languages
English (en)
French (fr)
Inventor
刘新宇
毛士玲
孙琼华
张文俊
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2020134678A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44: Arrangements for executing specific programs
    • G06F9/455: Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533: Hypervisors; Virtual machine monitors
    • G06F9/45558: Hypervisor-specific management and integration aspects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14: Error detection or correction of the data by redundancy in operation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14: Error detection or correction of the data by redundancy in operation
    • G06F11/1402: Saving, restoring, recovering or retrying
    • G06F11/1446: Point-in-time backing up or restoration of persistent data
    • G06F11/1448: Management of the data involved in backup or backup restore
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44: Arrangements for executing specific programs
    • G06F9/455: Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44: Arrangements for executing specific programs
    • G06F9/455: Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533: Hypervisors; Virtual machine monitors
    • G06F9/45558: Hypervisor-specific management and integration aspects
    • G06F2009/45587: Isolation or security of virtual machine instances

Definitions

  • the invention relates to the field of computer disaster recovery, in particular to a disaster recovery method, device and system.
  • The traditional virtual machine backup method adopted by virtualized cloud platforms performs full and incremental backups of virtual machines within a single data center. Because the backup stays inside one data center, its security is low, and it can no longer meet the requirements for the security and reliability of cloud platform data.
  • the embodiments of the present application provide a disaster recovery method, which is applied to a disaster recovery system.
  • the disaster recovery system includes a production data center and a disaster recovery data center.
  • A production site is created in the production data center, and a disaster recovery site is created in the disaster recovery data center; the method includes: while the production data center is running normally, synchronizing the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site; and when an operation failure of the production data center is detected, switching the first production service of the production site to the disaster recovery site.
  • the embodiments of the present application provide a disaster recovery device, which is applied to a disaster recovery system.
  • the disaster recovery system includes a production data center and a disaster recovery data center.
  • A production site is created in the production data center, and a disaster recovery site is created in the disaster recovery data center;
  • The device includes: a first synchronization module for synchronizing, while the production data center is in a normal operation state, the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site; and a first switching module, configured to switch the first production service of the production site to the disaster recovery site when an operation failure of the production data center is detected.
  • An embodiment of the present application provides a disaster recovery system, including a production data center and a disaster recovery data center; a production site is created in the production data center, and a disaster recovery site is created in the disaster recovery data center; the production data center and the disaster recovery data center each include: a disaster recovery module (DRM) for synchronizing the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site; a resource operation system (iROS) for switching the first production service of the production site to the disaster recovery site when a failure of the production data center is detected; and storage equipment for storing the first configuration information of the first virtual machine.
  • a disaster recovery module (DRM) for synchronizing the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site
  • iROS resource operation system
  • storage equipment for storing the first configuration information of the first virtual machine.
  • an embodiment of the present application provides a disaster recovery device, which is applied to a disaster recovery system.
  • the disaster recovery system includes a production data center and a disaster recovery data center.
  • A production site is created in the production data center, and a disaster recovery site is created in the disaster recovery data center;
  • The equipment includes: a processor; and a memory arranged to store computer-executable instructions that, when executed, cause the processor to: while the production data center is in a normal operation state, synchronize the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site; and when an operation failure of the production data center is detected, switch the first production service of the production site to the disaster recovery site.
  • an embodiment of the present application provides a storage medium for storing computer-executable instructions.
  • When the executable instructions are executed, the following process is implemented: while the production data center is in a normal operation state, the first configuration information of the first virtual machine on the production site is synchronized to the virtual machine to be synchronized on the disaster recovery site; when an operation failure of the production data center is detected, the first production service of the production site is switched to the disaster recovery site.
  • An embodiment of the present application provides a computer program product that includes a computer program stored on a non-transitory computer-readable storage medium; the computer program includes program instructions that, when executed by a computer, cause the computer to perform the method described in the above aspects.
  • FIG. 1 is a schematic flowchart of a disaster recovery method according to an embodiment of the present invention
  • FIG. 2 is a schematic block diagram of a disaster recovery device according to an embodiment of the present invention.
  • FIG. 3 is a schematic block diagram of a disaster recovery system according to an embodiment of the present invention.
  • FIG. 4 is a schematic block diagram of a disaster recovery device according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a disaster recovery system according to an embodiment of the present invention.
  • Embodiments of the present application provide a disaster recovery method, device, and system for implementing a data backup mode under multiple data centers in a virtualized cloud platform, thereby improving the security and reliability of cloud platform data.
  • FIG. 1 is a schematic flowchart of a disaster recovery method according to an embodiment of the present invention.
  • the method is applied to a disaster recovery system.
  • the disaster recovery system includes a production data center and a disaster recovery data center.
  • a production site is created in the production data center.
  • A disaster recovery site is created in the disaster recovery data center.
  • the disaster recovery method includes: S102. Synchronizing the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site when the production data center is in a normal operation state.
  • FIG. 2 shows a disaster recovery system applicable to the disaster recovery method.
  • Using the technical solution of the embodiment of the present invention, the first configuration information of the first virtual machine on the production site can be synchronized to the virtual machine to be synchronized on the disaster recovery site, and when an operation failure of the production data center is detected, the first production service on the production site is switched to the disaster recovery site.
  • The technical solution adopts data synchronization across multiple data centers, that is, the production data center is synchronized to the disaster recovery data center, which improves the security and reliability of the data and meets the security and reliability requirements for cloud platform data.
  • Industries such as banking, insurance and other financial sectors place especially high demands on data security and reliability; because this technical solution improves the security and reliability of cloud platform data, it is well suited to meeting those demands.
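  • The overall flow described above (synchronize while the production data center is healthy, switch the production service over when a failure is detected) can be sketched as follows. This is a minimal illustration only: the `Site` class, the `run_cycle` function and the boolean health flag are hypothetical names introduced here, not part of the patent, and real failure detection is outside the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical stand-in for a production or disaster recovery site."""
    name: str
    vm_configs: dict = field(default_factory=dict)  # VM name -> configuration
    serving: bool = False  # True while this site carries the production service


def run_cycle(production: Site, dr: Site, production_healthy: bool) -> str:
    """One control cycle: synchronize while healthy, switch over on failure."""
    if production_healthy:
        # Normal operation: copy each VM's configuration to the DR site.
        for vm_name, config in production.vm_configs.items():
            dr.vm_configs[vm_name] = dict(config)
        return "synchronized"
    # Failure detected: the DR site takes over the production service.
    production.serving = False
    dr.serving = True
    return "switched"


production = Site("production", {"vm1": {"cpu": 4, "memory_gb": 8}}, serving=True)
dr = Site("disaster-recovery")
first = run_cycle(production, dr, production_healthy=True)
second = run_cycle(production, dr, production_healthy=False)
```

  • Because the configuration was already copied during the healthy cycle, the disaster recovery site holds an up-to-date description of `vm1` at the moment it takes over.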
  • The second configuration information of the disaster recovery virtual machine on the disaster recovery site is reverse-synchronized to the virtual machine to be reverse-synchronized on the production site, and the second production service of the disaster recovery site is then switched to the production site.
  • the specific configuration content includes:
  • iROS Resource Operating System
  • The two sets of iROS are in a primary/standby disaster recovery relationship.
  • The iROS of the production site is the primary.
  • The iROS of the disaster recovery site is the standby.
  • A LUN (Logical Unit Number) device of the same size is allocated on the storage device at the production site and on the storage device at the disaster recovery site, and the two LUN devices are configured in a synchronous or asynchronous replication relationship.
  • LUN: Logical Unit Number
  • VLAN (Virtual Local Area Network) type port group networks are configured for tenants on the associated production and disaster recovery sites.
  • VLAN: Virtual Local Area Network
  • Site management, site pair management and protection group management can each be configured in iROS disaster recovery management. Specifically, this can include the following: a. Create a production site and a disaster recovery site separately in site management; the production iROS and the disaster recovery iROS send the newly added site information to the corresponding DRM (Disaster Recovery Management).
  • DRM: Disaster Recovery Management
  • Information such as the authentication URL, authentication username and online status of each created site can be displayed.
  • information such as the created site pair and network mapping relationship can be displayed.
  • the protection group is the smallest operation unit when the production site and the disaster recovery site are switched in the disaster recovery management.
  • a group of virtual machines in the protection group are switched or switched back at the same time when the site is switched.
  • production iROS and disaster recovery iROS send the protection group creation information to the corresponding DRM.
  • The name of the protection group created is "g2-126-103", and the selected site pair consists of "126-DataCenter-active" and "103-DataCenter-same city disaster recovery".
  • The resource pool of the production site "126-DataCenter-active" is "pool2-Huaxiang computer room" and its storage library is "FC-MASTER1"; the resource pool of the disaster recovery site "103-DataCenter-same city disaster recovery" is "pool2-disaster recovery in the same city" and its storage library is "ibm-fc-s1".
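  • The example protection group above can be written down as a small data structure. The sketch below only restates the names given in the text; the `SiteBinding` and `ProtectionGroup` classes and their field names are hypothetical and do not reflect any actual iROS or DRM schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteBinding:
    """Hypothetical record tying a site to its resource pool and storage library."""
    site: str
    resource_pool: str
    storage_library: str


@dataclass(frozen=True)
class ProtectionGroup:
    """Hypothetical record for a protection group spanning one site pair."""
    name: str
    production: SiteBinding
    disaster_recovery: SiteBinding


group = ProtectionGroup(
    name="g2-126-103",
    production=SiteBinding(
        site="126-DataCenter-active",
        resource_pool="pool2-Huaxiang computer room",
        storage_library="FC-MASTER1",
    ),
    disaster_recovery=SiteBinding(
        site="103-DataCenter-same city disaster recovery",
        resource_pool="pool2-disaster recovery in the same city",
        storage_library="ibm-fc-s1",
    ),
)
```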
  • After the disaster recovery function is enabled on a virtual machine at the production site, the system automatically creates the corresponding disaster recovery virtual machine at the disaster recovery site.
  • the first configuration information of the first virtual machine on the production site is synchronized to the virtual machine to be synchronized on the disaster recovery site.
  • The first configuration information includes resource configuration information and disk configuration information of the first virtual machine. The resource configuration information includes site information, site pair information, protection group information, information on the virtual machines in the protection group, CPU information, memory information, network card information, and the like. The disk configuration information includes disk operation information, disk snapshot information, disk snapshot recovery information, virtual machine clone information, virtual machine backup information, virtual machine backup and recovery information, and the like.
  • DRM periodically (for example, at a preset frequency) compares the virtual machine configuration information of the production site and the disaster recovery site. When it finds that the two are inconsistent, it triggers the disaster recovery site to adjust the resources of the virtual machine to be synchronized according to the resource configuration information of the first virtual machine at the production site. It then monitors whether the resource adjustment at the disaster recovery site succeeded; if not, it triggers the disaster recovery site again to adjust the resources of the virtual machine to be synchronized according to the resource configuration information of the first virtual machine.
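  • The compare, adjust and re-trigger behaviour described above can be sketched as a small reconciliation loop. The function name `reconcile`, the `adjust` callback and the `max_attempts` bound are assumptions made for this illustration; the text does not state how many times an adjustment is retried.

```python
def reconcile(production_cfg: dict, dr_cfg: dict, adjust, max_attempts: int = 3) -> bool:
    """Compare the two configurations; on mismatch trigger `adjust` and
    re-trigger it until the DR configuration matches or attempts run out."""
    if dr_cfg == production_cfg:
        return True  # already consistent, nothing to do
    for _ in range(max_attempts):  # max_attempts is an assumed bound
        adjust(dr_cfg, production_cfg)
        if dr_cfg == production_cfg:  # monitor whether the adjustment succeeded
            return True
    return False


calls = []


def flaky_adjust(dr_cfg: dict, production_cfg: dict) -> None:
    """Hypothetical adjustment that silently fails on its first call."""
    calls.append(1)
    if len(calls) >= 2:
        dr_cfg.clear()
        dr_cfg.update(production_cfg)


prod = {"cpu": 8, "memory_gb": 16}
dr_side = {"cpu": 4, "memory_gb": 16}
ok = reconcile(prod, dr_side, flaky_adjust)
```

  • In a deployment this check would run on a timer at the preset frequency; the sketch shows a single pass.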
  • DRM triggers the disaster recovery site, in real time, to adjust the virtual machine to be synchronized according to the disk configuration information of the first virtual machine at the production site.
  • The disaster recovery site is thereby adjusted to match the production site.
  • The configuration information of the virtual machines on the two sites thus stays consistent, which ensures data consistency during site switching.
  • When synchronizing the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site, the virtual machine to be synchronized must be determined first, and the first configuration information is then synchronized to the determined virtual machine.
  • The steps to determine the virtual machines to be synchronized are as follows: first, obtain the preset virtual machine list of the disaster recovery system and the synchronized virtual machine list corresponding to the first virtual machine; second, update the synchronized virtual machine list according to the preset virtual machine list; third, add the virtual machines in the synchronized virtual machine list to the list of virtual machines to be synchronized, which yields the virtual machines to be synchronized.
  • First, look up the protection group list of the disaster recovery system and determine whether it is empty. Since the protection group list has been configured in the above embodiment, the list found here is not empty. If the protection group list is not empty, initialize the list of virtual machines to be synchronized.
  • Next, traverse the protection groups. For each group, determine whether the status of the production site is "protecting" (that is, whether the protection group is enabled); if so, further determine whether the status of the disaster recovery site is "protecting". If the status of the disaster recovery site is also "protecting", obtain the virtual machine list of the protection group (that is, the preset virtual machine list) and the synchronized virtual machine list corresponding to the first virtual machine at the production site, update the synchronized virtual machine list according to the virtual machine list of the protection group, and add the records in the synchronized virtual machine list to the list of virtual machines to be synchronized.
  • Updating the synchronized virtual machine list may involve the following three cases: 1. A virtual machine that exists in the virtual machine list of the protection group (that is, the preset virtual machine list) but not in the synchronized virtual machine list is added to the synchronized virtual machine list; 2. A virtual machine that does not exist in the virtual machine list of the protection group but exists in the synchronized virtual machine list is deleted from the synchronized virtual machine list; 3. No operation is performed on a virtual machine that exists in both lists.
  • The update of the synchronized virtual machine list determines the current synchronization status. For example, if a virtual machine is added to the synchronized virtual machine list, the current synchronization status is "newly added"; if a virtual machine is deleted from the synchronized virtual machine list, the current synchronization status is "to be deleted" or "deleting"; if the update operation is completed, the current synchronization status is "last task successful"; and so on.
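  • The three update cases and the resulting synchronization statuses can be sketched as below. The function name and the dictionary representation (virtual machine name mapped to status) are assumptions for illustration. The sketch also assumes that a virtual machine dropped from the protection group is marked "to be deleted" rather than removed from the record immediately, so that a later pass can delete it at the destination site.

```python
def update_synced_list(protection_group_vms: set, synced: dict) -> dict:
    """Return an updated copy of `synced` (VM name -> synchronization status)."""
    updated = dict(synced)
    # Case 1: in the protection group but not yet tracked -> add it.
    for vm in protection_group_vms:
        if vm not in updated:
            updated[vm] = "newly added"
    # Case 2: tracked but no longer in the protection group -> mark for deletion.
    for vm in synced:
        if vm not in protection_group_vms:
            updated[vm] = "to be deleted"
    # Case 3: present in both lists -> left untouched.
    return updated


group_vms = {"vm-a", "vm-b"}
synced_before = {"vm-b": "last task successful", "vm-c": "no change"}
synced_after = update_synced_list(group_vms, synced_before)
```

  • Returning a copy keeps the previous state available for comparison; an implementation that mutates the list in place would behave the same for the three cases.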
  • the first configuration information of the first virtual machine on the production site is synchronized to the determined virtual machine to be synchronized.
  • VMC: Virtualization Management Center
  • If the current synchronization status is "to be deleted", the interface of the destination site is called to delete the virtual machine to be deleted. If the interface returns successfully, the virtual machine is removed from the cache and the current synchronization status is changed to "deleting"; traversal of the list of virtual machines to be synchronized then continues. If the current synchronization status is not "to be deleted", it is further determined whether the current synchronization status is "deleting".
  • If the current synchronization status is "deleting", the interface of the destination site is called to query the information of the virtual machine being deleted. If the virtual machine is found to still exist, the current synchronization status is changed to "to be deleted" and the interface of the destination site is called to delete it. If the query shows that the virtual machine does not exist, the virtual machine has been successfully deleted; the current synchronization status is changed to "deleted", and traversal of the list of virtual machines to be synchronized continues. If the current synchronization status is not "deleting", it is further determined whether the current synchronization status is "last task successful" or "no change".
  • If so, the VMC interface of the current site is called to query the detailed information of the virtual machine, and the queried information is compared with the virtual machine information in the cache of the current site. If they differ, the interface of the destination site is called to adjust the virtual machine resources; when the interface returns successfully, the current synchronization status is changed to "resource adjustment" and traversal of the list of virtual machines to be synchronized continues; if the interface does not return successfully, traversal simply continues. If the queried virtual machine information is the same as the virtual machine information in the cache of the current site, the current synchronization status is changed to "no change", and traversal continues.
  • the interface of the destination site is called to query the virtual machine details and determine whether the queried virtual machine information is the same as the virtual machine information in the cache of the current site.
  • the VMC interface of the current site is called to query the detailed information of the virtual machine, and the virtual machine information of the destination site is compared with the virtual machine information of the current site to determine whether they are consistent; if they are consistent, the current synchronization status is changed to "no change"; if not, the destination site interface is called to adjust the resources and the current synchronization status is changed to "resource adjustment"; traversal of the list of virtual machines to be synchronized then continues.
  • If the current synchronization status is not "resource adjustment", continue to traverse the list of virtual machines to be synchronized. If the current synchronization status is "creating", the VMC interface of the current site is called to query the details of the virtual machine, and the destination site interface is then called to create the virtual machine. When the interface returns successfully, the current synchronization status is changed to "creating", and traversal of the list of virtual machines to be synchronized continues.
  • the current site is the production site
  • the destination site is the disaster recovery site.
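  • The per-virtual-machine status handling described above can be sketched as a single transition function. Only the branches the text spells out in full ("to be deleted", "deleting", and "last task successful" / "no change") are covered. `FakeDestination`, `step` and their parameters are hypothetical stand-ins for the destination site interface and the current site's VMC query; they are not actual iROS, DRM or VMC APIs.

```python
class FakeDestination:
    """Hypothetical in-memory stand-in for the destination site's interface."""

    def __init__(self, vms: dict):
        self.vms = dict(vms)

    def delete(self, name: str) -> bool:
        self.vms.pop(name, None)
        return True  # this fake always reports success

    def exists(self, name: str) -> bool:
        return name in self.vms

    def adjust(self, name: str, config: dict) -> bool:
        self.vms[name] = dict(config)
        return True  # this fake always reports success


def step(name: str, status: str, current_vms: dict, cache: dict, dest) -> str:
    """Return the next synchronization status for one virtual machine."""
    if status == "to be deleted":
        if dest.delete(name):
            cache.pop(name, None)  # drop the VM from the current site's cache
            return "deleting"
        return status
    if status == "deleting":
        if dest.exists(name):
            dest.delete(name)  # still present: request deletion again
            return "to be deleted"
        return "deleted"
    if status in ("last task successful", "no change"):
        live = current_vms[name]  # stands in for the current site's VMC query
        if live == cache.get(name):
            return "no change"
        if dest.adjust(name, live):
            return "resource adjustment"
        return status  # adjustment call failed: keep status, retry next pass
    return status  # other statuses are not covered by this sketch


dest = FakeDestination({"vm-old": {"cpu": 2}, "vm1": {"cpu": 2}})
cache = {"vm-old": {"cpu": 2}, "vm1": {"cpu": 2}}
current = {"vm1": {"cpu": 4}}
s1 = step("vm-old", "to be deleted", current, cache, dest)
s2 = step("vm-old", "deleting", current, cache, dest)
s3 = step("vm1", "no change", current, cache, dest)
```

  • Keeping each pass to one transition per virtual machine matches the text's pattern of updating the status and then continuing to traverse the list.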
  • The virtual machine to be reverse-synchronized must be determined first, and the second configuration information is then synchronized to the determined virtual machine to be reverse-synchronized.
  • The steps for determining the virtual machines to be reverse-synchronized are as follows: first, obtain the preset virtual machine list of the disaster recovery system and the reverse synchronization virtual machine list corresponding to the disaster recovery virtual machine; second, update the reverse synchronization virtual machine list according to the preset virtual machine list; third, add the virtual machines in the reverse synchronization virtual machine list to the list of virtual machines to be reverse-synchronized, which yields the virtual machines to be reverse-synchronized.
  • First, look up the protection group list of the disaster recovery system and determine whether it is empty. Since the protection group list has been configured in the above embodiment, the list found here is not empty. If the protection group list is not empty, initialize the list of virtual machines to be reverse-synchronized.
  • Next, traverse the protection groups. For each group, determine whether the status of the disaster recovery site is "protecting" (that is, whether the protection group is enabled); if so, further determine whether the status of the production site is "protecting". If the status of the production site is also "protecting", obtain the virtual machine list of the protection group (that is, the preset virtual machine list) and the reverse synchronization virtual machine list corresponding to the disaster recovery virtual machine, update the reverse synchronization virtual machine list according to the virtual machine list of the protection group, and add the records in the reverse synchronization virtual machine list to the list of virtual machines to be reverse-synchronized.
  • When the reverse synchronization virtual machine list is updated according to the virtual machine list of the protection group, the following three cases may arise: 1. A virtual machine that exists in the virtual machine list of the protection group (that is, the preset virtual machine list) but not in the reverse synchronization virtual machine list is added to the reverse synchronization virtual machine list; 2. A virtual machine that does not exist in the virtual machine list of the protection group but exists in the reverse synchronization virtual machine list is deleted from the reverse synchronization virtual machine list; 3. No operation is performed on a virtual machine that exists in both lists.
  • The update of the reverse synchronization virtual machine list determines the current reverse synchronization status. For example, if a virtual machine is added to the reverse synchronization virtual machine list, the current reverse synchronization status is "newly added"; if a virtual machine is deleted from the reverse synchronization virtual machine list, the current reverse synchronization status is "to be deleted" or "deleting"; if the update operation is completed, the current reverse synchronization status is "last task successful"; and so on.
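  • The protection group traversal for the reverse direction can be sketched as follows. For brevity the sketch collects virtual machines straight from each enabled protection group and leaves out the list-update step; the dictionary keys and the "stopped" status value are hypothetical and appear only to make the example runnable.

```python
def collect_reverse_sync_candidates(protection_groups: list) -> list:
    """Gather VMs from groups where both sites report the "protecting" status."""
    if not protection_groups:
        return []  # empty protection group list: nothing to reverse-synchronize
    to_reverse_sync = []  # initialise the list of VMs to be reverse-synchronized
    for group in protection_groups:
        # For reverse synchronization the DR site's status is checked first.
        if group["dr_site_status"] != "protecting":
            continue
        if group["production_site_status"] != "protecting":
            continue
        for vm in group["vms"]:
            if vm not in to_reverse_sync:
                to_reverse_sync.append(vm)
    return to_reverse_sync


groups = [
    {"name": "g1", "dr_site_status": "protecting",
     "production_site_status": "protecting", "vms": ["vm1", "vm2"]},
    {"name": "g2", "dr_site_status": "protecting",
     "production_site_status": "stopped", "vms": ["vm3"]},  # "stopped" is hypothetical
]
candidates = collect_reverse_sync_candidates(groups)
```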
  • the second configuration information of the disaster recovery virtual machine is reversely synchronized to the determined virtual machine to be reversely synchronized.
  • If the current reverse synchronization status is "deleting", the interface of the destination site is called to query the information of the virtual machine being deleted. If the virtual machine is found to still exist, the current reverse synchronization status is changed to "to be deleted", and traversal of the list of virtual machines to be reverse-synchronized continues. If the query shows that the virtual machine does not exist, the virtual machine has been successfully deleted; the current reverse synchronization status is changed to "deleted", and traversal continues. If the current reverse synchronization status is not "deleting", it is further determined whether the current reverse synchronization status is "creating".
  • If the current reverse synchronization status is "creating", the interface of the destination site is called to query the detailed information of the virtual machine and determine whether the virtual machine has been created at the destination site. If the virtual machine has been created, the VMC interface of the current site is called to query the detailed information of the virtual machine and determine whether the queried virtual machine information is consistent with the virtual machine information in the cache of the current site. If they are consistent, the current reverse synchronization status is changed to "last task successful", and traversal of the list of virtual machines to be reverse-synchronized continues; if they are inconsistent, the local site interface is called to adjust the virtual machine resources, the current reverse synchronization status is changed to "resource adjustment", and traversal continues.
  • the interface of the destination site and the VMC interface of the current site are called to query the detailed information of the virtual machine and determine whether the virtual machine information of the destination site is consistent with that of the current site. If they are consistent, the current reverse synchronization status is changed to "last task successful", and traversal of the list of virtual machines to be reverse-synchronized continues; if they are inconsistent, the local site interface is called to adjust the virtual machine resources, the current reverse synchronization status is changed to "resource adjustment", and traversal continues.
  • the interface of the destination site and the VMC interface of the current site are called to query the detailed information of the virtual machine and determine whether the virtual machine information of the destination site is consistent with that of the current site. If they are consistent, traversal of the list of virtual machines to be reverse-synchronized continues; if they are inconsistent, the local site interface is called to adjust the virtual machine resources, the current reverse synchronization status is changed to "resource adjustment", and traversal continues.
  • the current site is the disaster recovery site
  • the destination site is the production site.
  • The second configuration information includes the difference resource information between the disaster recovery virtual machine and the virtual machine to be reverse-synchronized. Therefore, when the second configuration information of the disaster recovery virtual machine on the disaster recovery site is reverse-synchronized to the virtual machine to be reverse-synchronized on the production site, the resource information of the two virtual machines can be compared, the difference resource information between them determined from the comparison result, and the resources of the virtual machine to be reverse-synchronized then adjusted according to that difference resource information.
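  • Computing and applying the difference resource information can be sketched with plain dictionaries. The field names (`cpu`, `memory_gb`, `nics`) and the two helper functions are hypothetical; the text does not specify how resource information is represented.

```python
_MISSING = object()  # sentinel so that a missing field differs from any value


def resource_diff(dr_vm: dict, target_vm: dict) -> dict:
    """Fields whose value on the DR VM differs from the VM to be reverse-synchronized."""
    return {
        key: value
        for key, value in dr_vm.items()
        if target_vm.get(key, _MISSING) != value
    }


def apply_diff(target_vm: dict, diff: dict) -> dict:
    """Return the target VM's resources after applying only the differing fields."""
    return {**target_vm, **diff}


dr_vm = {"cpu": 8, "memory_gb": 16, "nics": 2}
production_vm = {"cpu": 4, "memory_gb": 16, "nics": 2}
diff = resource_diff(dr_vm, production_vm)
adjusted = apply_diff(production_vm, diff)
```

  • Transferring only the differing fields keeps the reverse synchronization small when the two virtual machines have drifted in just a few resources.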
  • after the second configuration information of the disaster recovery virtual machine on the disaster recovery site is reverse-synchronized to the virtual machine to be reverse-synchronized on the production site, the second production service of the disaster recovery site is switched to the production site.
  • the embodiment of the present application further provides a disaster recovery device.
  • FIG. 2 is a schematic block diagram of a disaster recovery device according to an embodiment of the present invention.
  • the device is applied to a disaster recovery system.
  • the disaster recovery system includes a production data center and a disaster recovery data center.
  • a production site is created in the production data center.
  • a disaster recovery site is created in the disaster recovery data center.
  • the disaster recovery apparatus 200 includes a first synchronization module 210 for synchronizing, while the production data center is running normally, the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site.
  • the first switching module 220 is configured to switch the first production service of the production site to the disaster recovery site when the operation failure of the production data center is detected.
  • the apparatus 200 further includes: a second synchronization module, configured to, after the production service of the production site has been switched to the disaster recovery site and when it is detected that the production data center has resumed operation, reverse-synchronize the second configuration information of the disaster recovery virtual machine on the disaster recovery site to the virtual machine to be reverse-synchronized on the production site; and a second switching module, configured to switch the second production service of the disaster recovery site to the production site.
  • the first synchronization module 210 includes: a first determination unit for determining the virtual machine to be synchronized; and a first synchronization unit for synchronizing the first configuration information to the determined virtual machine to be synchronized. The first determination unit is used to: obtain the preset virtual machine list of the disaster recovery system; obtain the synchronized virtual machine list corresponding to the first virtual machine; update the synchronized virtual machine list according to the preset virtual machine list; and add the virtual machines in the synchronized virtual machine list to the list of virtual machines to be synchronized, thereby obtaining the virtual machine to be synchronized.
  • the first determination unit is further configured to: add to the synchronized virtual machine list any virtual machine that exists in the preset virtual machine list but not in the synchronized virtual machine list; and delete from the synchronized virtual machine list any virtual machine that does not exist in the preset virtual machine list but exists in the synchronized virtual machine list.
  • the second synchronization module includes: a second determination unit for determining the virtual machine to be synchronized in reverse; a second synchronization unit for synchronizing the second configuration information to the determined virtual machine to be synchronized in reverse
  • the second determination unit is used to: obtain the preset virtual machine list of the disaster recovery system; obtain the reverse-synchronization virtual machine list corresponding to the disaster recovery virtual machine; update the reverse-synchronization virtual machine list according to the preset virtual machine list; and add the virtual machines in the reverse-synchronization virtual machine list to the list of virtual machines to be reverse-synchronized, thereby obtaining the virtual machine to be reverse-synchronized.
  • the first configuration information includes resource configuration information of the first virtual machine;
  • the resource configuration information includes at least one of site information, site pair information, protection group information, information on the virtual machines in the protection group, CPU information, memory information, and network card information;
  • the first synchronization module 210 includes: a first trigger unit, configured to trigger the disaster recovery site, at a first preset frequency, to adjust the resources of the virtual machine to be synchronized according to the resource configuration information of the first virtual machine; a monitoring unit, configured to monitor whether the disaster recovery site's resource adjustment succeeded; and a second trigger unit, configured to, if the adjustment did not succeed, trigger the disaster recovery site again to adjust the resources of the virtual machine to be synchronized according to the resource configuration information of the first virtual machine.
  • the first configuration information includes disk configuration information of the first virtual machine;
  • the disk configuration information includes at least one of disk operation information, disk snapshot information, disk snapshot recovery information, virtual machine clone information, virtual machine backup information, and virtual machine backup recovery information;
  • the first synchronization module 210 includes: a third trigger unit, configured to trigger the disaster recovery site to perform resource adjustment on the virtual machine to be synchronized according to the disk configuration information of the first virtual machine.
  • the second synchronization module includes: a comparison unit for comparing the resource information of the disaster recovery virtual machine and the virtual machine to be reverse-synchronized; a third determination unit for determining, from the comparison result, the difference resource information between the disaster recovery virtual machine and the virtual machine to be reverse-synchronized; and an adjustment unit for adjusting the resources of the virtual machine to be reverse-synchronized according to the difference resource information.
  • with the device of the embodiment of the present invention, the first configuration information of the first virtual machine on the production site can be synchronized to the virtual machine to be synchronized on the disaster recovery site while the production data center is running normally, and the first production service on the production site can then be switched to the disaster recovery site when a failure of the production data center is detected.
  • the technical solution adopts data synchronization across multiple data centers, that is, the production data center is synchronized to the disaster recovery data center, which improves the security and reliability of the data and meets the security and reliability requirements for cloud platform data.
  • FIG. 3 is a schematic block diagram of a disaster recovery system according to an embodiment of the present invention.
  • the disaster recovery system 300 includes a production data center 310 and a disaster recovery data center 320.
  • the production data center 310 has a production site
  • the disaster recovery data center 320 has a disaster recovery site
  • the production data center 310 and the disaster recovery data center 320 each include: a disaster recovery module DRM, which is used to synchronize, while the production data center is running normally, the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site.
  • the resource operation system iROS is used to switch the first production business of the production site to the disaster recovery site when a production data center operation failure is detected.
  • the storage device is used to store the first configuration information of the first virtual machine.
  • the disaster recovery module DRM is also used to reverse-synchronize the second configuration information of the disaster recovery virtual machine on the disaster recovery site to the virtual machine to be reverse-synchronized on the production site when the production data center resumes operation.
  • the resource operation system iROS is also used to switch the second production business of the disaster recovery site to the production site.
  • the storage device is also used to store the second configuration information of the disaster recovery virtual machine.
  • the system of the embodiment of the present invention can synchronize the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site while the production data center is running normally, and then switch the first production service on the production site to the disaster recovery site when a failure of the production data center is detected.
  • the technical solution adopts data synchronization across multiple data centers, that is, the production data center is synchronized to the disaster recovery data center, which improves the security and reliability of the data and meets the security and reliability requirements for cloud platform data.
  • the embodiments of the present application also provide a disaster recovery device, as shown in FIG. 4.
  • Disaster recovery devices may have relatively large differences due to different configurations or performances, and may include one or more processors 401 and memory 402, and one or more storage applications or data may be stored in the memory 402.
  • the memory 402 may be short-term storage or persistent storage.
  • the application program stored in the memory 402 may include one or more modules (not shown in the figure), and each module may include a series of computer-executable instructions in the disaster recovery device.
  • the processor 401 may be configured to communicate with the memory 402 and execute a series of computer-executable instructions in the memory 402 on the disaster recovery device.
  • the disaster recovery device may also include one or more power supplies 403, one or more wired or wireless network interfaces 404, one or more input and output interfaces 405, and one or more keyboards 406.
  • the disaster recovery device includes a memory and one or more programs, where one or more programs are stored in the memory, and one or more programs may include one or more modules, and each The module may include a series of computer-executable instructions in the disaster recovery equipment, and is configured to be executed by one or more processors.
  • the one or more programs include computer-executable instructions for performing the following operations: while the production data center is running normally, synchronizing the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site; and, when an operation failure of the production data center is detected, switching the first production service of the production site to the disaster recovery site.
  • when executed, the computer-executable instructions may also cause the processor to: after the production service of the production site has been switched to the disaster recovery site, and when it is detected that the production data center has resumed operation, reverse-synchronize the second configuration information of the disaster recovery virtual machine on the disaster recovery site to the virtual machine to be reverse-synchronized on the production site; and switch the second production service of the disaster recovery site to the production site.
  • when executed, the computer-executable instructions may also cause the processor to: determine the virtual machine to be synchronized; and synchronize the first configuration information to the determined virtual machine to be synchronized.
  • when executed, the computer-executable instructions may also cause the processor to: obtain the preset virtual machine list of the disaster recovery system; obtain the synchronized virtual machine list corresponding to the first virtual machine; update the synchronized virtual machine list according to the preset virtual machine list; and add the virtual machines in the synchronized virtual machine list to the list of virtual machines to be synchronized, thereby obtaining the virtual machine to be synchronized.
  • when executed, the computer-executable instructions may also cause the processor to: add to the synchronized virtual machine list any virtual machine that exists in the preset virtual machine list but not in the synchronized virtual machine list; and delete from the synchronized virtual machine list any virtual machine that does not exist in the preset virtual machine list but exists in the synchronized virtual machine list.
  • when executed, the computer-executable instructions may also cause the processor to: determine the virtual machine to be reverse-synchronized; and synchronize the second configuration information to the determined virtual machine to be reverse-synchronized.
  • when executed, the computer-executable instructions may also cause the processor to: obtain the preset virtual machine list of the disaster recovery system; obtain the reverse-synchronization virtual machine list corresponding to the disaster recovery virtual machine; update the reverse-synchronization virtual machine list according to the preset virtual machine list; and add the virtual machines in the reverse-synchronization virtual machine list to the list of virtual machines to be reverse-synchronized, thereby obtaining the virtual machine to be reverse-synchronized.
  • the first configuration information includes resource configuration information of the first virtual machine;
  • the resource configuration information includes at least one of site information, site pair information, protection group information, information on the virtual machines in the protection group, CPU information, memory information, and network card information; when executed, the computer-executable instructions may also cause the processor to: trigger the disaster recovery site, at a first preset frequency, to adjust the resources of the virtual machine to be synchronized according to the resource configuration information of the first virtual machine; monitor whether the disaster recovery site's resource adjustment succeeded; and, if not, trigger the disaster recovery site again to adjust the resources of the virtual machine to be synchronized according to the resource configuration information of the first virtual machine.
  • the first configuration information includes disk configuration information of the first virtual machine;
  • the disk configuration information includes at least one of disk operation information, disk snapshot information, disk snapshot recovery information, virtual machine clone information, virtual machine backup information, and virtual machine backup recovery information; when executed, the computer-executable instructions may also cause the processor to: trigger the disaster recovery site to adjust the resources of the virtual machine to be synchronized according to the disk configuration information of the first virtual machine.
  • when executed, the computer-executable instructions may also cause the processor to: compare the resource information of the disaster recovery virtual machine and the virtual machine to be reverse-synchronized; determine, from the comparison result, the difference resource information between the disaster recovery virtual machine and the virtual machine to be reverse-synchronized; and adjust the resources of the virtual machine to be reverse-synchronized according to the difference resource information.
  • FIG. 5 shows a schematic structural diagram of a disaster recovery system according to an embodiment of the present invention.
  • the disaster recovery system includes a production data center and a disaster recovery data center.
  • the production data center and disaster recovery data center are in a master-slave relationship.
  • the production data center and the disaster recovery data center each include the resource operation system iROS, disaster recovery management DRM, resource pools (including common resource pools and disaster recovery resource pools), and storage libraries (including general storage pools and disaster recovery storage pools).
  • the resource operation system iROS implements disaster recovery management, that is, it can implement disaster recovery configuration, disaster recovery drills, and disaster recovery switching through the iROS operation management portal.
  • Disaster recovery management DRM includes TECS, unified elastic computing system iECS and virtualization management center VMC.
  • TECS is based on the open-source KVM (Kernel-based Virtual Machine) virtualization technology, and has been enhanced in performance and real-time behavior.
  • Virtualization products are used to provide virtualization management functions such as life cycle management of virtual machines, cluster management, dynamic resource scheduling, and dynamic energy consumption management.
  • the unified elastic computing system iECS and virtualization management center VMC are used to synchronize resources between production data centers and disaster recovery data centers.
  • Disaster recovery management DRM is the data center DC (DataCenter) at each site.
  • the storage library is used to store the resource data of each data center, and relies on the storage of its own data synchronization technology, which can realize the single data replication function from the LUN device in the production data center to the LUN device in the disaster recovery data center. Through the data replication function between the repositories, data synchronization between the production data center and the disaster recovery data center is achieved.
  • An embodiment of the present application also provides a computer-readable storage medium that stores one or more programs. The one or more programs include instructions that, when executed by an electronic device that includes multiple application programs, enable the electronic device to execute the disaster recovery method described above, specifically: while the production data center is running normally, synchronizing the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site; and, when an operation failure of the production data center is detected, switching the first production service of the production site to the disaster recovery site.
  • the system, device, module or unit explained in the above embodiments may be specifically implemented by a computer chip or entity, or implemented by a product having a certain function.
  • a typical implementation device is a computer.
  • the computer may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
  • An embodiment of the present application also proposes a computer program product.
  • the computer program product includes a computer program stored on a non-transitory computer-readable storage medium.
  • the computer program includes program instructions. When the program instructions are executed by a computer, the computer is caused to execute the method in any of the above method embodiments.
  • the embodiments of the present application may be provided as methods, systems, or computer program products. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present application may take the form of a computer program product implemented on one or more computer usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer usable program code.
  • These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device; the instruction device implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operating steps are performed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • the computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • the memory may include non-permanent memory, random access memory (RAM) and/or non-volatile memory in a computer-readable medium, such as read only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
  • Computer readable media including permanent and non-permanent, removable and non-removable media, can store information by any method or technology.
  • the information may be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission media that can be used to store information accessible to computing devices.
  • computer-readable media does not include temporary computer-readable media (transitory media), such as modulated data signals and carrier waves.
  • the application can be described in the general context of computer-executable instructions executed by a computer, such as program modules.
  • program modules include routines, programs, objects, components, data structures, etc. that perform specific tasks or implement specific abstract data types.
  • the present application may also be practiced in distributed computing environments in which remote processing devices connected through a communication network perform tasks.
  • program modules may be located in local and remote computer storage media including storage devices.

Abstract

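The embodiments of the present application disclose a disaster recovery method, device, and system for implementing a data backup mode across multiple data centers in a virtualized cloud platform, thereby improving the security and reliability of cloud platform data. The method includes: while the production data center is running normally, synchronizing the first configuration information of a first virtual machine on the production site to a virtual machine to be synchronized on the disaster recovery site; and, when an operation failure of the production data center is detected, switching the first production service of the production site to the disaster recovery site.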

Description

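Disaster recovery method, device, and system
Cross-reference
The present invention claims priority to Chinese patent application No. 201811641851.9, entitled "Disaster recovery method, device, and system", filed with the Chinese Patent Office on December 29, 2018, the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to the field of computer disaster recovery, and in particular to a disaster recovery method, device, and system.
Background
The traditional virtual machine backup approach used by virtualized cloud platforms performs full and incremental backups of virtual machines within a single data center. Because the backup takes place within one data center, its security is low, and it can no longer meet the security and reliability requirements for cloud platform data.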
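Summary of the invention
To solve the above technical problem, the embodiments of the present application are implemented as follows:
In one aspect, an embodiment of the present application provides a disaster recovery method applied to a disaster recovery system. The disaster recovery system includes a production data center and a disaster recovery data center; a production site is created in the production data center, and a disaster recovery site is created in the disaster recovery data center. The method includes: while the production data center is running normally, synchronizing the first configuration information of a first virtual machine on the production site to a virtual machine to be synchronized on the disaster recovery site; and, when an operation failure of the production data center is detected, switching the first production service of the production site to the disaster recovery site.
In another aspect, an embodiment of the present application provides a disaster recovery device applied to a disaster recovery system. The disaster recovery system includes a production data center and a disaster recovery data center; a production site is created in the production data center, and a disaster recovery site is created in the disaster recovery data center. The device includes: a first synchronization module, configured to synchronize, while the production data center is running normally, the first configuration information of a first virtual machine on the production site to a virtual machine to be synchronized on the disaster recovery site; and a first switching module, configured to switch the first production service of the production site to the disaster recovery site when an operation failure of the production data center is detected.
In yet another aspect, an embodiment of the present application provides a disaster recovery system including a production data center and a disaster recovery data center; a production site is created in the production data center, and a disaster recovery site is created in the disaster recovery data center. The production data center and the disaster recovery data center each include: a disaster recovery module DRM, configured to synchronize, while the production data center is running normally, the first configuration information of a first virtual machine on the production site to a virtual machine to be synchronized on the disaster recovery site; a resource operation system iROS, configured to switch the first production service of the production site to the disaster recovery site when an operation failure of the production data center is detected; and a storage device, configured to store the first configuration information of the first virtual machine.
In yet another aspect, an embodiment of the present application provides disaster recovery equipment applied to a disaster recovery system. The disaster recovery system includes a production data center and a disaster recovery data center; a production site is created in the production data center, and a disaster recovery site is created in the disaster recovery data center. The equipment includes: a processor; and a memory arranged to store computer-executable instructions that, when executed, cause the processor to: while the production data center is running normally, synchronize the first configuration information of a first virtual machine on the production site to a virtual machine to be synchronized on the disaster recovery site; and, when an operation failure of the production data center is detected, switch the first production service of the production site to the disaster recovery site.
In yet another aspect, an embodiment of the present application provides a storage medium for storing computer-executable instructions that, when executed, implement the following flow: while the production data center is running normally, synchronizing the first configuration information of a first virtual machine on the production site to a virtual machine to be synchronized on the disaster recovery site; and, when an operation failure of the production data center is detected, switching the first production service of the production site to the disaster recovery site.
In yet another aspect, an embodiment of the present application provides a computer program product. The computer program product includes a computer program stored on a non-transitory computer-readable storage medium; the computer program includes program instructions that, when executed by a computer, cause the computer to execute the method described in the above aspects.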
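Brief description of the drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some of the embodiments recorded in the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a disaster recovery method according to an embodiment of the present invention;
FIG. 2 is a schematic block diagram of a disaster recovery device according to an embodiment of the present invention;
FIG. 3 is a schematic block diagram of a disaster recovery system according to an embodiment of the present invention;
FIG. 4 is a schematic block diagram of disaster recovery equipment according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a disaster recovery system according to an embodiment of the present invention.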
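Detailed description
The embodiments of the present application provide a disaster recovery method, device, and system for implementing a data backup mode across multiple data centers in a virtualized cloud platform, thereby improving the security and reliability of cloud platform data.
To help those skilled in the art better understand the technical solutions of the present application, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort shall fall within the protection scope of the present application.
FIG. 1 is a schematic flowchart of a disaster recovery method according to an embodiment of the present invention. The method is applied to a disaster recovery system that includes a production data center and a disaster recovery data center; a production site is created in the production data center, and a disaster recovery site is created in the disaster recovery data center. As shown in FIG. 1, the disaster recovery method includes: S102, while the production data center is running normally, synchronizing the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site.
S104, when an operation failure of the production data center is detected, switching the first production service of the production site to the disaster recovery site.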
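FIG. 2 shows the disaster recovery system to which the disaster recovery method applies. With the technical solution of this embodiment, the first configuration information of the first virtual machine on the production site can be synchronized to the virtual machine to be synchronized on the disaster recovery site while the production data center is running normally, and the first production service on the production site can then be switched to the disaster recovery site when a failure of the production data center is detected. The solution thus performs data synchronization across multiple data centers during disaster recovery, that is, the production data center is synchronized to the disaster recovery data center, which improves the security and reliability of the data and meets the security and reliability requirements for cloud platform data. For industries with high data security and reliability requirements, such as banking, insurance, and related financial sectors, the improved security and reliability of cloud platform data goes a long way toward meeting those requirements.
In one embodiment, after the production service of the production site has been switched to the disaster recovery site, and when it is detected that the production data center has resumed operation, the second configuration information of the disaster recovery virtual machine on the disaster recovery site is reverse-synchronized to the virtual machine to be reverse-synchronized on the production site, and the second production service of the disaster recovery site is then switched to the production site.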
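In the above embodiment, the disaster recovery configuration of the disaster recovery system must be completed before S102 is executed. The configuration covers the following:
(1) First, an iROS (Resource Operating System) instance is deployed at each of the production site and the disaster recovery site. The two iROS instances are in an active/standby relationship: the production-site iROS is active and the disaster-recovery-site iROS is standby. Next, the databases of the production iROS (the iROS of the production site) and the disaster recovery iROS (the iROS of the disaster recovery site) are configured so that data in the production iROS database is replicated in real time to the disaster recovery iROS database.
(2) A LUN (Logical Unit Number) device of the same size is allocated on the storage equipment of each of the production site and the disaster recovery site, and the two LUN devices are configured in a synchronous or asynchronous replication relationship.
(3) On the iROS operation management portal, the LUN devices above are used to create a production repository for the production site and a disaster recovery repository for the disaster recovery site. The production repository is then added to the resource pool of the production site, and the disaster recovery repository to the resource pool of the disaster recovery site.
(4) The tenant has a port group network of type VLAN (Virtual Local Area Network) configured on both the associated production site and the associated disaster recovery site.
(5) iROS disaster recovery management.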
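In iROS disaster recovery management, site management, site pair management, and protection group management can each be configured, as follows:
a. In site management, the production site and the disaster recovery site are created, and the production iROS and the disaster recovery iROS each send the newly added site information to the corresponding DRM (Disaster Recovery Management).
For example, sites such as "126-DataCenter-双活" (active-active), "102-DataCenter-异地灾备" (remote disaster recovery), and "103-DataCenter-同城灾备" (same-city disaster recovery) are created. The site creation page can display information such as the authentication URL, authentication user name, and online status of each created site.
b. In site pair management, the site pair consisting of the production site and the disaster recovery site is created, and the tenant's network mapping between the production site and the disaster recovery site is configured.
For example, the site pair "126-DataCenter-双活" and "102-DataCenter-异地灾备" is created and the tenant's network mapping on that pair is configured; likewise, the site pair "126-DataCenter-双活" and "103-DataCenter-同城灾备" is created and the tenant's network mapping on that pair is configured. The site pair creation page can display the created site pairs and their network mappings.
c. Protection group management. A protection group is the smallest unit of operation when switching between the production site and the disaster recovery site: the virtual machines in a protection group are switched over or switched back together. Creating a protection group requires selecting the site pair, the resource pools of the production site and the disaster recovery site, and the repository used for disaster recovery, and moving the virtual machines that need disaster recovery from the unprotected virtual machine list to the disaster recovery virtual machine list.
In addition, once the protection group has been created, the production iROS and the disaster recovery iROS each send the protection group creation information to the corresponding DRM.
For example, on the protection group management page, a protection group named "g2-126-103" is created with the site pair "126-DataCenter-双活" and "103-DataCenter-同城灾备". The production site "126-DataCenter-双活" uses the resource pool "pool2-华翔机房" and the repository "FC-MASTER1"; the disaster recovery site "103-DataCenter-同城灾备" uses the resource pool "pool2-同城灾备" and the repository "ibm-fc-s1".
d. The protection group is enabled in protection group management. Once disaster recovery is enabled for a virtual machine on the production site, the system automatically creates the corresponding disaster recovery virtual machine on the disaster recovery site.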
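The configuration objects in items a to d can be pictured as a small data model. The following is only an illustrative sketch: the class and field names are assumptions and not iROS or DRM structures, and the port group names in the network mapping are made up. The site, pool, and repository names are the example values given above.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    name: str
    resource_pool: str
    repository: str


@dataclass
class SitePair:
    production: Site
    disaster_recovery: Site
    # Tenant port-group mapping (production -> DR); example names are made up.
    network_mapping: dict = field(default_factory=dict)


@dataclass
class ProtectionGroup:
    name: str
    pair: SitePair
    protected_vms: list = field(default_factory=list)
    enabled: bool = False

    def protect(self, vm, unprotected):
        """Move a VM from the unprotected list into this group's DR list."""
        unprotected.remove(vm)
        self.protected_vms.append(vm)

    def enable(self):
        """Enable the group; return the VMs that need a DR counterpart."""
        self.enabled = True
        return list(self.protected_vms)


pair = SitePair(
    Site("126-DataCenter-双活", "pool2-华翔机房", "FC-MASTER1"),
    Site("103-DataCenter-同城灾备", "pool2-同城灾备", "ibm-fc-s1"),
    {"vlan-100": "vlan-200"},  # hypothetical port groups
)
group = ProtectionGroup("g2-126-103", pair)
```

Modelling the protection group as the owner of the virtual machine list reflects its role as the smallest unit that is switched over or switched back.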
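The above describes the disaster recovery configuration of the disaster recovery system in detail. Once the configuration is complete, the first configuration information of the first virtual machine on the production site is synchronized to the virtual machine to be synchronized on the disaster recovery site. The first configuration information includes the resource configuration information and the disk configuration information of the first virtual machine. The resource configuration information includes site information, site pair information, protection group information, information on the virtual machines in the protection group, CPU information, memory information, network card information, and so on; the disk configuration information includes disk operation information, disk snapshot information, disk snapshot recovery information, virtual machine clone information, virtual machine backup information, virtual machine backup recovery information, and so on.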
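In one embodiment, the DRM periodically (for example, at a preset frequency) compares the virtual machine configuration information of the production site and the disaster recovery site. When the configuration information on the disaster recovery site is found to be inconsistent with that on the production site, the DRM triggers the disaster recovery site to adjust the resources of the virtual machine to be synchronized according to the resource configuration information of the first virtual machine on the production site. The DRM then monitors whether the disaster recovery site's resource adjustment succeeded; if not, it triggers the disaster recovery site again to adjust the resources of the virtual machine to be synchronized according to the resource configuration information of the first virtual machine.
In one embodiment, the DRM triggers the disaster recovery site in real time to adjust the resources of the virtual machine to be synchronized according to the disk configuration information of the first virtual machine on the production site.
In this embodiment, periodically comparing the virtual machine configuration information on the production site and the disaster recovery site, and promptly adjusting any configuration on the disaster recovery site that differs from the production site, keeps the two sites' virtual machine configuration information consistent and ensures data consistency at the time of a site switch.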
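The compare, adjust, and re-trigger cycle described above can be sketched as follows. This is a minimal illustration under stated assumptions: configurations are modelled as plain dictionaries, `adjust` is a hypothetical callback standing in for the disaster recovery site's adjustment interface, and the bound on retries (`max_attempts`) is an addition of this sketch, since the text does not say how many times the adjustment is re-triggered. The caller would invoke `reconcile` at the preset frequency.

```python
def reconcile(production_cfg, dr_cfg, adjust, max_attempts=3):
    """Compare configs; on mismatch, trigger adjustment until it succeeds.

    Returns the number of adjustment attempts made (0 if already consistent).
    """
    if dr_cfg == production_cfg:
        return 0
    for attempt in range(1, max_attempts + 1):
        # `adjust` reports success or failure, which is what gets monitored.
        if adjust(dr_cfg, production_cfg):
            return attempt
    raise RuntimeError("resource adjustment did not succeed")


def make_flaky_adjust(fail_times):
    """Build a hypothetical adjust callback that fails `fail_times` times."""
    remaining = {"failures": fail_times}

    def adjust(dr_cfg, production_cfg):
        if remaining["failures"] > 0:
            remaining["failures"] -= 1
            return False
        dr_cfg.clear()
        dr_cfg.update(production_cfg)
        return True

    return adjust
```

For instance, with a callback that fails once, `reconcile` makes two attempts and leaves the disaster recovery configuration equal to the production configuration.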
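In one embodiment, when the first configuration information of the first virtual machine on the production site is synchronized to the virtual machine to be synchronized on the disaster recovery site, the virtual machine to be synchronized is determined first, and the first configuration information is then synchronized to it.
The virtual machine to be synchronized is determined as follows: first, the preset virtual machine list of the disaster recovery system and the synchronized virtual machine list corresponding to the first virtual machine are obtained; second, the synchronized virtual machine list is updated according to the preset virtual machine list; third, the virtual machines in the synchronized virtual machine list are added to the list of virtual machines to be synchronized, yielding the virtual machine to be synchronized.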
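The following explains in detail, based on the disaster recovery configuration described above, how the virtual machine to be synchronized is obtained.
First, the protection group list of the disaster recovery system is looked up and checked for emptiness. Because the protection group list was configured in the above embodiment, the list found here is not empty. If the protection group list is not empty, the list of virtual machines to be synchronized is initialized.
Then the protection groups are traversed. For each group, it is determined whether the status of the production site is "under protection" (that is, whether the protection group is enabled); if so, it is further determined whether the status of the disaster recovery site is "under protection". If the disaster recovery site is also "under protection", the virtual machine list of the protection group (the preset virtual machine list) and the synchronized virtual machine list corresponding to the first virtual machine on the production site are obtained; the synchronized virtual machine list is then updated according to the protection group's virtual machine list, and the records in the synchronized virtual machine list are added to the list of virtual machines to be synchronized.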
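Specifically, updating the synchronized virtual machine list according to the protection group's virtual machine list covers three cases: (1) a virtual machine that exists in the protection group's virtual machine list (the preset virtual machine list) but not in the synchronized virtual machine list is added to the synchronized virtual machine list; (2) a virtual machine that does not exist in the protection group's virtual machine list but exists in the synchronized virtual machine list is deleted from the synchronized virtual machine list; (3) a virtual machine that exists in both lists is left untouched.
The update of the synchronized virtual machine list determines the current synchronization status. For example, if a virtual machine is added to the synchronized virtual machine list, the current synchronization status is "newly added"; if a virtual machine is deleted from the list, the status is "to be deleted" or "deleting"; if the update has completed, the status is "last task succeeded"; and so on.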
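The three update cases and the statuses they produce can be sketched as follows. This is an illustration only: the synchronized virtual machine list is modelled as a dictionary from virtual machine name to status, and, as an assumption of this sketch, a virtual machine that has left the protection group is marked "to be deleted" rather than dropped immediately, so that a later traversal can still remove its copy on the disaster recovery site.

```python
NEW = "newly added"
TO_DELETE = "to be deleted"
LAST_OK = "last task succeeded"


def update_sync_list(preset_vms, sync_list):
    """Update `sync_list` (VM name -> status) in place from the preset list."""
    preset = set(preset_vms)
    # Case 1: in the protection group but not yet tracked -> add.
    for vm in preset_vms:
        if vm not in sync_list:
            sync_list[vm] = NEW
    # Case 2: tracked but no longer in the protection group -> schedule removal.
    for vm in list(sync_list):
        if vm not in preset:
            sync_list[vm] = TO_DELETE
    # Case 3: present in both lists -> left untouched.
    return sync_list


def pending_vms(sync_list):
    """All tracked records go into the list of VMs to be synchronized."""
    return sorted(sync_list)
```

With a protection group containing `vm-a` and `vm-b` and a synchronized list tracking `vm-a` and `vm-c`, the update marks `vm-b` as "newly added" and `vm-c` as "to be deleted", and leaves `vm-a` unchanged.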
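After the virtual machine to be synchronized has been determined, the first configuration information of the first virtual machine on the production site is synchronized to it. The following explains this synchronization in detail.
First, the list of virtual machines to be synchronized is traversed.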
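During traversal, if the current synchronization status is "newly added", the VMC (Virtual Management Center) interface of the current site is called to query the details of the virtual machine to be added, the destination-site interface is then called to create the virtual machine, the current synchronization status is updated to "creating", and traversal of the list of virtual machines to be synchronized continues. If the status is not "newly added", it is next checked whether the status is "to be deleted".
During traversal, if the current synchronization status is "to be deleted", the destination-site interface is called to delete the virtual machine. If the interface returns success, the virtual machine is removed from the cache, the status is changed to "deleting", and traversal continues. If the status is not "to be deleted", it is next checked whether the status is "deleting".
During traversal, if the current synchronization status is "deleting", the destination-site interface is called to query the deleted virtual machine. If the virtual machine is found to still exist, the status is changed back to "to be deleted" and the destination-site interface is called to delete it. If the query shows that the virtual machine does not exist, it has been deleted successfully; the status is changed to "deleted" and traversal continues. If the status is not "deleting", it is next checked whether the status is "last task succeeded" or "no change".
During traversal, if the current synchronization status is "last task succeeded" or "no change", the VMC interface of the current site is called to query the virtual machine details, and the queried information is compared with the virtual machine information in the current site's cache. If they differ, the destination-site interface is called to adjust the virtual machine resources; when the interface returns success, the status is changed to "adjusting resources" and traversal continues, and when it does not return success, traversal simply continues. If the queried information is the same as the cached information, the status is changed to "no change" and traversal continues.
During traversal, if the current synchronization status is none of "newly added", "to be deleted", "deleting", "last task succeeded", or "no change", the destination-site interface is called to query the virtual machine details, and the queried information is compared with the virtual machine information in the current site's cache. When they differ, it is checked whether the status is "creating". If the status is "creating", the VMC interface of the current site is called to query the virtual machine details and the destination-site interface is called again to create the virtual machine; when the interface returns success, the status is set to "creating" and traversal continues. If the status is not "creating", it is checked whether the status is "adjusting resources". If so, the VMC interface of the current site is called to query the virtual machine details, and the destination site's virtual machine information is compared with the current site's: if they are consistent, the status is changed to "no change"; if not, the destination-site interface is called to adjust the resources, the status is set to "adjusting resources", and traversal continues. If the status is not "adjusting resources", traversal continues.
In the resource synchronization process above, the current site is the production site and the destination site is the disaster recovery site.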
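The traversal above behaves like a per-virtual-machine state machine. The sketch below is a simplified illustration under stated assumptions, not the DRM implementation: `FakeSite` is a hypothetical in-memory stand-in for the site interfaces, the cache is refreshed on create and adjust (the text only mentions removing entries from it), and a "creating" or "adjusting resources" task whose destination state already matches the cache is treated as finished, a branch the text does not spell out.

```python
NEW, TO_DELETE, DELETING, DELETED = "newly added", "to be deleted", "deleting", "deleted"
CREATING, ADJUSTING = "creating", "adjusting resources"
LAST_OK, UNCHANGED = "last task succeeded", "no change"


class FakeSite:
    """In-memory stand-in for a site's VM interface (hypothetical)."""

    def __init__(self, vms=None):
        self.vms = {name: dict(info) for name, info in (vms or {}).items()}

    def query(self, name):
        info = self.vms.get(name)
        return dict(info) if info is not None else None

    def create(self, name, info):
        self.vms[name] = dict(info)
        return True

    adjust = create  # adjusting overwrites the stored configuration

    def delete(self, name):
        self.vms.pop(name, None)
        return True


def sync_step(vm, status, current, destination, cache):
    """Return the next status of one VM (current = production, destination = DR)."""
    if status == NEW:
        info = current.query(vm)
        destination.create(vm, info)
        cache[vm] = info
        return CREATING
    if status == TO_DELETE:
        if destination.delete(vm):
            cache.pop(vm, None)
            return DELETING
        return TO_DELETE
    if status == DELETING:
        if destination.query(vm) is not None:
            destination.delete(vm)  # still present: delete again
            return TO_DELETE
        return DELETED
    if status in (LAST_OK, UNCHANGED):
        info = current.query(vm)
        if info == cache.get(vm):
            return UNCHANGED
        if destination.adjust(vm, info):
            cache[vm] = info
            return ADJUSTING
        return status
    # Remaining statuses: "creating" or "adjusting resources".
    dest_info = destination.query(vm)
    if dest_info == cache.get(vm):
        return LAST_OK  # assumption of this sketch
    src_info = current.query(vm)
    if status == CREATING:
        destination.create(vm, src_info)
        cache[vm] = src_info
        return CREATING
    if status == ADJUSTING:
        if dest_info == src_info:
            return UNCHANGED
        destination.adjust(vm, src_info)
        cache[vm] = src_info
        return ADJUSTING
    return status
```

Calling `sync_step` once per list entry on each traversal lets a virtual machine move, for example, from "newly added" through "creating" to "last task succeeded" over successive passes.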
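While the first configuration information of the first virtual machine on the production site is being synchronized to the virtual machine to be synchronized on the disaster recovery site, the running state of the production data center is monitored. If the production data center fails, the first production service of the production site is switched to the disaster recovery site.
When switching sites, it is first confirmed that all protected virtual machines on the production site have been shut down. The data replication relationship between the LUN devices of the production site and the disaster recovery site is then stopped, that is, synchronization of the first configuration information from the production site to the disaster recovery site is stopped; at this point the disaster recovery site is allowed read/write access to the secondary volume of the LUN replication relationship. Next, disaster recovery management is opened through the iROS operation management portal; in protection group management the protection group is first disabled, and the production site and the disaster recovery site are then switched. During the switch, iROS sends a message to the DRM of the disaster recovery site and, through its interface with the DRM, notifies the DRM to perform the disaster recovery switch; the DRM then starts the disaster recovery virtual machines, completing the site switch.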
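The ordering of the switchover steps can be made explicit with a short sketch. The dictionary keys below are hypothetical names chosen for illustration; the real steps are carried out through the iROS portal, the storage equipment, and the DRM rather than by a single function.

```python
def fail_over(group):
    """Apply the switchover steps to a protection-group state dict, in order."""
    # Precondition: every protected VM on the production site is shut down.
    if any(group["production_vms_running"].values()):
        raise RuntimeError("shut down all protected production VMs first")
    steps = []
    group["lun_replication_active"] = False
    steps.append("stop LUN replication")
    group["secondary_volume_writable"] = True
    steps.append("allow read/write on secondary volume")
    group["enabled"] = False
    steps.append("disable protection group")
    group["active_site"] = "disaster_recovery"
    steps.append("switch sites")
    # The DRM starts the DR virtual machines last.
    group["dr_vms_running"] = {vm: True for vm in group["dr_vms_running"]}
    steps.append("start DR virtual machines")
    return steps
```

Refusing to proceed while a protected production virtual machine is still running mirrors the first check in the text and avoids both sites writing to replicated storage at once.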
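In one embodiment, when the second configuration information of the disaster recovery virtual machine on the disaster recovery site is reverse-synchronized to the virtual machine to be reverse-synchronized on the production site, the virtual machine to be reverse-synchronized is determined first, and the second configuration information is then synchronized to it.
The virtual machine to be reverse-synchronized is determined as follows: first, the preset virtual machine list of the disaster recovery system and the reverse-synchronization virtual machine list corresponding to the disaster recovery virtual machine are obtained; second, the reverse-synchronization virtual machine list is updated according to the preset virtual machine list; third, the virtual machines in the reverse-synchronization virtual machine list are added to the list of virtual machines to be reverse-synchronized, yielding the virtual machine to be reverse-synchronized.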
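The following explains in detail, based on the disaster recovery configuration described above, how the virtual machine to be reverse-synchronized is obtained.
First, the protection group list of the disaster recovery system is looked up and checked for emptiness. Because the protection group list was configured in the above embodiment, the list found here is not empty. If the protection group list is not empty, the list of virtual machines to be reverse-synchronized is initialized.
Then the protection groups are traversed. For each group, it is determined whether the status of the disaster recovery site is "under protection" (that is, whether the protection group is enabled); if so, it is further determined whether the status of the production site is "under protection". If the production site is also "under protection", the virtual machine list of the protection group (the preset virtual machine list) and the reverse-synchronization virtual machine list corresponding to the disaster recovery virtual machine are obtained; the reverse-synchronization virtual machine list is then updated according to the protection group's virtual machine list, and the records in the reverse-synchronization virtual machine list are added to the list of virtual machines to be reverse-synchronized.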
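Specifically, updating the reverse-synchronization virtual machine list according to the protection group's virtual machine list covers three cases: (1) a virtual machine that exists in the protection group's virtual machine list (the preset virtual machine list) but not in the reverse-synchronization virtual machine list is added to the reverse-synchronization virtual machine list; (2) a virtual machine that does not exist in the protection group's virtual machine list but exists in the reverse-synchronization virtual machine list is deleted from the reverse-synchronization virtual machine list; (3) a virtual machine that exists in both lists is left untouched.
The update of the reverse-synchronization virtual machine list determines the current reverse-synchronization status. For example, if a virtual machine is added to the list, the current reverse-synchronization status is "newly added"; if a virtual machine is deleted from the list, the status is "to be deleted" or "deleting"; if the update has completed, the status is "last task succeeded"; and so on.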
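After the virtual machines to be reverse-synchronized are determined, the second configuration information of the disaster recovery virtual machine is reverse-synchronized to them. The procedure is described in detail below.
First, the list of virtual machines to be reverse-synchronized is traversed.
During the traversal, if the current reverse synchronization state is "newly added" or "to be deleted", the traversal continues. Otherwise, it is further determined whether the current reverse synchronization state is "deleting".
During the traversal, if the current reverse synchronization state is "deleting", the interface of the destination site is called to query the information of the virtual machine being deleted. If the query shows that the virtual machine still exists, the current reverse synchronization state is changed to "to be deleted" and the traversal continues. If the query shows that the virtual machine does not exist, the virtual machine has been deleted successfully; the current reverse synchronization state is changed to "deleted" and the traversal continues. If the current reverse synchronization state is not "deleting", it is further determined whether it is "creating".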
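During the traversal, if the current reverse synchronization state is "creating", the interface of the destination site is called to query the detailed virtual machine information and to determine whether the virtual machine has been created at the destination site. If it has been created, the VMC interface of the current site is called to query the detailed virtual machine information, and the queried information is compared with the virtual machine information in the cache of the current site; if they are consistent, the current reverse synchronization state is changed to "last task succeeded" and the traversal continues; if they are inconsistent, the local site interface is called to adjust the virtual machine resources, the current reverse synchronization state is changed to "resource adjusting", and the traversal continues. If the virtual machine has not been created at the destination site, the current reverse synchronization state is changed to "newly added" and the traversal continues. If the current reverse synchronization state is not "creating", it is further determined whether it is "resource adjusting".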
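During the traversal, if the current reverse synchronization state is "resource adjusting", the interface of the destination site and the VMC interface of the current site are called to query the detailed virtual machine information, and the virtual machine information of the destination site is compared with that of the current site. If they are consistent, the current reverse synchronization state is changed to "last task succeeded" and the traversal continues; if they are inconsistent, the local site interface is called to adjust the virtual machine resources, the current reverse synchronization state is set to "resource adjusting", and the traversal continues.
If the current reverse synchronization state is not "resource adjusting", the interface of the destination site and the VMC interface of the current site are called to query the detailed virtual machine information, and the virtual machine information of the destination site is compared with that of the current site. If they are consistent, the traversal continues; if they are inconsistent, the local site interface is called to adjust the virtual machine resources, the current reverse synchronization state is changed to "resource adjusting", and the traversal continues.
In the resource synchronization process described above, the current site is the disaster recovery site and the destination site is the production site.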
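The reverse synchronization traversal above is effectively a per-entry state transition. The sketch below captures only the transitions; the interface calls themselves are replaced by already-fetched values, and the parameter names (`local`, `cached`, `remote`) are assumptions made for illustration. The actual resource adjustment call, which accompanies every transition to "resource adjusting", is omitted.

```python
from typing import Optional

NEW, TO_DELETE = "newly added", "to be deleted"
DELETING, DELETED = "deleting", "deleted"
CREATING, ADJUSTING = "creating", "resource adjusting"
LAST_OK = "last task succeeded"


def next_reverse_state(
    state: str, local: dict, cached: dict, remote: Optional[dict]
) -> str:
    """Compute the next reverse synchronization state of one entry.

    local:  VM details from the VMC of the current (disaster recovery) site
    cached: VM details held in the current site's cache
    remote: VM details at the destination (production) site, None if absent
    """
    if state in (NEW, TO_DELETE):
        return state                      # skipped during this traversal
    if state == DELETING:
        return TO_DELETE if remote is not None else DELETED
    if state == CREATING:
        if remote is None:
            return NEW                    # creation has not completed
        return LAST_OK if local == cached else ADJUSTING
    if state == ADJUSTING:
        return LAST_OK if remote == local else ADJUSTING
    # Any other state: adjust only when the two sites disagree.
    return state if remote == local else ADJUSTING
```

Keeping the transition logic free of side effects makes each branch of the traversal easy to check in isolation.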
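In one embodiment, the second configuration information includes the differential resource information between the disaster recovery virtual machine and the virtual machine to be reverse-synchronized. Accordingly, when the second configuration information of the disaster recovery virtual machine on the disaster recovery site is reverse-synchronized to the virtual machine to be reverse-synchronized on the production site, the resource information of the two virtual machines can be compared, the differential resource information between them determined from the comparison result, and the resources of the virtual machine to be reverse-synchronized adjusted according to the differential resource information.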
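A minimal sketch of the compare-then-adjust step follows, assuming for illustration that the resource information of a virtual machine is a flat dictionary with hypothetical keys such as `cpu` and `memory_mb`; the application does not prescribe a data format.

```python
_MISSING = object()  # distinguishes "key absent" from a stored None


def resource_diff(dr_vm: dict, target_vm: dict) -> dict:
    """Return the fields whose value on the disaster recovery VM differs
    from the VM to be reverse-synchronized."""
    return {k: v for k, v in dr_vm.items() if target_vm.get(k, _MISSING) != v}


def apply_diff(target_vm: dict, diff: dict) -> dict:
    """Return the target VM's resource information after adjustment."""
    return {**target_vm, **diff}
```

Transferring only the differing fields keeps the adjustment on the production site proportional to what actually changed while services ran on the disaster recovery site.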
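After the second configuration information of the disaster recovery virtual machine on the disaster recovery site has been reverse-synchronized to the virtual machine to be reverse-synchronized on the production site, the second production service of the disaster recovery site is switched to the production site.
To switch back from the disaster recovery site to the production site, disaster recovery management is opened through the iROS operation management portal and the protection group is failed back in protection group management. When the failback starts, the state of the protection group is shown as "failing back"; once resource synchronization has completed, the state changes to "failed back", and the protection group is then re-enabled.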
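In summary, specific embodiments of the subject matter have been described. Other embodiments are within the scope of the appended claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing may be advantageous.
The above is the disaster recovery method provided by the embodiments of this application. Based on the same idea, the embodiments of this application further provide a disaster recovery apparatus.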
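FIG. 2 is a schematic block diagram of a disaster recovery apparatus according to an embodiment of the present invention. The apparatus is applied to a disaster recovery system that includes a production data center and a disaster recovery data center; a production site is created in the production data center and a disaster recovery site is created in the disaster recovery data center. As shown in FIG. 2, the disaster recovery apparatus 200 includes: a first synchronization module 210, configured to synchronize, while the production data center is running normally, the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site.
A first switching module 220, configured to switch the first production service of the production site to the disaster recovery site when a failure of the production data center is detected.
In one embodiment, the apparatus 200 further includes: a second synchronization module, configured to reverse-synchronize, after the production service of the production site has been switched to the disaster recovery site and when it is detected that the production data center has resumed operation, the second configuration information of the disaster recovery virtual machine on the disaster recovery site to the virtual machine to be reverse-synchronized on the production site; and a second switching module, configured to switch the second production service of the disaster recovery site to the production site.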
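In one embodiment, the first synchronization module 210 includes: a first determining unit, configured to determine the virtual machine to be synchronized; and a first synchronization unit, configured to synchronize the first configuration information to the determined virtual machine to be synchronized. The first determining unit is configured to: obtain the preset virtual machine list of the disaster recovery system and the synchronization virtual machine list corresponding to the first virtual machine; update the synchronization virtual machine list according to the preset virtual machine list; and add the virtual machines in the synchronization virtual machine list to the list of virtual machines to be synchronized, which yields the virtual machines to be synchronized.
In one embodiment, the first determining unit is further configured to: add to the synchronization virtual machine list any virtual machine that is present in the preset virtual machine list but absent from the synchronization virtual machine list; and delete from the synchronization virtual machine list any virtual machine that is absent from the preset virtual machine list but present in the synchronization virtual machine list.
In one embodiment, the second synchronization module includes: a second determining unit, configured to determine the virtual machine to be reverse-synchronized; and a second synchronization unit, configured to synchronize the second configuration information to the determined virtual machine to be reverse-synchronized. The second determining unit is configured to: obtain the preset virtual machine list of the disaster recovery system and the reverse synchronization virtual machine list corresponding to the disaster recovery virtual machine; update the reverse synchronization virtual machine list according to the preset virtual machine list; and add the virtual machines in the reverse synchronization virtual machine list to the list of virtual machines to be reverse-synchronized, which yields the virtual machines to be reverse-synchronized.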
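In one embodiment, the first configuration information includes the resource configuration information of the first virtual machine; the resource configuration information includes at least one of site information, site pair information, protection group information, information on the virtual machines in a protection group, CPU information, memory information and network card information. The first synchronization module 210 includes: a first triggering unit, configured to trigger the disaster recovery site, at a first preset frequency, to adjust the resources of the virtual machine to be synchronized according to the resource configuration information of the first virtual machine; a monitoring unit, configured to monitor whether the resource adjustment performed by the disaster recovery site for the resource configuration information succeeded; and a second triggering unit, configured to trigger the disaster recovery site again, if the resource adjustment did not succeed, to adjust the resources of the virtual machine to be synchronized according to the resource configuration information of the first virtual machine.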
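How the three units cooperate can be sketched as a periodic trigger with a single re-trigger on failure. The function below is a hypothetical illustration: `trigger_adjust` stands in for the call that asks the disaster recovery site to adjust resources and reports success, and the injectable `sleep` parameter exists only so that the loop can be exercised without waiting.

```python
import time
from typing import Callable


def sync_periodically(
    trigger_adjust: Callable[[], bool],
    interval_s: float,
    rounds: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Trigger resource adjustment once per interval; on failure, trigger
    it once more in the same round. Returns the number of trigger calls."""
    calls = 0
    for i in range(rounds):
        calls += 1
        if not trigger_adjust():      # monitoring: did the adjustment succeed?
            calls += 1
            trigger_adjust()          # second trigger after a failed adjustment
        if i < rounds - 1:
            sleep(interval_s)         # wait for the next period
    return calls
```

A production loop would normally run until stopped rather than for a fixed number of rounds; the bounded `rounds` argument is a simplification of this sketch.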
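In one embodiment, the first configuration information includes the disk configuration information of the first virtual machine; the disk configuration information includes at least one of disk operation information, disk snapshot information, disk snapshot restoration information, virtual machine clone information, virtual machine backup information and virtual machine backup restoration information. The first synchronization module 210 includes: a third triggering unit, configured to trigger the disaster recovery site to adjust the resources of the virtual machine to be synchronized according to the disk configuration information of the first virtual machine.
In one embodiment, the second synchronization module includes: a comparison unit, configured to compare the resource information of the disaster recovery virtual machine and the virtual machine to be reverse-synchronized; a third determining unit, configured to determine, from the comparison result, the differential resource information between the disaster recovery virtual machine and the virtual machine to be reverse-synchronized; and an adjustment unit, configured to adjust the resources of the virtual machine to be reverse-synchronized according to the differential resource information.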
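With the apparatus of the embodiments of the present invention, the first configuration information of the first virtual machine on the production site can be synchronized to the virtual machine to be synchronized on the disaster recovery site while the production data center is running normally, and the first production service on the production site can be switched to the disaster recovery site when a failure of the production data center is detected. In the disaster recovery process, this technical solution therefore uses data synchronization across multiple data centers, that is, the production data center is synchronized to the disaster recovery data center, which improves the security and reliability of the data and meets the security and reliability requirements for cloud platform data.
Those skilled in the art will understand that the above disaster recovery apparatus can be used to implement the disaster recovery method described earlier; the details are similar to those in the method description and, to avoid repetition, are not described again here.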
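FIG. 3 is a schematic block diagram of a disaster recovery system according to an embodiment of the present invention. As shown in FIG. 3, the disaster recovery system 300 includes a production data center 310 and a disaster recovery data center 320.
A production site is created in the production data center 310 and a disaster recovery site is created in the disaster recovery data center 320. The production data center 310 and the disaster recovery data center 320 each include: a disaster recovery module DRM, configured to synchronize, while the production data center is running normally, the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site.
A resource operation system iROS, configured to switch the first production service of the production site to the disaster recovery site when a failure of the production data center is detected.
A storage device, configured to store the first configuration information of the first virtual machine.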
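In one embodiment, the disaster recovery module DRM is further configured to reverse-synchronize, when it is detected that the production data center has resumed operation, the second configuration information of the disaster recovery virtual machine on the disaster recovery site to the virtual machine to be reverse-synchronized on the production site.
The resource operation system iROS is further configured to switch the second production service of the disaster recovery site to the production site.
The storage device is further configured to store the second configuration information of the disaster recovery virtual machine.
With the system of the embodiments of the present invention, the first configuration information of the first virtual machine on the production site can be synchronized to the virtual machine to be synchronized on the disaster recovery site while the production data center is running normally, and the first production service on the production site can be switched to the disaster recovery site when a failure of the production data center is detected. In the disaster recovery process, this technical solution therefore uses data synchronization across multiple data centers, that is, the production data center is synchronized to the disaster recovery data center, which improves the security and reliability of the data and meets the security and reliability requirements for cloud platform data.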
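Based on the same idea, the embodiments of this application further provide a disaster recovery device, as shown in FIG. 4. Disaster recovery devices may differ considerably depending on configuration or performance. A device may include one or more processors 401 and a memory 402, and the memory 402 may store one or more application programs or data. The memory 402 may be transient or persistent storage. An application program stored in the memory 402 may include one or more modules (not shown in the figure), and each module may include a series of computer-executable instructions for the disaster recovery device. In one implementation, the processor 401 may be configured to communicate with the memory 402 and execute, on the disaster recovery device, the series of computer-executable instructions in the memory 402. The disaster recovery device may further include one or more power supplies 403, one or more wired or wireless network interfaces 404, one or more input/output interfaces 405, and one or more keyboards 406.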
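Specifically, in this embodiment, the disaster recovery device includes a memory and one or more programs. The one or more programs are stored in the memory and may include one or more modules, each of which may include a series of computer-executable instructions for the disaster recovery device. The one or more programs are configured to be executed by one or more processors and contain computer-executable instructions for: synchronizing, while the production data center is running normally, the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site; and switching the first production service of the production site to the disaster recovery site when a failure of the production data center is detected.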
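In one implementation, the computer-executable instructions, when executed, may further cause the processor to: after the production service of the production site has been switched to the disaster recovery site, and when it is detected that the production data center has resumed operation, reverse-synchronize the second configuration information of the disaster recovery virtual machine on the disaster recovery site to the virtual machine to be reverse-synchronized on the production site; and switch the second production service of the disaster recovery site to the production site.
In one implementation, the computer-executable instructions, when executed, may further cause the processor to: determine the virtual machine to be synchronized; and synchronize the first configuration information to the determined virtual machine to be synchronized.
In one implementation, the computer-executable instructions, when executed, may further cause the processor to: obtain the preset virtual machine list of the disaster recovery system and the synchronization virtual machine list corresponding to the first virtual machine; update the synchronization virtual machine list according to the preset virtual machine list; and add the virtual machines in the synchronization virtual machine list to the list of virtual machines to be synchronized, which yields the virtual machines to be synchronized.
In one implementation, the computer-executable instructions, when executed, may further cause the processor to: add to the synchronization virtual machine list any virtual machine that is present in the preset virtual machine list but absent from the synchronization virtual machine list; and delete from the synchronization virtual machine list any virtual machine that is absent from the preset virtual machine list but present in the synchronization virtual machine list.
In one implementation, the computer-executable instructions, when executed, may further cause the processor to: determine the virtual machine to be reverse-synchronized; and synchronize the second configuration information to the determined virtual machine to be reverse-synchronized.
In one implementation, the computer-executable instructions, when executed, may further cause the processor to: obtain the preset virtual machine list of the disaster recovery system and the reverse synchronization virtual machine list corresponding to the disaster recovery virtual machine; update the reverse synchronization virtual machine list according to the preset virtual machine list; and add the virtual machines in the reverse synchronization virtual machine list to the list of virtual machines to be reverse-synchronized, which yields the virtual machines to be reverse-synchronized.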
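In one implementation, the first configuration information includes the resource configuration information of the first virtual machine; the resource configuration information includes at least one of site information, site pair information, protection group information, information on the virtual machines in a protection group, CPU information, memory information and network card information. The computer-executable instructions, when executed, may further cause the processor to: trigger the disaster recovery site, at a first preset frequency, to adjust the resources of the virtual machine to be synchronized according to the resource configuration information of the first virtual machine; monitor whether the resource adjustment performed by the disaster recovery site for the resource configuration information succeeded; and, if not, trigger the disaster recovery site again to adjust the resources of the virtual machine to be synchronized according to the resource configuration information of the first virtual machine.
In one implementation, the first configuration information includes the disk configuration information of the first virtual machine; the disk configuration information includes at least one of disk operation information, disk snapshot information, disk snapshot restoration information, virtual machine clone information, virtual machine backup information and virtual machine backup restoration information. The computer-executable instructions, when executed, may further cause the processor to: trigger the disaster recovery site to adjust the resources of the virtual machine to be synchronized according to the disk configuration information of the first virtual machine.
In one implementation, the computer-executable instructions, when executed, may further cause the processor to: compare the resource information of the disaster recovery virtual machine and the virtual machine to be reverse-synchronized; determine, from the comparison result, the differential resource information between the disaster recovery virtual machine and the virtual machine to be reverse-synchronized; and adjust the resources of the virtual machine to be reverse-synchronized according to the differential resource information.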
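FIG. 5 shows a schematic structural diagram of a disaster recovery system according to an embodiment of the present invention. As shown in FIG. 5, the disaster recovery system includes a production data center and a disaster recovery data center, which are in an active/standby relationship.
The production data center and the disaster recovery data center each include a resource operation system iROS, a disaster recovery management DRM, resource pools (an ordinary resource pool and a resource pool for disaster recovery) and storage repositories (an ordinary repository and a repository for disaster recovery).
The resource operation system iROS implements disaster recovery management: operations such as disaster recovery configuration, disaster recovery drills and disaster recovery switching can be carried out through the iROS operation management portal.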
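The disaster recovery management DRM includes TECS, the unified elastic computing system iECS and the virtualization management center VMC. TECS is a virtualization product based on the open-source KVM (Kernel-based Virtual Machine) virtualization technology, enhanced in aspects such as performance and real-time behavior; it provides virtualization management functions such as virtual machine lifecycle management, cluster management, dynamic resource scheduling and dynamic energy consumption management. The unified elastic computing system iECS and the virtualization management center VMC are used to implement resource synchronization between the production data center and the disaster recovery data center. The disaster recovery management DRM serves as the data center (DC) of each site.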
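The storage repository stores the resource data of each data center and, relying on the storage's own data synchronization technology, provides one-way data replication from the LUN devices of the production data center to the LUN devices of the disaster recovery data center. This data replication between the repositories achieves data synchronization between the production data center and the disaster recovery data center.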
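The embodiments of this application further provide a computer-readable storage medium that stores one or more programs. The one or more programs include instructions which, when executed by an electronic device that includes multiple application programs, cause the electronic device to perform the above disaster recovery method, and specifically to: synchronize, while the production data center is running normally, the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site; and switch the first production service of the production site to the disaster recovery site when a failure of the production data center is detected.
The systems, apparatuses, modules or units described in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. A typical implementation device is a computer. Specifically, the computer may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above apparatus is described in terms of functionally separate units. When this application is implemented, the functions of the units may of course be implemented in one or more pieces of software and/or hardware.
The embodiments of this application further provide a computer program product. The computer program product includes a computer program stored on a non-transitory computer-readable storage medium; the computer program includes program instructions which, when executed by a computer, cause the computer to perform the method of any of the above method embodiments.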
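Those skilled in the art will understand that the embodiments of this application may be provided as a method, a system, or a computer program product. Accordingly, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM and optical storage) containing computer-usable program code.
This application is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to the embodiments of this application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.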
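In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces and memory.
Memory may include non-permanent memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape and magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.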
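It should also be noted that the terms "include", "comprise" and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
This application may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures and the like that perform particular tasks or implement particular abstract data types. This application may also be practiced in distributed computing environments in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the corresponding parts of the method embodiments.
The above are merely embodiments of this application and are not intended to limit it. Those skilled in the art may make various modifications and changes to this application. Any modification, equivalent replacement or improvement made within the spirit and principles of this application shall fall within the scope of the claims of this application.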

Claims (13)

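  1. A disaster recovery method, applied to a disaster recovery system, wherein the disaster recovery system comprises a production data center and a disaster recovery data center, a production site is created in the production data center, and a disaster recovery site is created in the disaster recovery data center; the method comprising:
    while the production data center is running normally, synchronizing first configuration information of a first virtual machine on the production site to a virtual machine to be synchronized on the disaster recovery site;
    when a failure of the production data center is detected, switching a first production service of the production site to the disaster recovery site.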
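  2. The method according to claim 1, wherein, after the production service of the production site is switched to the disaster recovery site, the method further comprises:
    when it is detected that the production data center has resumed operation, reverse-synchronizing second configuration information of a disaster recovery virtual machine on the disaster recovery site to a virtual machine to be reverse-synchronized on the production site;
    switching a second production service of the disaster recovery site to the production site.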
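  3. The method according to claim 1, wherein synchronizing the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site comprises:
    determining the virtual machine to be synchronized;
    synchronizing the first configuration information to the determined virtual machine to be synchronized;
    wherein determining the virtual machine to be synchronized comprises:
    obtaining a preset virtual machine list of the disaster recovery system, and obtaining a synchronization virtual machine list corresponding to the first virtual machine;
    updating the synchronization virtual machine list according to the preset virtual machine list;
    adding the virtual machines in the synchronization virtual machine list to a list of virtual machines to be synchronized, to obtain the virtual machine to be synchronized.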
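  4. The method according to claim 2, wherein reverse-synchronizing the second configuration information of the disaster recovery virtual machine on the disaster recovery site to the virtual machine to be reverse-synchronized on the production site comprises:
    determining the virtual machine to be reverse-synchronized;
    synchronizing the second configuration information to the determined virtual machine to be reverse-synchronized;
    wherein determining the virtual machine to be reverse-synchronized comprises:
    obtaining a preset virtual machine list of the disaster recovery system, and obtaining a reverse synchronization virtual machine list corresponding to the disaster recovery virtual machine;
    updating the reverse synchronization virtual machine list according to the preset virtual machine list;
    adding the virtual machines in the reverse synchronization virtual machine list to a list of virtual machines to be reverse-synchronized, to obtain the virtual machine to be reverse-synchronized.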
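  5. The method according to claim 1, wherein the first configuration information comprises resource configuration information of the first virtual machine; the resource configuration information comprises at least one of site information, site pair information, protection group information, information on the virtual machines in a protection group, CPU information, memory information and network card information;
    synchronizing the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site comprises:
    triggering the disaster recovery site, at a first preset frequency, to adjust the resources of the virtual machine to be synchronized according to the resource configuration information of the first virtual machine;
    monitoring whether the resource adjustment performed by the disaster recovery site for the resource configuration information succeeded;
    if not, triggering the disaster recovery site again to adjust the resources of the virtual machine to be synchronized according to the resource configuration information of the first virtual machine.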
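  6. The method according to claim 1, wherein the first configuration information comprises disk configuration information of the first virtual machine; the disk configuration information comprises at least one of disk operation information, disk snapshot information, disk snapshot restoration information, virtual machine clone information, virtual machine backup information and virtual machine backup restoration information;
    synchronizing the first configuration information of the first virtual machine on the production site to the virtual machine to be synchronized on the disaster recovery site comprises:
    triggering the disaster recovery site to adjust the resources of the virtual machine to be synchronized according to the disk configuration information of the first virtual machine.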
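  7. The method according to claim 2, wherein reverse-synchronizing the second configuration information of the disaster recovery virtual machine on the disaster recovery site to the virtual machine to be reverse-synchronized on the production site comprises:
    comparing the resource information of the disaster recovery virtual machine and the virtual machine to be reverse-synchronized;
    determining, according to the result of the resource information comparison, differential resource information between the disaster recovery virtual machine and the virtual machine to be reverse-synchronized;
    adjusting the resources of the virtual machine to be reverse-synchronized according to the differential resource information.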
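  8. A disaster recovery apparatus, applied to a disaster recovery system, wherein the disaster recovery system comprises a production data center and a disaster recovery data center, a production site is created in the production data center, and a disaster recovery site is created in the disaster recovery data center; the apparatus comprising:
    a first synchronization module, configured to synchronize, while the production data center is running normally, first configuration information of a first virtual machine on the production site to a virtual machine to be synchronized on the disaster recovery site;
    a first switching module, configured to switch a first production service of the production site to the disaster recovery site when a failure of the production data center is detected.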
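  9. The apparatus according to claim 8, wherein the apparatus further comprises:
    a second synchronization module, configured to reverse-synchronize, after the production service of the production site is switched to the disaster recovery site and when it is detected that the production data center has resumed operation, second configuration information of a disaster recovery virtual machine on the disaster recovery site to a virtual machine to be reverse-synchronized on the production site;
    a second switching module, configured to switch a second production service of the disaster recovery site to the production site.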
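  10. A disaster recovery system, comprising a production data center and a disaster recovery data center, wherein a production site is created in the production data center and a disaster recovery site is created in the disaster recovery data center; the production data center and the disaster recovery data center each comprising:
    a disaster recovery module DRM, configured to synchronize, while the production data center is running normally, first configuration information of a first virtual machine on the production site to a virtual machine to be synchronized on the disaster recovery site;
    a resource operation system iROS, configured to switch a first production service of the production site to the disaster recovery site when a failure of the production data center is detected;
    a storage device, configured to store the first configuration information of the first virtual machine.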
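  11. The system according to claim 10, wherein the disaster recovery module DRM is further configured to reverse-synchronize, when it is detected that the production data center has resumed operation, second configuration information of a disaster recovery virtual machine on the disaster recovery site to a virtual machine to be reverse-synchronized on the production site;
    the resource operation system iROS is further configured to switch a second production service of the disaster recovery site to the production site;
    the storage device is further configured to store the second configuration information of the disaster recovery virtual machine.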
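  12. A disaster recovery device, applied to a disaster recovery system, wherein the disaster recovery system comprises a production data center and a disaster recovery data center, a production site is created in the production data center, and a disaster recovery site is created in the disaster recovery data center; the device comprising:
    a processor; and
    a memory arranged to store computer-executable instructions which, when executed, cause the processor to:
    while the production data center is running normally, synchronize first configuration information of a first virtual machine on the production site to a virtual machine to be synchronized on the disaster recovery site;
    when a failure of the production data center is detected, switch a first production service of the production site to the disaster recovery site.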
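  13. A storage medium for storing computer-executable instructions which, when executed, implement the following process:
    while a production data center is running normally, synchronizing first configuration information of a first virtual machine on a production site to a virtual machine to be synchronized on a disaster recovery site;
    when a failure of the production data center is detected, switching a first production service of the production site to the disaster recovery site.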
PCT/CN2019/118577 2018-12-29 2019-11-14 Disaster recovery method, apparatus and system WO2020134678A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811641851.9A CN111381931A (zh) 2018-12-29 2018-12-29 Disaster recovery method, apparatus and system
CN201811641851.9 2018-12-29

Publications (1)

Publication Number Publication Date
WO2020134678A1 (zh) 2020-07-02

Family

ID=71129674

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118577 WO2020134678A1 (zh) 2018-12-29 2019-11-14 Disaster recovery method, apparatus and system

Country Status (2)

Country Link
CN (1) CN111381931A (zh)
WO (1) WO2020134678A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
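CN112667153A (zh) * 2020-12-22 2021-04-16 军事科学院系统工程研究院网络信息研究所 Multi-site disaster recovery backup method based on distributed RAID slicing
CN112596951A (zh) * 2020-12-24 2021-04-02 深圳市科力锐科技有限公司 NAS data disaster recovery method, apparatus, device and storage medium
CN112860494A (zh) * 2021-02-25 2021-05-28 中国建设银行股份有限公司 Data center switching method and related device
CN115426251B (zh) * 2022-08-30 2024-02-13 山东海量信息技术研究院 Disaster recovery method, apparatus and medium for cloud hosts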

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
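CN101815099A (zh) * 2010-04-20 2010-08-25 中兴通讯股份有限公司 Method and apparatus for synchronizing configuration information of dual controllers in a dual-controller disk array
US20130315253A1 (en) * 2011-12-06 2013-11-28 Brocade Communications Systems, Inc. Lossless Connection Failover for Single Devices
CN103581177A (zh) * 2013-10-24 2014-02-12 华为技术有限公司 Virtual machine management method and apparatus
CN104794028A (zh) * 2014-01-16 2015-07-22 中国移动通信集团浙江有限公司 Disaster recovery processing method and apparatus, active data center and standby data center
CN105740049A (zh) * 2016-01-27 2016-07-06 杭州华三通信技术有限公司 Control method and apparatus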

Also Published As

Publication number Publication date
CN111381931A (zh) 2020-07-07


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19905574

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 09.11.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19905574

Country of ref document: EP

Kind code of ref document: A1