CN111158949A - Configuration method, switching method and device of disaster recovery architecture, equipment and storage medium - Google Patents

Configuration method, switching method and device of disaster recovery architecture, equipment and storage medium

Info

Publication number
CN111158949A
CN111158949A CN201811318053.2A CN201811318053A CN111158949A CN 111158949 A CN111158949 A CN 111158949A CN 201811318053 A CN201811318053 A CN 201811318053A CN 111158949 A CN111158949 A CN 111158949A
Authority
CN
China
Prior art keywords
available area
disaster recovery
cloud
ecs
production
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811318053.2A
Other languages
Chinese (zh)
Inventor
秦可
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Chongqing Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Chongqing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Chongqing Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN201811318053.2A priority Critical patent/CN111158949A/en
Publication of CN111158949A publication Critical patent/CN111158949A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14: Error detection or correction of the data by redundancy in operation
    • G06F11/1479: Generic software techniques for error detection or fault masking
    • G06F11/1489: Generic software techniques for error detection or fault masking through recovery blocks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Hardware Redundancy (AREA)
  • Stored Programmes (AREA)

Abstract

The invention discloses a configuration method, a switching method, and a device for a disaster recovery architecture, as well as equipment and a storage medium. The configuration method comprises the following steps: according to the user requirement specification, creating cloud hosts with the same specification in a production available area and a disaster recovery available area of a cloud disaster recovery architecture; binding the same service address to a cloud host ECS in the production available area and to the corresponding cloud host ECS' in the disaster recovery available area, and binding a management address IP to the ECS and a management address IP' to the ECS'; loading a first intelligent engine for application data synchronization in the ECS and the ECS', respectively; and setting the service address and the IP of the ECS to an activated state, setting the service address of the ECS' to an inactivated state, and setting the IP' to an activated state. The invention creates, in the disaster recovery environment, a cloud host consistent with the production environment for the user and binds the same service address to the corresponding cloud hosts, so that automatic, seamless disaster recovery switching can be achieved without modifying IP-address-related code or configuration files.

Description

Configuration method, switching method and device of disaster recovery architecture, equipment and storage medium
Technical Field
The present invention relates to the field of communications technologies, and in particular, to a configuration method of a disaster recovery architecture, a disaster recovery switching method and apparatus, a device, and a storage medium.
Background
At present, disaster recovery for service systems on the cloud is mainly implemented with a dual-active solution based on Global Server Load Balancing (GSLB). Disaster recovery here means protecting users' applications and data from faults and disasters so that they remain continuously usable. The solution binds cloud hosts (ECSs) in different available areas under one GSLB instance and deploys the service system application across those ECSs in a distributed manner, so that the failure of a single available area does not make the external service unavailable.
In addition, same-city data disaster recovery is implemented through database services that span multiple available areas. An available area (availability zone) refers to one or more data centers in the same region whose infrastructure, such as power and network, is isolated from that of other available areas.
However, a business system can adopt this disaster recovery architecture only if the following preconditions hold: (1) the public or private cloud provides global load balancing; (2) the cloud provides a database service that can synchronize data between different available areas; (3) the business system to be deployed on the cloud adopts a distributed + stateless architecture design.
It should be noted that, although the distributed + stateless architecture is an excellent high-availability design for Information Technology (IT) systems, a large number of IT systems still adopt a traditional architecture that is not distributed or is stateful. When such business systems are deployed in a public cloud or a private cloud (especially a public cloud), it is difficult to protect them with the dual-active solution based on global load balancing.
In addition, a large number of service systems depend on Internet Protocol (IP) addresses for their internal communication. For example, when data is transferred between servers over the File Transfer Protocol (FTP), the IP address of at least one of the two communicating parties must be fixed; otherwise, the code or configuration files that contain the IP address must be modified.
However, a dual-active disaster recovery system based on global load balancing usually requires the service system to be deployed in two available areas. Because the gateways of the two available areas differ, it is difficult for services in different available areas to use the same IP address when a disaster occurs, and the IP-address-related code or configuration usually has to be changed before the service system can run normally.
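The addressing constraint described above can be illustrated with a minimal sketch. The subnets and the fixed address below are hypothetical and do not come from the source; the sketch only shows that an address that is valid behind one available area's gateway normally falls outside the subnet served by the other area's gateway.

```python
import ipaddress

# Hypothetical subnets served by the gateways of the two available areas.
production_subnet = ipaddress.ip_network("10.1.0.0/24")
dr_subnet = ipaddress.ip_network("10.2.0.0/24")

# Hypothetical fixed address that an FTP peer has hard-coded.
service_ip = ipaddress.ip_address("10.1.0.10")

# The same address belongs to the production subnet but not to the
# disaster recovery subnet, so it cannot simply be reused after a switch.
valid_in_production = service_ip in production_subnet
valid_in_dr = service_ip in dr_subnet
```

Under these assumptions, a peer that hard-codes `10.1.0.10` would need a code or configuration change after switching, which is the problem the invention sets out to avoid.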
In summary, a method is required for realizing automatic seamless switching of disaster tolerance and improving disaster tolerance switching efficiency without modifying IP address related codes or configuration files.
Disclosure of Invention
The embodiment of the invention provides a configuration method of a disaster recovery architecture, a disaster recovery switching method, a device thereof, equipment thereof and a storage medium, which can realize the deployment of a production environment and a disaster recovery environment and realize the synchronization of data of the production environment to the disaster recovery environment.
In a first aspect, an embodiment of the present invention provides a method for configuring a disaster recovery architecture, where the method includes:
according to the specification of user requirements, respectively creating cloud hosts with the same specification in a production available area and a disaster recovery available area of a cloud disaster recovery architecture;
binding the same service address to a cloud host ECS of the production available area and to the corresponding cloud host ECS' of the disaster recovery available area, and binding a management address IP to the ECS and a management address IP' to the ECS';
loading a first intelligent engine for application data synchronization in the ECS and the ECS', respectively;
and setting the service address and the IP of the ECS to an activated state, setting the service address of the ECS' to an inactivated state, and setting the IP' to an activated state.
The configuration method of the disaster recovery architecture according to the present invention further comprises:
respectively creating cloud databases with the same specification in a production available area and a disaster recovery available area;
and respectively loading a second intelligent engine for data synchronization in the cloud database RDS of the production available area and the cloud database RDS' of the disaster tolerance available area.
The configuration method of the disaster recovery architecture on the cloud according to the present invention further includes:
and creating load balancing instances with the same specification in the production available area and the disaster recovery available area, setting the load balancing instance of the production available area to an activated state, and setting the load balancing instance of the disaster recovery available area to an inactivated state.
The configuration method of the disaster recovery architecture on the cloud according to the present invention further includes:
creating an elastic block storage EBS in the production available area, creating an elastic block storage EBS' with the same specification in the disaster recovery available area, mounting the storage of the ECS and the RDS on the EBS, and mounting the storage of the ECS' and the RDS' on the EBS'.
According to the configuration method of the disaster recovery architecture on the cloud of the present invention, before the cloud hosts with the same specification are respectively created in the production available area and the disaster recovery available area of the disaster recovery architecture on the cloud according to the specification of the user requirement, the method further includes:
and according to the disaster recovery requirement of the to-be-cloud service system, a cloud service deployment scheme aiming at the production available area and the disaster recovery available area of the to-be-cloud service system in the cloud is formulated.
According to the configuration method of the disaster recovery architecture on the cloud of the present invention, after the cloud hosts with the same specification are respectively created in the production available area and the disaster recovery available area of the disaster recovery architecture on the cloud according to the specification of the user requirement, the method further includes:
and the configuration data is synchronously issued in the production available area and the disaster recovery available area to complete the synchronization of the configuration data between the ECS and the ECS'.
According to the configuration method of the disaster recovery architecture on the cloud of the present invention, after the synchronization of the configuration data between the ECS and the ECS' is completed by the manner of synchronously issuing the configuration data in the production available area and the disaster recovery available area, the method further includes:
adding a sending end mark and a receiving end management address in the first intelligent engine of the ECS, and adding a receiving end mark and a sending end management address in the first intelligent engine of the ECS';
the first intelligent engine of the ECS overrides the data operation commands and Application Program Interfaces (APIs) of the operating system, generates a data operation log from the operations performed through those commands and APIs, and sends a data synchronization request to the ECS';
and the ECS' verifies whether the originating address of the request is an authorized address, establishes a connection between the ECS and the ECS' after the verification passes, and completes synchronization of the data operation log.
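The three steps above can be sketched as follows. This is a minimal in-memory illustration, not the patented engine: the class `SyncEngine`, its methods, the toy file store, and all addresses are hypothetical, and the "overriding" of operating system commands is reduced to a wrapped `write` call that logs each operation.

```python
class SyncEngine:
    """Hypothetical stand-in for the first intelligent engine."""

    def __init__(self, role, own_mgmt_ip, peer_mgmt_ip):
        self.role = role                  # "sender" or "receiver" mark
        self.own_mgmt_ip = own_mgmt_ip
        self.peer_mgmt_ip = peer_mgmt_ip  # management address of the other end
        self.log = []                     # data operation log
        self.sent = 0                     # number of log entries already pushed
        self.files = {}                   # toy file store: path -> content

    def write(self, path, content):
        # Wrapped data operation: apply it locally and record it in the log.
        self.files[path] = content
        self.log.append(("write", path, content))

    def accept(self, origin_ip, entries):
        # Receiver side: only the configured sender address is authorized.
        if self.role != "receiver" or origin_ip != self.peer_mgmt_ip:
            return False
        for op, path, content in entries:
            if op == "write":
                self.files[path] = content
        self.log.extend(entries)
        return True

    def push(self, receiver):
        # Sender side: send the not-yet-synchronized part of the log.
        pending = self.log[self.sent:]
        ok = receiver.accept(self.own_mgmt_ip, pending)
        if ok:
            self.sent = len(self.log)
        return ok


ecs = SyncEngine("sender", "192.168.1.10", "192.168.2.10")
ecs_dr = SyncEngine("receiver", "192.168.2.10", "192.168.1.10")
ecs.write("/data/orders.txt", "order-1")
synced = ecs.push(ecs_dr)

# A request from an address that is not the configured sender is refused.
rogue = SyncEngine("sender", "172.16.0.99", "192.168.2.10")
rogue.write("/data/orders.txt", "tampered")
rejected = not rogue.push(ecs_dr)
```

The check in `accept` corresponds to the authorized-address verification; a real engine would also need transport security and conflict handling, which the source does not detail.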
According to the configuration method of the disaster recovery architecture on the cloud of the present invention, after the cloud databases with the same specification are respectively created in the production available area and the disaster recovery available area, the method further includes:
adding a sending end mark and a receiving end management address in the second intelligent engine of the RDS, and adding a receiving end mark and a sending end management address in the second intelligent engine of the RDS';
the second intelligent engine of the RDS overrides the data operation commands and Application Program Interfaces (APIs) of the database software and generates a data execution log from the operations performed through those commands and APIs, and the RDS sends a data synchronization request to the RDS';
the RDS' verifies whether the originating address of the request is an authorized address, establishes a connection between the RDS and the RDS' after the verification passes, and completes synchronization of the data execution log.
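As a rough illustration of execution-log synchronization, the sketch below wraps a database connection so that every write statement is recorded and can be replayed on a replica. It uses Python's built-in `sqlite3` purely as a stand-in for the cloud database; the class `LoggedDatabase` and the table are hypothetical, and the address verification step is omitted for brevity.

```python
import sqlite3


class LoggedDatabase:
    """Toy wrapper that records every write statement in an execution log."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.execution_log = []

    def execute(self, sql, params=()):
        cursor = self.conn.execute(sql, params)
        self.conn.commit()
        # Only statements that change data need to reach the replica.
        if not sql.lstrip().upper().startswith("SELECT"):
            self.execution_log.append((sql, tuple(params)))
        return cursor

    def replay(self, entries):
        # Replica side: apply the received execution log in order.
        for sql, params in entries:
            self.conn.execute(sql, params)
        self.conn.commit()


rds = LoggedDatabase()      # stands in for RDS (sending end)
rds_dr = LoggedDatabase()   # stands in for RDS' (receiving end)
rds.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
rds.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
rds_dr.replay(rds.execution_log)
rows = rds_dr.conn.execute("SELECT name FROM users").fetchall()
```

Replaying statements in order keeps the replica consistent only if the statements are deterministic, which is an assumption of this sketch rather than a claim of the source.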
According to the configuration method of the disaster recovery architecture on the cloud, after the load balancing instances with the same specification are created in the production available area and the disaster recovery available area, the method further includes:
completing the load balancing policy configuration on the load balancing instance of the production available area, and reporting the load balancing policy configuration information to a Software Defined Network (SDN) controller of the production available area;
the SDN controller of the production available area generates a load balancing configuration metadata message from the load balancing policy configuration information, and synchronizes the metadata message to the SDN controller of the disaster recovery available area;
and the SDN controller of the disaster recovery available area issues the metadata message to the load balancing instance of the disaster recovery available area, which completes the synchronization of production data between the load balancing instances of the production available area and the disaster recovery available area.
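The relay of load balancing configuration through the two SDN controllers can be sketched as below. The classes, the JSON message layout, and the policy fields are all hypothetical; the source does not specify the format of the metadata message.

```python
import json


class LoadBalancer:
    def __init__(self, active):
        self.active = active
        self.policy = {}


class SdnController:
    """Hypothetical controller that relays load balancing metadata."""

    def __init__(self, zone, load_balancer):
        self.zone = zone
        self.load_balancer = load_balancer
        self.peer = None   # controller of the other available area

    def report_policy(self, policy):
        # Production side: build a metadata message and synchronize it.
        message = json.dumps({"source_zone": self.zone, "policy": policy})
        if self.peer is not None:
            self.peer.receive(message)
        return message

    def receive(self, message):
        # Disaster recovery side: issue the metadata to the local instance.
        self.load_balancer.policy = json.loads(message)["policy"]


lb = LoadBalancer(active=True)
lb_dr = LoadBalancer(active=False)
sdn = SdnController("production", lb)
sdn_dr = SdnController("disaster-recovery", lb_dr)
sdn.peer = sdn_dr

# Configure the production instance, then report the configuration.
lb.policy = {"algorithm": "round_robin", "backends": ["10.0.0.10", "10.0.0.11"]}
sdn.report_policy(lb.policy)
```

Note that the disaster recovery instance receives the policy while remaining inactive, so it is ready to take over without further configuration.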
In a second aspect, an embodiment of the present invention provides a disaster recovery switching method, for a cloud disaster recovery architecture configured according to the method described above, including:
when the production available area is judged to be unavailable, the production environment identification of the production available area is cancelled, and the production environment identification is added to the disaster recovery available area;
adding a sending end mark and a receiving end management address in a second intelligent engine of a cloud database of a disaster recovery available area;
adding a sending end mark and a receiving end management address in a first intelligent engine of a cloud host of a disaster tolerance available area, informing a Software Defined Network (SDN) controller of the disaster tolerance available area, and activating a service address of the cloud host of the disaster tolerance available area;
and notifying the SDN controller of the disaster recovery available area to activate the load balancing instance of the disaster recovery available area.
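The switching steps listed above can be sketched as a function over two plain dictionaries. The field names, the helper `make_zone`, and the addresses are hypothetical, and the notifications to the SDN controller are reduced to flag updates.

```python
def make_zone(is_production, ecs_mgmt_ip, rds_mgmt_ip):
    """Build a hypothetical record describing one available area."""
    role = "sender" if is_production else "receiver"
    return {
        "is_production": is_production,
        "ecs_mgmt_ip": ecs_mgmt_ip,
        "rds_mgmt_ip": rds_mgmt_ip,
        "ecs_engine": {"role": role, "peer_mgmt_ip": None},
        "rds_engine": {"role": role, "peer_mgmt_ip": None},
        "service_address_active": is_production,
        "load_balancer_active": is_production,
    }


def fail_over(prod, dr):
    # Step 1: move the production environment identification.
    prod["is_production"] = False
    dr["is_production"] = True
    # Step 2: the database engine of the DR area becomes the sending end.
    dr["rds_engine"] = {"role": "sender", "peer_mgmt_ip": prod["rds_mgmt_ip"]}
    # Step 3: the host engine becomes the sending end and the SDN
    # controller activates the service address of the DR cloud host.
    dr["ecs_engine"] = {"role": "sender", "peer_mgmt_ip": prod["ecs_mgmt_ip"]}
    dr["service_address_active"] = True
    # Step 4: the SDN controller activates the DR load balancing instance.
    dr["load_balancer_active"] = True


prod = make_zone(True, "192.168.1.10", "192.168.1.20")
dr = make_zone(False, "192.168.2.10", "192.168.2.20")
fail_over(prod, dr)
```

Because the service address bound in the disaster recovery area is the same as in production, clients keep using the address they already know once it is activated.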
According to the disaster recovery switching method, after notifying the SDN controller of the disaster recovery available area to activate the load balancing instance of the disaster recovery available area, the method further includes:
when the production available area is recovered to be normal, informing an SDN controller of the production available area, setting a service address and a load balancing instance of a cloud host of the production available area to be in an inactive state, and adding a receiving end mark and a sending end management address in a second intelligent engine of a cloud database of the production available area and a first intelligent engine of the cloud host to realize data synchronization with the disaster recovery available area.
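A minimal sketch of this recovery step, under the assumption that each available area is described by a plain dictionary (the field names and addresses are hypothetical): the recovered production area rejoins as the standby side and receives data from the area that is currently serving.

```python
def rejoin_as_standby(recovered, serving):
    # The SDN controller of the recovered area deactivates its service
    # address and load balancing instance.
    recovered["service_address_active"] = False
    recovered["load_balancer_active"] = False
    # Both engines get a receiving end mark and the sender's management address.
    recovered["rds_engine"] = {"role": "receiver",
                               "peer_mgmt_ip": serving["rds_mgmt_ip"]}
    recovered["ecs_engine"] = {"role": "receiver",
                               "peer_mgmt_ip": serving["ecs_mgmt_ip"]}
    return recovered


serving_dr = {"ecs_mgmt_ip": "192.168.2.10", "rds_mgmt_ip": "192.168.2.20"}
recovered_prod = {
    "ecs_mgmt_ip": "192.168.1.10",
    "rds_mgmt_ip": "192.168.1.20",
    "service_address_active": True,
    "load_balancer_active": True,
}
rejoin_as_standby(recovered_prod, serving_dr)
```

Deactivating the recovered area's service address first avoids two hosts answering on the same service address at once.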
In a third aspect, an embodiment of the present invention provides a configuration device for a disaster recovery architecture on a cloud, where the configuration device includes:
a first creating module, used for creating cloud hosts with the same specification in a production available area and a disaster recovery available area of a disaster recovery architecture on the cloud according to a user requirement specification;
a binding module, used for binding the same service address to the cloud host ECS of the production available area and to the corresponding cloud host ECS' of the disaster recovery available area, and binding a management address IP to the ECS and a management address IP' to the ECS';
a first loading module, used for loading a first intelligent engine for application data synchronization in the ECS and the ECS', respectively;
and a setting module, used for setting the service address and the IP of the ECS to an activated state, setting the service address of the ECS' to an inactivated state, and setting the IP' to an activated state.
In a fourth aspect, an embodiment of the present invention provides a disaster recovery switching device, where the switching device includes:
a judging module, used for canceling the production environment identification of the production available area and adding the production environment identification to the disaster recovery available area when the production available area is judged to be unavailable;
a first adding module, used for adding a sending end mark and a receiving end management address in a second intelligent engine of a cloud database of the disaster recovery available area;
a second adding module, used for adding a sending end mark and a receiving end management address in a first intelligent engine of a cloud host of the disaster recovery available area, notifying a Software Defined Network (SDN) controller of the disaster recovery available area, and activating the service address of the cloud host of the disaster recovery available area;
and an activation module, used for notifying the SDN controller of the disaster recovery available area and activating the load balancing instance of the disaster recovery available area.
In a fifth aspect, an embodiment of the present invention provides a device in a cloud disaster recovery architecture, where the device includes: at least one processor, at least one memory, and computer program instructions stored in the memory, which when executed by the processor, implement the method of the first aspect of the embodiments described above or the method of the second aspect of the embodiments described above.
In a sixth aspect, embodiments of the present invention provide a computer-readable storage medium, on which computer program instructions are stored, which, when executed by a processor, implement the method of the first aspect in the above-mentioned implementation mode or the method of the second aspect in the above-mentioned implementation mode.
According to the scheme provided by the invention, when the user selects the disaster tolerance capability provided by the cloud platform, the cloud host consistent with the production environment is automatically created for the user in the disaster tolerance environment, the same service addresses are respectively bound in the cloud host of the production available area and the corresponding cloud host of the disaster tolerance available area, and the intelligent engines are respectively loaded in the two available areas, so that when subsequent disaster tolerance switching is carried out, the related codes or configuration files of IP addresses do not need to be modified, the automatic seamless switching of disaster tolerance can be realized, and the disaster tolerance switching efficiency is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments of the present invention will be briefly described below, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a flowchart illustrating a configuration method of a disaster recovery architecture on a cloud based on a software solution according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a dual-active disaster recovery system based on global load balancing according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a cloud disaster recovery system based on a software solution according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a cloud disaster recovery function architecture based on a software solution according to an embodiment of the present invention;
Fig. 5 is a schematic diagram illustrating a load balancing data synchronization mechanism according to an embodiment of the present invention;
Fig. 6 is a schematic diagram illustrating an application data synchronization mechanism of a cloud host according to an embodiment of the present invention;
Fig. 7 is a schematic diagram illustrating a data synchronization mechanism of a cloud database according to an embodiment of the present invention;
Fig. 8 is a flowchart illustrating a disaster recovery switching method according to an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a configuration device of a disaster recovery architecture based on a software solution according to an embodiment of the present invention;
Fig. 10 is a schematic structural diagram of a disaster recovery switching device according to an embodiment of the present invention;
Fig. 11 is a hardware configuration diagram of the apparatus according to an embodiment of the present invention.
Detailed Description
Features and exemplary embodiments of various aspects of the present invention will be described in detail below, and in order to make objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not to be construed as limiting the invention. It will be apparent to one skilled in the art that the present invention may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present invention by illustrating examples of the present invention.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The invention aims to automatically create a cloud host consistent with a production environment for a user in a disaster recovery environment when the user selects disaster recovery capability provided by a cloud platform, and respectively bind the same service address in the cloud host of a production available area and the corresponding cloud host of the disaster recovery available area, and respectively load an intelligent engine in the two available areas, so that when subsequent disaster recovery switching is carried out, IP address related codes or configuration files do not need to be modified, automatic seamless switching of disaster recovery can be realized, and the disaster recovery switching efficiency is improved. Various aspects of the invention are described in detail below.
< public cloud >
A public cloud generally refers to a shared resource service such as computing power, storage power, network power, database power, etc., which is provided by a third-party provider for unspecified users and can be directly accessed through the internet.
< private cloud >
A private cloud is a proprietary resource of computing, storage, networks, databases, etc. that is built for individual use by a client.
< elastic computing server >
An Elastic Computing Server (ECS), sometimes also referred to as a cloud host, is a simple and efficient computing service with elastically scalable processing capability. It helps users quickly build more stable and secure applications, improves operation and maintenance efficiency, reduces Information Technology (IT) cost, and lets users concentrate on core business innovation.
< global server load balancing >
Global Server Load Balancing (GSLB) distributes traffic among servers in different regions of a wide area network (including the Internet) and ensures that each client is served by the best server, typically the one nearest to it, thereby ensuring access quality.
< server load balancing >
Server Load Balancing (SLB) distributes traffic to multiple ECSs to improve the service capability of an application system, and has long served as the entry point of key service systems.
< relational database service >
A Relational Database Service (RDS), sometimes abbreviated RDB for relational database, is an online database service that is ready to use, stable, reliable, and flexible. It has multiple security protection measures and a complete performance monitoring system, and provides professional database backup, recovery, and optimization schemes.
< elastic Block storage >
Elastic Block Store (EBS) is a Block level data Store provided for cloud server instances.
< software-defined network controller >
A Software Defined Network (SDN) controller is an application in a Software Defined Network and is responsible for flow control to ensure an intelligent Network.
Based on the above, an embodiment of the present invention may provide a method for configuring a cloud disaster recovery architecture based on a software scheme, and referring to fig. 1, fig. 1 shows a schematic flow chart of a method 100 for configuring a cloud disaster recovery architecture based on a software scheme according to an embodiment of the present invention, where the method includes:
S110: according to the user requirement specification, creating cloud hosts with the same specification in a production available area and a disaster recovery available area of a disaster recovery architecture on the cloud;
S120: binding the same service address to the cloud host ECS of the production available area and to the corresponding cloud host ECS' of the disaster recovery available area, and binding a management address IP to the ECS and a management address IP' to the ECS';
S130: loading a first intelligent engine for application data synchronization in the ECS and the ECS', respectively;
S140: setting the service address and the IP of the ECS to an activated state, setting the service address of the ECS' to an inactivated state, and setting the IP' to an activated state.
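The four configuration steps can be sketched as a small data model. The classes, the engine name, the specification string, and the addresses are hypothetical; the sketch only shows the resulting state, in which both hosts carry the same service address but only the production host has it activated.

```python
from dataclasses import dataclass, field


@dataclass
class Address:
    ip: str
    active: bool = False


@dataclass
class CloudHost:
    zone: str
    spec: str
    service: Address       # identical in both available areas
    management: Address    # unique per host
    engines: list = field(default_factory=list)


def configure_pair(spec, service_ip, mgmt_ip, mgmt_ip_dr):
    # Steps 1 and 2: create hosts of the same specification and bind the
    # same service address plus a distinct management address to each.
    ecs = CloudHost("production", spec, Address(service_ip), Address(mgmt_ip))
    ecs_dr = CloudHost("disaster-recovery", spec,
                       Address(service_ip), Address(mgmt_ip_dr))
    # Step 3: load the first intelligent engine on both hosts.
    for host in (ecs, ecs_dr):
        host.engines.append("app-data-sync")
    # Step 4: activate service and management addresses on ECS, but only
    # the management address on ECS'.
    ecs.service.active = True
    ecs.management.active = True
    ecs_dr.service.active = False
    ecs_dr.management.active = True
    return ecs, ecs_dr


ecs, ecs_dr = configure_pair("4c8g", "10.0.0.10", "192.168.1.10", "192.168.2.10")
```

Keeping the management addresses distinct and always active lets the two hosts synchronize data even while the standby service address is inactive.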
By utilizing the scheme provided by the invention, when the user selects the disaster tolerance capability provided by the cloud platform, the cloud host consistent with the production environment can be automatically created for the user in the disaster tolerance environment, the same service addresses are respectively bound in the cloud host of the production available area and the corresponding cloud host of the disaster tolerance available area, and the intelligent engines are respectively loaded in the two available areas, so that the automatic seamless switching of disaster tolerance can be realized without modifying IP address related codes or configuration files during the subsequent disaster tolerance switching, and the disaster tolerance switching efficiency is improved.
The following describes, by way of specific examples, alternative specific processes of embodiments of the present invention. It should be noted that the scheme of the present invention does not depend on a specific software scheme, and in practical applications, any known or unknown software, algorithm, program, or any combination thereof may be used to implement the scheme of the present invention, and the scheme of the present invention is within the protection scope of the present invention as long as the essential idea of the scheme of the present invention is adopted.
Referring to fig. 2, fig. 2 is a schematic structural diagram illustrating a dual-active disaster recovery system based on global load balancing according to an embodiment of the present invention.
At present, disaster recovery for service systems on the cloud is mainly implemented with a dual-active solution based on Global Server Load Balancing (GSLB).
In the dual-active solution, two data centers provide production services externally at the same time. The two data centers are peers with no master/slave distinction and can both host services, which greatly improves resource utilization as well as the working efficiency and performance of the system.
As an example, referring to fig. 2, cloud hosts (ECSs) in different available areas are bound under one global load balancing instance, and the service system application is deployed across those ECSs in a distributed manner, so that the failure of a single available area does not make the external service unavailable. Same-city data disaster recovery is implemented through a Relational Database Service (RDS) that spans multiple available areas.
An available area (Availability Zone) refers to one or more data centers in the same region whose infrastructure, such as power and network, is isolated from that of other available areas.
It should be noted that, in the above dual-active solution, the precondition for the service system to adopt the disaster-tolerant architecture includes the following items: first, a public cloud or a private cloud can provide global load balancing; second, public or private clouds can provide RDS that can enable data synchronization between different available zones; third, the cloud-based business system must adopt a distributed + stateless architecture design.
The public cloud refers to shared resource services such as computing power, storage power, network power, database power and the like which are provided by a third-party provider for unspecified users and can be directly accessed through the Internet. And the private cloud refers to a proprietary resource service such as computing capacity, storage capacity, network capacity, database capacity and the like which is constructed for being used by one customer alone.
In addition, whether the cloud is public or private, there is a common application scenario: providing disaster tolerance capability for the service systems deployed on the cloud, so as to improve their service continuity when a disaster occurs.
As shown in fig. 2, the dual-active disaster recovery system based on global load balancing provides a good disaster recovery scheme for moving service systems with a distributed + stateless architecture to the cloud, but the scheme currently has the following technical defects:
First, the distributed + stateless architecture is a good high-availability design for Information Technology (IT) systems. However, a large number of IT systems still adopt a non-distributed or stateful traditional architecture. When such a system is deployed in a public cloud or a private cloud, especially in a public cloud, it is difficult to protect the service system with a dual-active disaster recovery system based on global load balancing.
A stateless service is a service that keeps no per-request state. The Web server processes every request uniformly and without distinction: it is only responsible for processing the data submitted with each user request and returning the result. The Web server itself does not store any data related to the request (such as an IP address, an account number or a password); such data is stored in a background database server or cluster. When any one Web server fails, the other servers can obtain the information from the background database server or cluster, so processing of the request continues unaffected.
Stateful refers to the retention of previously requested information in the server for processing of the current request. Therefore, when any one Web server fails, the other Web servers need to process the request from the beginning because they do not have any data associated with the request.
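The difference between the two designs can be illustrated with a minimal sketch. The class names, the shopping-cart data and the in-memory dictionary standing in for the background database are all hypothetical and are used only for illustration.

```python
class StatefulServer:
    """Hypothetical Web server that keeps request-related data in its own memory."""

    def __init__(self):
        self.sessions = {}

    def handle(self, user, item):
        cart = self.sessions.setdefault(user, [])
        cart.append(item)
        return list(cart)


class StatelessServer:
    """Hypothetical Web server that keeps nothing locally.

    All request-related data lives in a shared store, which stands in for
    the background database server or cluster.
    """

    def __init__(self, store):
        self.store = store

    def handle(self, user, item):
        cart = self.store.setdefault(user, [])
        cart.append(item)
        return list(cart)


def demo():
    # Stateful case: the first server "fails" after one request, and the
    # second server has no record of what the user did before.
    first, second = StatefulServer(), StatefulServer()
    first.handle("alice", "book")
    stateful_result = second.handle("alice", "pen")

    # Stateless case: both servers read and write the same shared store,
    # so the second server continues where the first one stopped.
    shared_store = {}
    third, fourth = StatelessServer(shared_store), StatelessServer(shared_store)
    third.handle("alice", "book")
    stateless_result = fourth.handle("alice", "pen")
    return stateful_result, stateless_result
```

In this sketch the stateful pair loses the earlier request when the second server takes over, while the stateless pair does not, which is why the stateless design fits a dual-active deployment more naturally.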
A distributed system has more than one server and is a software system built on top of a network. Because of the nature of software, distributed systems have a high degree of cohesion and transparency. Cohesion means that each database distribution node is highly autonomous and has a local database management system; transparency means that each database distribution node is transparent to the user's application, which cannot tell whether a node is local or remote. In a distributed database system, a user does not perceive that data is distributed, i.e., the user does not need to know whether a relation is split, whether there is a copy, where the data resides, or on which site a transaction is executed.
In summary, the stateless + distributed architecture is a very good high-availability design for IT systems. In current practice, however, apart from some large internet companies that have adopted the stateless architecture, a great number of IT systems still adopt the stateful traditional architecture, so the applicability of the stateless approach is limited to a certain extent.
Second, a large number of service systems depend on Internet Protocol (IP) addresses for internal communication. For example, when data is transferred between servers by File Transfer Protocol (FTP), the IP address of at least one of the two communicating parties must be fixed; otherwise, the code or configuration files that reference the IP address need to be modified.
However, a dual-active disaster recovery system based on global load balancing usually needs to deploy the service system in two available areas, which can be understood simply as two different machine rooms. Because the gateways of the two available areas are different, it is difficult for the services of the different available areas to use the same IP address when a disaster occurs, and the code or configuration that references the IP address often has to be changed before the service system can run normally.
In summary, to solve the above technical problems, embodiments of the present invention provide a method and an apparatus for implementing cloud disaster tolerance based on a Software Defined Network (SDN). The core of the scheme is that the network is redefined in software, the disaster tolerance capability of the service system is managed in a unified manner by a cloud disaster tolerance intelligent control device (hereinafter referred to as the intelligent management and control device), and data synchronization is realized by an intelligent engine. When a user selects the disaster tolerance capability provided by a cloud platform, the intelligent management and control device automatically creates for the user, in the disaster tolerance environment, services that are completely consistent with the production environment, including a cloud host (with system configurations such as the Central Processing Unit (CPU), memory size, operating system and IP address), block storage, a database and load balancing, and at the same time starts the intelligent engine to synchronize data to the disaster tolerance side. When disaster tolerance is triggered, the system automatically activates the disaster tolerance environment, which inherits configurations such as the IP address of the original production environment, so that the service continuity of the service system is ensured without changing the system configuration.
Referring to fig. 3, fig. 3 is a schematic structural diagram illustrating a cloud disaster recovery system based on a software scheme according to an embodiment of the present invention.
In fig. 3, the cloud disaster recovery system according to the embodiment of the present invention includes an available area a and an available area B, where the available area a serves as a production environment; the available area B is used as disaster recovery environment and is not activated.
As an example, in the disaster recovery architecture shown in FIG. 3, the available area A includes the cloud products SLB, ECS-1, ECS-2, RDS and EBS. In the available area A, the service system distributes requests to ECS-1 and ECS-2 through the load balancing SLB, and the various types of state data and service data are stored in the EBS, so that the available area A can be used as a production environment; ECS-1, ECS-2 and the RDS all use the EBS to store data. Meanwhile, the same cloud services as in the available area A, including SLB, ECS-1', ECS-2', RDS and EBS, are automatically created in the available area B, so that the available area B can be used as a disaster recovery environment. The data of the available area B is kept consistent with that of the available area A, while the available area B remains in a to-be-activated state. When the cloud disaster tolerance intelligent control device detects that the disaster recovery switching condition is met, it activates the services of the available area B and provides them to the outside, thereby guaranteeing service continuity. In other words, the available area B stays in the to-be-activated state until the cloud disaster tolerance intelligent control device detects that the disaster recovery switching condition is met.
The disaster recovery architecture provided by the embodiment of the invention can be offered as a capability of a public cloud or a private cloud, and provides a general disaster recovery solution for a service system of any architecture that is moved to the cloud, thereby supplementing the dual-active disaster recovery system.
Referring to fig. 4, fig. 4 is a schematic structural diagram illustrating a cloud disaster recovery functional architecture based on a software scheme according to an embodiment of the present invention.
In fig. 4, a cloud disaster recovery function architecture according to an embodiment of the present invention includes a cloud management platform, an intelligent management and control device, and an available area a and an available area B. The intelligent management and control device comprises a disaster tolerance function management and scheduling module, a cloud management platform scheduling module and an intelligent engine management and scheduling module. The available area A is used as a production environment, and the available area B is used as a disaster recovery environment. And each available zone includes an SDN controller, a computing resource pool, a network resource pool, a database resource pool, and a storage resource pool. The resource pool refers to a collection of various hardware and software involved in the cloud computing data center. A plurality of ECSs may be included in the pool of computing resources, the ECSs including an intelligent engine therein. For example, the pool of computing resources of available zone A may include ECS-1, ECS-2. The computing resource pool of available zone B may include ECS-1 ', ECS-2'; the network resource pool may include an SLB; the database resource pool can comprise a plurality of RDSs, and the RDS comprises an intelligent engine; the storage resource pool may include EBSs.
The core functions of the cloud disaster recovery architecture shown in fig. 4 are mainly implemented by an intelligent management and control device, an SDN controller, and an intelligent engine.
The intelligent management and control device mainly realizes application layer requirement analysis, control layer instruction issuing and disaster tolerance intelligent management; the SDN controller mainly realizes automatic configuration of network functions according to instructions of the intelligent management and control device; the intelligent engine mainly realizes the production data modularization synchronization function.
< automated deployment of cloud disaster tolerance environment >
The cloud service requirement analysis and logic control according to the embodiment of the present invention will be described with reference to fig. 4.
First, a user submits a cloud service requirement through a Cloud Management Platform (CMP), and the CMP forwards the cloud service requirement to the intelligent management and control device. The CMP is a product that provides integrated management of public clouds, private clouds and hybrid clouds.
Secondly, when the intelligent management and control device receives the cloud service requirement, the disaster recovery function management and scheduling module analyzes the cloud service requirement, and notifies the formulated deployment scheme to the cloud management platform for execution through the cloud management platform scheduling module. Specific deployment scenarios are discussed below.
In the first step, a deployment scheme of the available areas is formulated. As an example, the "disaster recovery function management and scheduling module" determines the deployment scenario of the available areas according to the disaster recovery requirement. In an embodiment of the present invention, a dual-available-area deployment scheme is adopted: the "cloud management platform scheduling module" notifies the CMP to select an appropriate available area A and an appropriate available area B. A special identifier is added to the available area A to indicate that it currently serves as the production environment; no identifier is added to the available area B, which indicates that it currently serves as the disaster recovery environment.
In the second step, a deployment scheme of the cloud hosts is formulated.
As an example, first, according to the user requirement specification, wherein the user requirement specification may include, for example, a Central Processing Unit (CPU), a memory, etc., the "cloud management platform scheduling module" notifies the CMP to create two cloud hosts, including ECS-1 and ECS-2, in the available area A, and also create two cloud hosts, including ECS-1 'and ECS-2', in the available area B according to the same specification, as shown in FIG. 4.
Second, the intelligent management and control device notifies the CMP to allocate two production IP addresses (e.g., service IP1 and service IP2) and four built-in management IP addresses (e.g., management IP1, management IP2, management IP1' and management IP2'). The intelligent management and control device then notifies the SDN controller of the available area A and the SDN controller of the available area B to complete the IP address binding.
For example, ECS-1 is bound to service IP1 and management IP1, and ECS-2 is bound to service IP2 and management IP2, with both the service IPs and the management IPs activated. ECS-1' is bound to service IP1 and management IP1', and ECS-2' is bound to service IP2 and management IP2', with the service IPs not activated and the management IPs activated.
Third, the "cloud management platform scheduling module" notifies the ECSs of each available area to load the intelligent engine, which completes the deployment of the ECSs.
In the third step, a deployment scheme of the load balancing instance SLB (which may simply be called load balancing) is formulated. As an example, the "cloud management platform scheduling module" notifies the CMP to create a load balancing instance in each of the two available areas. The load balancing instance of the available area A is activated, and the load balancing instance of the available area B is not activated.
In the fourth step, a deployment scheme of the cloud database RDS is formulated. The "cloud management platform scheduling module" notifies the CMP to create a database in the available area A and to create a database of the same specification in the available area B. The intelligent engine is loaded in each of the two databases.
In the fifth step, a deployment scheme of the elastic block storage EBS is formulated. The "cloud management platform scheduling module" notifies the CMP to create an elastic block store in the available area A and to create an elastic block store of the same specification in the available area B. Each elastic block store is then mounted to the cloud hosts and the cloud database of the available area to which it belongs.
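The five deployment steps above can be summarized with a minimal sketch, assuming a simple in-memory model. The class names, field names and IP address values are hypothetical; a real deployment would issue these requests to the CMP and the SDN controllers rather than build local objects.

```python
from dataclasses import dataclass, field


@dataclass
class CloudHost:
    name: str
    service_ip: str
    management_ip: str
    service_ip_active: bool
    management_ip_active: bool = True   # management IPs are active in both areas
    engine_loaded: bool = False


@dataclass
class AvailableArea:
    name: str
    production: bool                    # True = carries the production identifier
    hosts: list = field(default_factory=list)
    slb_active: bool = False
    rds_engine_loaded: bool = False
    ebs_mounted_to: list = field(default_factory=list)


def build_plan(service_ips, management_ips_a, management_ips_b):
    # Step 1: choose two areas and mark only area A as production.
    area_a = AvailableArea("A", production=True)
    area_b = AvailableArea("B", production=False)
    layout = ((area_a, management_ips_a, ""), (area_b, management_ips_b, "'"))
    for area, management_ips, suffix in layout:
        # Step 2: same host specification and same service IPs in both areas;
        # the service IP is active only in the production area.
        pairs = zip(service_ips, management_ips)
        for index, (service_ip, management_ip) in enumerate(pairs, start=1):
            area.hosts.append(CloudHost(
                name=f"ECS-{index}{suffix}",
                service_ip=service_ip,
                management_ip=management_ip,
                service_ip_active=area.production,
                engine_loaded=True,
            ))
        # Step 3: the load balancing instance is active only in production.
        area.slb_active = area.production
        # Step 4: a database with a loaded intelligent engine in each area.
        area.rds_engine_loaded = True
        # Step 5: the EBS is mounted to the hosts and database of its own area.
        area.ebs_mounted_to = [host.name for host in area.hosts] + ["RDS"]
    return area_a, area_b
```

Because both areas carry the same service IP addresses and differ only in activation state, a later switch only has to flip activation flags and does not need to touch application code or configuration files.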
Based on the above examples, it can be understood that with the scheme provided by the present invention, when the user selects the disaster tolerance capability provided by the cloud platform, services consistent with the production environment, such as cloud hosts, a cloud database and a load balancing instance, are automatically created for the user in the disaster tolerance environment, and the intelligent engine is started at the same time. As a result, a subsequent disaster tolerance switch requires no modification of code or configuration files that reference IP addresses, so automatic seamless switching can be realized and the disaster tolerance switching efficiency is improved.
In addition, the scheme provided by the invention has no limitation and requirement on the cloud system architecture, and can provide disaster recovery solutions for the cloud systems of various architectures. And the deployment of the production environment and the disaster recovery environment can be automatically realized in the selected two data centers based on the software scheme.
< data synchronization mechanism for load balancing >
Referring to fig. 5, fig. 5 is a schematic diagram illustrating a data synchronization mechanism of a load balancing instance according to an embodiment of the present invention.
First, a user completes the load balancing policy configuration on the load balancing instance of the available area A, and the related configuration information is automatically reported to the SDN controller of the available area A.
Second, the SDN controller of the available area A automatically generates a load balancing configuration metadata message according to a standard configuration template and synchronizes the message to the SDN controller of the available area B.
Third, the SDN controller of the available area B issues the received load balancing configuration metadata message to the load balancing instance of the available area B, which completes the synchronization of the load balancing production data.
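The three steps above can be sketched as follows. The controller class, the JSON message layout and the policy fields are assumptions made for illustration; the source does not specify the format of the standard configuration template.

```python
import json


class SdnController:
    """Hypothetical stand-in for the SDN controller of one available area."""

    def __init__(self, area):
        self.area = area
        self.peer = None        # controller of the other available area
        self.slb_config = {}    # configuration currently applied to the local SLB

    def report(self, policy):
        # Step 1: the local SLB reports its newly configured policy.
        self.slb_config = dict(policy)
        # Step 2: build a metadata message and send it to the peer controller.
        message = self.build_metadata_message(policy)
        if self.peer is not None:
            self.peer.receive(message)
        return message

    def build_metadata_message(self, policy):
        # The template is assumed to be a small JSON envelope.
        envelope = {"type": "slb-config", "source_area": self.area, "policy": policy}
        return json.dumps(envelope, sort_keys=True)

    def receive(self, message):
        # Step 3: issue the received configuration to the local SLB.
        payload = json.loads(message)
        if payload["type"] != "slb-config":
            raise ValueError("unexpected message type")
        self.slb_config = payload["policy"]
```

A usage example would create one controller per available area, set `peer` on the controller of the available area A, and call `report` whenever the user changes the load balancing policy.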
< data synchronization mechanism of cloud host >
In the embodiment of the invention, the data synchronization between the cloud host of the production available area and the corresponding cloud host of the disaster recovery available area comprises configuration data synchronization and application data synchronization.
Firstly, the configuration data of the cloud host mainly realizes the synchronization of the configuration data between the corresponding cloud hosts of the two available areas in a mode that SDN controllers of the two available areas synchronously send the configuration data. The configuration data includes CPU and memory specifications, os type and version, network configuration, and the like.
And secondly, the application data between the cloud host of the production available area and the corresponding cloud host of the disaster recovery available area are synchronized mainly through an intelligent engine module. The application data includes operating system data and the like.
Referring to fig. 6, fig. 6 is a schematic diagram illustrating an application data synchronization mechanism of a cloud host according to an embodiment of the present invention.
As an example, as shown in fig. 6, the ECS of each available area has an application software layer, an intelligent engine layer and an operating system layer. The intelligent engine layer comprises a file system management Software Development Kit (SDK), a log cache module and a data synchronization message queue. The application software layer of ECS-1 is bound to service IP1, and its intelligent engine layer is bound to management IP1; the application software layer of ECS-1' is bound to service IP1, and its intelligent engine layer is bound to management IP1'.
The following describes the mechanism for synchronizing application data by taking ECS-1 in available zone a and ECS-1' in available zone B as an example. And, the ECS-2 of the available zone a and the ECS-2' of the available zone B implement application data synchronization by the same mechanism, and will not be described in detail herein.
In the first step, according to whether each available area has the state attribute of production environment or disaster tolerance environment, the intelligent management and control device, through the intelligent engine management and scheduling module, adds a sending-end identifier and a receiving-end management address (such as management IP1') to the intelligent engine of ECS-1 in the available area A, and adds a receiving-end identifier and a sending-end management address (such as management IP1) to the intelligent engine of ECS-1' in the available area B.
In the second step, in the sending-end ECS-1 of the available area A, when the intelligent engine is loaded, the "file system management SDK" in the intelligent engine duplicates the various data operation commands and Application Program Interfaces (APIs) of the operating system. When the application software performs data operations (e.g., add, delete, change) on ECS-1 through the commands or APIs provided by the operating system, it is in fact calling the file system management commands and APIs provided by the intelligent engine.
In the third step, in the sending-end ECS-1 of the available area A, the intelligent engine sends the call records of the operation commands and APIs to the log cache module through the file system management SDK. The log cache module generates a data operation log and calls the data synchronization message queue, which sends a data synchronization request to the address (management IP1') of the receiving-end ECS-1' of the available area B. If the receiving-end ECS-1' does not respond, this is reported to the intelligent engine management and scheduling module of the intelligent management and control device, and the data operation log is cached as a file in the sending-end ECS-1. After the intelligent engine management and scheduling module detects that the connection between the sending-end ECS-1 and the receiving-end ECS-1' has returned to normal, it notifies the sending end to reconnect; once the connection succeeds, the data operation logs are synchronized between the sending-end ECS-1 and the receiving-end ECS-1'.
Fourthly, when receiving the data synchronization request, the receiving end ECS-1' of the available area B first checks whether the originating address is an authorized address (i.e., determines whether the originating address is the management IP1), establishes a connection after the check is passed, performs data synchronization, and closes the connection after the synchronization is completed.
In the fifth step, the receiving-end ECS-1' of the available area B replays the operations recorded in the received data operation log one by one, in order, on its own copy of the cloud host's application data (for example, operating system data). In this way the application data of the cloud host is synchronized between the corresponding cloud hosts of the two different available areas (that is, the available area A and the available area B).
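The steps above can be illustrated with a minimal sketch in which a dictionary stands in for the file system of each cloud host. All class and function names are hypothetical, and the network transfer between the two management addresses is reduced to a plain function call.

```python
from collections import deque


class LogCache:
    """Hypothetical log cache module: numbers each operation and queues it."""

    def __init__(self, queue):
        self.queue = queue      # stands in for the data synchronization message queue
        self.sequence = 0

    def record(self, operation):
        self.sequence += 1
        self.queue.append((self.sequence, operation))


class FileSystemSdk:
    """Hypothetical file system management SDK.

    Each data operation is applied locally and also recorded in the log cache.
    """

    def __init__(self, files, log_cache):
        self.files = files
        self.log_cache = log_cache

    def write(self, path, data):
        self.files[path] = data
        self.log_cache.record(("write", path, data))

    def delete(self, path):
        self.files.pop(path, None)
        self.log_cache.record(("delete", path, None))


class Receiver:
    """Hypothetical receiving-end intelligent engine."""

    def __init__(self, authorized_sender):
        self.authorized_sender = authorized_sender
        self.files = {}
        self.last_applied = 0

    def synchronize(self, sender_address, entries):
        # Check that the sending-end address is the authorized management address.
        if sender_address != self.authorized_sender:
            raise PermissionError("unauthorized sending-end address")
        # Replay the logged operations one by one, in sequence order.
        for sequence, (action, path, data) in sorted(entries, key=lambda e: e[0]):
            if sequence <= self.last_applied:
                continue        # already replayed, e.g. after a retransmission
            if action == "write":
                self.files[path] = data
            elif action == "delete":
                self.files.pop(path, None)
            self.last_applied = sequence


def drain(queue):
    """Take every pending log entry off the message queue."""
    entries = list(queue)
    queue.clear()
    return entries
```

Replaying an ordered operation log, rather than copying whole files, keeps each transfer small; the sequence numbers let the receiving end skip entries that arrive twice after a reconnection.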
< data synchronization mechanism of cloud database >
Referring to fig. 7, fig. 7 is a schematic diagram illustrating a data synchronization mechanism of a cloud database according to an embodiment of the present invention. The data synchronization mechanism of the cloud database is similar to the application data synchronization mechanism of the cloud host, and is introduced below with reference to fig. 7:
As an example, as shown in fig. 7, the cloud database (i.e., RDS) of each available area has an intelligent engine layer and a database software layer. The intelligent engine layer comprises a database management Software Development Kit (SDK), a log cache module and a data synchronization message queue. The intelligent engine layer of the cloud database is bound to a management IP.
In the first step, according to whether each available area has the state attribute of production environment or disaster tolerance environment, the intelligent engine management and scheduling module adds a sending-end identifier and a receiving-end management address to the intelligent engine of the cloud database of the available area A; at the same time, it adds a receiving-end identifier and a sending-end management address to the intelligent engine of the cloud database of the available area B.
In the second step, in the sending-end cloud database of the available area A, when the intelligent engine is loaded, the "database management SDK" rewrites the various operation commands and APIs of the database software. When the front end performs data operations (for example, add, delete, change) through the commands or APIs provided by the database software, it is in fact calling the database system management commands and APIs provided by the intelligent engine of the cloud database.
In the third step, in the sending-end cloud database of the available area A, the intelligent engine sends the database operation records to the log cache module through the database management SDK. The log cache module generates an operation execution log and calls the data synchronization message queue, which sends a data synchronization request to the receiving-end cloud database of the available area B. If the receiving-end cloud database does not respond, this is reported to the intelligent engine management and scheduling module of the intelligent management and control device, and the log information is cached as a file in the node of the sending-end cloud database. After the intelligent engine management and scheduling module detects that the connection between the sending-end cloud database and the receiving-end cloud database has returned to normal, it notifies the sending end to reconnect; once the connection succeeds, the operation execution logs are synchronized between the cloud database of the available area A and the corresponding cloud database of the available area B.
And fourthly, when the receiving end cloud database of the available area B receives the data synchronization request, firstly verifying whether the sending end address is an authorized address, establishing connection and implementing synchronization after the verification is passed, and closing the connection after the synchronization is completed.
In the fifth step, the receiving-end cloud database of the available area B replays the received operation execution logs one by one, in order, so that data synchronization between the corresponding cloud databases of the two different available areas is realized.
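A minimal sketch of this mechanism is given below, assuming SQLite (through Python's standard `sqlite3` module) as a stand-in for the cloud database. The `DatabaseSdk` class, the `synchronize` function and the `receiver_reachable` flag are hypothetical; the sketch also keeps the pending log in memory, whereas the text above describes caching it as a file.

```python
import sqlite3


class DatabaseSdk:
    """Hypothetical database management SDK.

    Every data-changing statement passed to execute() is run on the local
    (production) database and appended to an operation execution log.
    """

    def __init__(self, connection):
        self.connection = connection
        self.pending = []       # log entries not yet delivered to the receiver

    def execute(self, sql, params=()):
        self.connection.execute(sql, params)
        self.connection.commit()
        self.pending.append((sql, tuple(params)))


def synchronize(sdk, receiver_connection, receiver_reachable):
    """Deliver the pending log to the receiving-end database and replay it.

    Returns the number of log entries replayed. When the receiver does not
    respond, nothing is sent and the log stays cached on the sending end.
    """
    if not receiver_reachable:
        return 0
    # Replay the statements one by one, in the order they were executed.
    for sql, params in sdk.pending:
        receiver_connection.execute(sql, params)
    receiver_connection.commit()
    delivered = len(sdk.pending)
    sdk.pending.clear()
    return delivered
```

For example, if the receiving end is unreachable during the first call, a later call with the connection restored replays every cached statement together with any statements executed in the meantime, after which a query against both databases returns the same rows.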
< mechanism for data synchronization of elastic Block storage >
The data in the elastic block storage does not need to be synchronized directly. Instead, the consistency of the elastic block storage data across the different available areas is ensured by the data synchronization mechanism of the cloud hosts and the data synchronization mechanism of the cloud databases. Both mechanisms have been introduced in the foregoing embodiments and are not described here again.
Based on the above example, it can be understood that the data consistency can be guaranteed across hosts by using the cloud host data remote synchronization scheme provided by the invention; by utilizing the database data remote synchronization scheme provided by the invention, the data consistency can be guaranteed across database nodes.
In summary, based on the data synchronization method provided by the present invention, the data of the production environment can be automatically synchronized to the disaster recovery environment based on the software scheme. In addition, the implementation process of the embodiment of the invention has no limitation or requirement on the cloud system architecture, and can provide disaster recovery solutions for the cloud systems of various architectures.
Referring to fig. 8, the present invention further provides a disaster recovery switching method 800, which performs disaster recovery switching for a cloud disaster recovery architecture configured by the method shown in fig. 1, where the disaster recovery switching method includes:
S810, when the production available area is judged to be unavailable, cancelling the production environment identifier of the production available area and adding the production environment identifier to the disaster recovery available area;
S820, adding a sending-end identifier and a receiving-end management address in a second intelligent engine of a cloud database of the disaster recovery available area;
S830, adding a sending-end identifier and a receiving-end management address in a first intelligent engine of a cloud host of the disaster recovery available area, notifying a Software Defined Network (SDN) controller of the disaster recovery available area, and activating a service address of the cloud host of the disaster recovery available area;
and S840, notifying the SDN controller of the disaster recovery available area, and activating a load balancing instance of the disaster recovery available area.
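Steps S810 to S840 can be sketched as a single function operating on a simple in-memory model. The `Area` and `Engine` classes and their fields are hypothetical; in the described system these changes are issued through the cloud management platform and the SDN controller rather than applied to local objects.

```python
from dataclasses import dataclass, field


@dataclass
class Engine:
    role: str = "receiver"          # "sender" or "receiver"
    peer_management_ip: str = ""


@dataclass
class Area:
    name: str
    production: bool                # production environment identifier
    rds_management_ip: str
    host_management_ips: list
    service_ips_active: bool
    slb_active: bool
    rds_engine: Engine = field(default_factory=Engine)
    host_engines: list = field(default_factory=list)


def switch_over(production_area, dr_area, production_available):
    """Apply S810 to S840; returns True when a switch was performed."""
    if production_available:
        return False
    # S810: move the production environment identifier to the DR area.
    production_area.production = False
    dr_area.production = True
    # S820: the DR cloud database engine becomes the sending end.
    dr_area.rds_engine.role = "sender"
    dr_area.rds_engine.peer_management_ip = production_area.rds_management_ip
    # S830: the DR cloud host engines become sending ends, and the service
    # addresses of the DR cloud hosts are activated.
    peers = production_area.host_management_ips
    for engine, peer_ip in zip(dr_area.host_engines, peers):
        engine.role = "sender"
        engine.peer_management_ip = peer_ip
    dr_area.service_ips_active = True
    # S840: the load balancing instance of the DR area is activated.
    dr_area.slb_active = True
    return True
```

Note that the function never changes an IP address: the disaster recovery hosts already hold the same service addresses as the production hosts, so the switch consists only of role and activation changes.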
By utilizing the scheme provided by the invention, the system can automatically activate the disaster recovery environment and inherit the configuration of the IP address and the like of the original production environment when the disaster recovery characteristic is triggered based on the software scheme, and can realize automatic switching without system configuration change, thereby improving the disaster recovery switching efficiency.
How to implement the disaster recovery automatic switching of four cloud services, i.e., load balancing, cloud hosts, elastic block storage, and cloud database, is described below with reference to fig. 4:
Firstly, when the disaster recovery function management and scheduling module of the intelligent management and control device judges that the available area A is unavailable and a switch is needed, it notifies the cloud management platform, through the cloud management platform scheduling module, to cancel the production environment identifier of the available area A and to add the production environment identifier to the available area B.
Secondly, the intelligent engine management and scheduling module issues an instruction to add a sending-end identifier and a receiving-end management address to the intelligent engine of the cloud database (namely RDS) of the available area B.
Thirdly, the intelligent engine management and scheduling module issues an instruction to add a sending-end identifier and a receiving-end management address to the intelligent engines of the cloud hosts in the available area B; the SDN controller of the available area B is notified, and the service IPs of the two cloud hosts (ECS-1' and ECS-2') of the available area B are activated.
Fourthly, the cloud management platform scheduling module notifies the SDN controller of the available area B, and the load balancing instance of the available area B is activated. Because the EBS is always mounted to the cloud hosts and the cloud database, it does not need to be activated or switched separately. At this point the service system has automatically completed the disaster recovery switch, and external services are restored in the available area B.
Fifthly, after the switch the available area B serves as the production available area, but no corresponding disaster recovery available area is deployed for it; if a disaster occurs in this state, the service continuity of the service system on the cloud cannot be ensured. Therefore, when the available area A returns to normal, the data of the available area B needs to be synchronized to the available area A, so that the available area A can serve as the disaster recovery available area. The specific steps are as follows:
When the disaster recovery function management and scheduling module detects that the available area A has returned to normal, the cloud management platform scheduling module notifies the SDN controller of the available area A to set the service IPs of the two cloud hosts and the load balancing instance of the available area A to the inactive state. The intelligent engine management and scheduling module then adds a receiving-end identifier and a sending-end management address to the intelligent engines of the cloud database and the cloud hosts of the available area A, so that the available area A synchronizes data from the available area B.
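The recovery of the available area A described above can be sketched as follows, using plain dictionaries for brevity. The dictionary keys and the function name are hypothetical and serve only to illustrate the role reversal.

```python
def rejoin_as_disaster_recovery(recovered_area, production_area):
    """Turn a recovered area into the disaster recovery side.

    Both arguments are dictionaries with the hypothetical keys
    "service_ips_active", "slb_active", "management_ips" and "engines";
    "engines" lists the intelligent engines of the cloud database and the
    cloud hosts in the same order as "management_ips".
    """
    # Service IPs and load balancing of the recovered area are deactivated,
    # so only the current production area answers external requests.
    recovered_area["service_ips_active"] = False
    recovered_area["slb_active"] = False
    # Every engine in the recovered area becomes a receiving end that accepts
    # data only from the matching management address in the production area.
    pairs = zip(recovered_area["engines"], production_area["management_ips"])
    for engine, sender_ip in pairs:
        engine["role"] = "receiver"
        engine["peer_management_ip"] = sender_ip
    return recovered_area
```

After this step the two areas have exchanged roles compared with the initial deployment, and the same synchronization mechanisms run in the opposite direction.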
Corresponding to the configuration method of the cloud disaster recovery architecture based on the software scheme, the invention also provides a configuration device, equipment and a computer storage medium of the cloud disaster recovery architecture based on the software scheme.
Referring to fig. 9, fig. 9 is a schematic structural diagram illustrating a configuration apparatus 900 of a cloud disaster recovery architecture based on a software scheme according to an embodiment of the present invention, where the configuration apparatus of the cloud disaster recovery architecture based on the software scheme includes:
a first creating module 910, configured to create cloud hosts with the same specification in a production available area and a disaster recovery available area of a disaster recovery architecture on a cloud according to a user requirement specification;
a binding module 920, configured to bind the same service address in the cloud host ECS in the production available area and the corresponding cloud host ECS' in the disaster recovery available area, and to bind the management address IP in the ECS and the management address IP' in the ECS';
a first loading module 930 for loading a first smart engine for application data synchronization in the ECS and ECS', respectively;
a setting module 940, configured to set the service address and the IP of the ECS to an active state, set the service address of the ECS 'to an inactive state, and set the IP' to an active state.
The above-described apparatus and computer storage medium for a cloud disaster recovery architecture are described in detail below.
With the configuration device, equipment and computer storage medium of the on-cloud disaster recovery architecture provided by the invention, when a user selects the disaster recovery capability provided by the cloud platform, a cloud host consistent with the production environment is automatically created for the user in the disaster recovery environment. The same service address is bound in the cloud host of the production available area and in the corresponding cloud host of the disaster recovery available area, and an intelligent engine is loaded in each of the two available areas. As a result, a subsequent disaster recovery switch does not require modifying any code or configuration file that refers to IP addresses, so the switch can be performed automatically and seamlessly, which improves disaster recovery switching efficiency.
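The work of the four modules 910 to 940 can be sketched as follows. This is a simplified illustration under stated assumptions, not the configuration device itself: `CloudHost`, `configure_pair`, and all field names and addresses are hypothetical, and creating a host or loading an engine is modeled as setting a field.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class CloudHost:
    """Hypothetical model of an ECS instance (names are assumptions)."""
    zone: str
    spec: str
    service_addr: str = ""
    service_active: bool = False
    mgmt_addr: str = ""
    mgmt_active: bool = False
    engine_loaded: bool = False


def configure_pair(spec: str, service_addr: str,
                   mgmt_ip: str, mgmt_ip_dr: str) -> Tuple[CloudHost, CloudHost]:
    # First creating module: hosts of the same specification in both zones.
    ecs = CloudHost(zone="production", spec=spec)
    ecs_dr = CloudHost(zone="disaster-recovery", spec=spec)
    # Binding module: one shared service address, distinct management addresses.
    ecs.service_addr = service_addr
    ecs_dr.service_addr = service_addr
    ecs.mgmt_addr = mgmt_ip
    ecs_dr.mgmt_addr = mgmt_ip_dr
    # First loading module: an intelligent engine in each host.
    ecs.engine_loaded = True
    ecs_dr.engine_loaded = True
    # Setting module: only the production service address is active,
    # while both management addresses stay active for synchronization.
    ecs.service_active, ecs.mgmt_active = True, True
    ecs_dr.service_active, ecs_dr.mgmt_active = False, True
    return ecs, ecs_dr


ecs, ecs_dr = configure_pair("4c8g", "192.168.1.10", "10.0.0.1", "10.0.1.1")
```

Because both hosts carry the same service address and only its activation state differs, a later switch only has to flip that state, which is why no IP-related code or configuration file needs to change.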
Corresponding to the disaster recovery switching method in the embodiment of the invention, the invention also provides a disaster recovery switching device, equipment and a computer storage medium.
Referring to fig. 10, fig. 10 shows a schematic structural diagram of a disaster recovery switching device 1000 according to an embodiment of the present invention, where the disaster recovery switching device 1000 includes:
a judging module 1010, configured to cancel the production environment identifier of the production available area and add the production environment identifier to the disaster recovery available area when it is determined that the production available area is unavailable;
a first adding module 1020, configured to add a sending end flag and a receiving end management address to a second intelligent engine of a cloud database in a disaster tolerance available area;
a second adding module 1030, configured to add a sending end flag and a receiving end management address to a first intelligent engine of a cloud host in a disaster recovery available area, notify a software defined network SDN controller in the disaster recovery available area, and activate a service address of the cloud host in the disaster recovery available area;
an activating module 1040, configured to notify an SDN controller of the disaster tolerance available area, and activate a load balancing instance of the disaster tolerance available area.
The above-described apparatus and computer storage medium for a cloud disaster recovery architecture are described in detail below.
With the switching device, equipment and computer storage medium of the software-based on-cloud disaster recovery architecture provided by the invention, when disaster recovery is triggered, the system automatically activates the disaster recovery environment, which inherits the configuration of the original production environment, such as its IP address. The switch is therefore performed automatically without changing the system configuration, which improves disaster recovery switching efficiency.
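The order of operations performed by modules 1010 to 1040 can be sketched as below. The sketch assumes a simplified in-memory model: `Engine`, `AvailabilityZone`, and `switch_over` are hypothetical names, and notifying the SDN controller is reduced to setting a flag.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Engine:
    """Hypothetical intelligent engine state (names are assumptions)."""
    role: str = "receiver"
    peer_mgmt_addr: str = ""


@dataclass
class AvailabilityZone:
    name: str
    mgmt_addr: str
    is_production: bool = False
    service_addr_active: bool = False
    load_balancer_active: bool = False
    db_engine: Engine = field(default_factory=Engine)    # second intelligent engine
    host_engine: Engine = field(default_factory=Engine)  # first intelligent engine


def switch_over(failed: AvailabilityZone, standby: AvailabilityZone) -> List[str]:
    """Promote the standby zone; returns the steps in the order they ran."""
    steps = []
    # Judging module: move the production environment identifier.
    failed.is_production = False
    standby.is_production = True
    steps.append("identifier")
    # First adding module: the cloud database engine becomes the sender.
    standby.db_engine = Engine(role="sender", peer_mgmt_addr=failed.mgmt_addr)
    steps.append("database engine")
    # Second adding module: the cloud host engine becomes the sender,
    # then the service address is activated.
    standby.host_engine = Engine(role="sender", peer_mgmt_addr=failed.mgmt_addr)
    standby.service_addr_active = True
    steps.append("host engine and service address")
    # Activation module: the load balancing instance is activated last.
    standby.load_balancer_active = True
    steps.append("load balancing")
    return steps


zone_a = AvailabilityZone("A", "10.0.0.1", is_production=True)
zone_b = AvailabilityZone("B", "10.0.1.1")
steps = switch_over(failed=zone_a, standby=zone_b)
```

Activating load balancing last means external traffic reaches the disaster recovery available area only after its database and cloud hosts have taken over the sender role.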
The equipment of the disaster recovery architecture on the cloud comprises:
a memory for storing a program;
a processor, configured to run the program stored in the memory, so as to execute the configuration method of the disaster recovery architecture on the cloud or each step in the disaster recovery switching method according to the embodiment of the present invention.
Fig. 11 is a block diagram illustrating an exemplary hardware architecture capable of implementing the method and apparatus according to the embodiments of the present invention, for example, an apparatus of a disaster recovery architecture based on a software scheme according to the embodiments of the present invention. Wherein computing device 1100 includes input device 1101, input interface 1102, processor 1103, memory 1104, output interface 1105, and output device 1106.
The input interface 1102, the processor 1103, the memory 1104, and the output interface 1105 are connected to each other via a bus 1110, and the input device 1101 and the output device 1106 are connected to the bus 1110 via the input interface 1102 and the output interface 1105, respectively, and further connected to other components of the computing device 1100.
Specifically, the input device 1101 receives input information from the outside and transmits the input information to the processor 1103 through the input interface 1102; the processor 1103 processes the input information based on the computer-executable instructions stored in the memory 1104 to generate output information, stores the output information temporarily or permanently in the memory 1104, and then transmits the output information to the output device 1106 through the output interface 1105; the output device 1106 outputs the output information to the outside of the computing device 1100 for use by a user.
The computing device 1100 may perform the steps of the above-described configuration method or switching method of the present invention.
The processor 1103 may be one or more Central Processing Units (CPUs). In the case where the processor 1103 is one CPU, the CPU may be a single-core CPU or a multi-core CPU.
The memory 1104 may be, but is not limited to, one or more of Random Access Memory (RAM), Read Only Memory (ROM), Erasable Programmable Read Only Memory (EPROM), compact disc read only memory (CD-ROM), a hard disk, and the like. The memory 1104 is used for storing program code.
It is understood that the functions of any or all of the modules provided in the embodiments of the present invention may be implemented by the central processing unit 1103 shown in fig. 11.
The above embodiments may be implemented wholly or partially in software, hardware, firmware, or any combination thereof. When implemented in software, they may be realized wholly or partially in the form of a computer program product that includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
The parts of this specification are described in a progressive manner; for the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the apparatus and system embodiments are substantially similar to the method embodiments, they are described briefly, and the relevant parts of the method embodiments may be consulted for details.

Claims (15)

1. A configuration method of a disaster recovery architecture on a cloud comprises the following steps:
according to the specification of user requirements, respectively creating cloud hosts with the same specification in a production available area and a disaster recovery available area of the on-cloud disaster recovery architecture;
respectively binding the same service address in the cloud host ECS of the production available area and the corresponding cloud host ECS ' of the disaster recovery available area, and binding a management address IP in the ECS and a management address IP ' in the ECS ';
loading a first intelligence engine for application data synchronization in the ECS and ECS', respectively;
and setting the service address and the IP of the ECS to be in an activated state, setting the service address of the ECS 'to be in an inactivated state, and setting the IP' to be in an activated state.
2. The method of claim 1, further comprising:
respectively creating cloud databases with the same specification in the production available area and the disaster recovery available area;
and respectively loading a second intelligent engine for data synchronization in the cloud database RDS of the production available area and the cloud database RDS' of the disaster tolerance available area.
3. The method of claim 1, further comprising:
creating load balancing instances with the same specification in the production available area and the disaster recovery available area, setting the load balancing instance of the production available area to an activated state, and setting the load balancing instance of the disaster recovery available area to an inactivated state.
4. The method of claim 2, further comprising:
creating an elastic block storage EBS in the production available area, creating an elastic block storage EBS 'with the same specification in the disaster recovery available area, completing storage mounting of the ECS and the RDS in the EBS, and completing storage mounting of the ECS' and the RDS 'in the EBS'.
5. The method according to claim 1, before creating cloud hosts with the same specification in the production available area and the disaster recovery available area of the disaster recovery architecture on the cloud according to the user requirement specification, respectively, further comprising:
formulating, according to the disaster recovery requirements of the service system to be migrated to the cloud, a cloud service deployment scheme for that service system covering the production available area and the disaster recovery available area.
6. The method according to claim 1, wherein after the cloud hosts with the same specification are respectively created in the production available area and the disaster recovery available area of the on-cloud disaster recovery architecture according to the user requirement specification, the method further comprises:
and completing the synchronization of the configuration data between the ECS and the ECS' in a mode of synchronously issuing the configuration data in the production available area and the disaster recovery available area.
7. The method according to claim 6, wherein after the synchronizing the configuration data between the ECS and the ECS' is completed by sending the configuration data synchronously between the production available zone and the disaster recovery available zone, the method further comprises:
adding a sending end mark and a receiving end management address in a first intelligent engine of the ECS, and adding a receiving end mark and a sending end management address in a first intelligent engine of the ECS';
a first intelligent engine of the ECS rewrites a data operation command and an Application Program Interface (API) of an operating system, generates a data operation log according to the data operation command and the API, and sends a data synchronization request to the ECS';
and the ECS 'verifies whether the originating address is an authorized address, establishes connection between the ECS and the ECS' after the verification is passed, and completes synchronization of the data operation log.
8. The method according to claim 2, further comprising, after creating cloud databases with the same specification in the production available area and the disaster recovery available area, respectively:
adding a transmitting end mark and a receiving end management address in a second intelligent engine of the RDS, and adding a receiving end mark and a transmitting end management address in a second intelligent engine of the RDS';
the second intelligent engine of the RDS rewrites a data operation command and an Application Program Interface (API) of database software and generates a data execution log according to the data operation command and the API, and the RDS sends a data synchronization request to the RDS';
and the RDS 'verifies whether the originating address is an authorized address, establishes connection between the RDS and the RDS' after the verification is passed, and completes the synchronization of the data execution log.
9. The method of claim 3, wherein after creating the same sized load balancing instance in the production available zone and the disaster recovery available zone, further comprising:
completing load balancing policy configuration on the load balancing instance of the production available area, and reporting the load balancing policy configuration information to a Software Defined Network (SDN) controller of the production available area;
the SDN controller of the production available area generating a load balancing configuration metadata message according to the load balancing policy configuration information, and synchronizing the metadata message to the SDN controller of the disaster recovery available area;
and the SDN controller of the disaster recovery available area issuing the metadata message to the load balancing instance of the disaster recovery available area, thereby completing production data synchronization between the load balancing instances of the production available area and the disaster recovery available area.
10. A disaster recovery switching method for a disaster recovery architecture on a cloud configured according to the method of any one of claims 1-9, comprising:
when the production available area is judged to be unavailable, the production environment identification of the production available area is cancelled, and the production environment identification is added to the disaster recovery available area;
adding a sending end mark and a receiving end management address in a second intelligent engine of a cloud database of the disaster recovery available area;
adding a sending end mark and a receiving end management address in a first intelligent engine of a cloud host of the disaster recovery available area, informing a Software Defined Network (SDN) controller of the disaster recovery available area, and activating a service address of the cloud host of the disaster recovery available area;
and notifying an SDN controller of the disaster recovery available area, and activating a load balancing instance of the disaster recovery available area.
11. The disaster recovery switching method according to claim 10, wherein the notifying the SDN controller of the disaster recovery available area further includes, after activating the load balancing instance of the disaster recovery available area:
when the production available area is recovered to be normal, informing an SDN controller of the production available area, setting a service address and a load balancing instance of a cloud host of the production available area to be in an inactive state, and adding a receiving end mark and a sending end management address in a second intelligent engine of a cloud database of the production available area and a first intelligent engine of the cloud host to realize data synchronization with the disaster recovery available area.
12. A device for configuring a disaster recovery architecture on a cloud, the device comprising:
a first creating module, configured to create cloud hosts with the same specification in a production available area and a disaster recovery available area of the on-cloud disaster recovery architecture, respectively, according to a user requirement specification;
a binding module, configured to bind the same service address in the cloud host ECS of the production available area and the corresponding cloud host ECS' of the disaster recovery available area, respectively, and to bind a management address IP in the ECS and a management address IP' in the ECS';
a first loading module, configured to load a first smart engine for application data synchronization in the ECS and the ECS', respectively;
and the setting module is used for setting the service address and the IP of the ECS to be in an activated state, setting the service address of the ECS 'to be in an inactivated state and setting the IP' to be in an activated state.
13. A disaster recovery switching device, said device comprising:
the judging module is used for canceling the production environment identifier of the production available area and adding the production environment identifier in the disaster recovery available area when the production available area is judged to be unavailable;
the first adding module is used for adding a sending end mark and a receiving end management address in a second intelligent engine of the cloud database of the disaster recovery available area;
a second adding module, configured to add a sending-end flag and a receiving-end management address to a first intelligent engine of a cloud host in the disaster recovery available area, notify a software defined network SDN controller of the disaster recovery available area, and activate a service address of the cloud host in the disaster recovery available area;
and an activation module, configured to notify an SDN controller of the disaster recovery available area and activate a load balancing instance of the disaster recovery available area.
14. An apparatus of a disaster recovery architecture on a cloud, comprising: at least one processor, at least one memory, and computer program instructions stored in the memory, which when executed by the processor, implement the method of any of claims 1-9 or the method of any of claims 10-11.
15. A computer-readable storage medium having computer program instructions stored thereon, which when executed by a processor implement the method of any of claims 1-9 or the method of any of claims 10-11.
CN201811318053.2A 2018-11-07 2018-11-07 Configuration method, switching method and device of disaster recovery architecture, equipment and storage medium Pending CN111158949A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811318053.2A CN111158949A (en) 2018-11-07 2018-11-07 Configuration method, switching method and device of disaster recovery architecture, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811318053.2A CN111158949A (en) 2018-11-07 2018-11-07 Configuration method, switching method and device of disaster recovery architecture, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111158949A true CN111158949A (en) 2020-05-15

Family

ID=70555073

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811318053.2A Pending CN111158949A (en) 2018-11-07 2018-11-07 Configuration method, switching method and device of disaster recovery architecture, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111158949A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111683139A (en) * 2020-06-05 2020-09-18 北京百度网讯科技有限公司 Method and apparatus for balancing load
WO2022012310A1 (en) * 2020-07-13 2022-01-20 华为技术有限公司 Communication method and apparatus
CN114090333A (en) * 2021-10-20 2022-02-25 中核核电运行管理有限公司 Disaster tolerance switching management system and method for production management platform
CN114285832A (en) * 2021-05-11 2022-04-05 鸬鹚科技(深圳)有限公司 Disaster recovery system, method, computer device and medium for multiple data centers

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102055605A (en) * 2009-11-11 2011-05-11 中兴通讯股份有限公司 Disaster tolerance system and method applied to AAA (authentication, authorization and accounting) server
CN104717083A (en) * 2013-12-13 2015-06-17 中国移动通信集团上海有限公司 Disaster tolerant switching system, method and device for A-SBC equipment
CN107241430A (en) * 2017-07-03 2017-10-10 国家电网公司 A kind of enterprise-level disaster tolerance system and disaster tolerant control method based on distributed storage
CN108512693A (en) * 2018-02-24 2018-09-07 国家计算机网络与信息安全管理中心 A kind of trans-regional disaster recovery method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
吴礼乐: "基于双活容灾存储技术的云计算数据中心的设计及应用", 《电子设计工程》, vol. 23, no. 06, 20 March 2015 (2015-03-20), pages 190 - 192 *

Similar Documents

Publication Publication Date Title
US11445019B2 (en) Methods, systems, and media for providing distributed database access during a network split
CN114787781B (en) System and method for enabling high availability managed failover services
US11687555B2 (en) Conditional master election in distributed databases
CN113169952B (en) Container cloud management system based on block chain technology
US10922303B1 (en) Early detection of corrupt data partition exports
US8954786B2 (en) Failover data replication to a preferred list of instances
US9344494B2 (en) Failover data replication with colocation of session state data
CN112000448A (en) Micro-service architecture-based application management method
CN112099918A (en) Live migration of clusters in containerized environments
CN111158949A (en) Configuration method, switching method and device of disaster recovery architecture, equipment and storage medium
CN111130835A (en) Data center dual-active system, switching method, device, equipment and medium
CN109542611A (en) Database, that is, service system, database dispatching method, equipment and storage medium
CN113489691B (en) Network access method, network access device, computer readable medium and electronic equipment
US11953997B2 (en) Systems and methods for cross-regional back up of distributed databases on a cloud service
US11461156B2 (en) Block-storage service supporting multi-attach and health check failover mechanism
CN106796537B (en) Distributed components in a computing cluster
CN110543315B (en) Distributed operating system of kbroker, storage medium and electronic equipment
US9760370B2 (en) Load balancing using predictable state partitioning
CN114422331A (en) Disaster tolerance switching method, device and system
CN111818188B (en) Load balancing availability improving method and device for Kubernetes cluster
CN114615268B (en) Service network, monitoring node, container node and equipment based on Kubernetes cluster
CN116954816A (en) Container cluster control method, device, equipment and computer storage medium
US10481963B1 (en) Load-balancing for achieving transaction fault tolerance
CN114900449A (en) Resource information management method, system and device
CN113032477A (en) Long-distance data synchronization method and device based on GTID and computing equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20200515)