CN117743033A - Disaster recovery plan management method, system, electronic equipment and storage medium - Google Patents

Info

Publication number
CN117743033A
CN117743033A CN202311780719.7A CN202311780719A CN117743033A CN 117743033 A CN117743033 A CN 117743033A CN 202311780719 A CN202311780719 A CN 202311780719A CN 117743033 A CN117743033 A CN 117743033A
Authority
CN
China
Prior art keywords
workflow
disaster recovery
node
nodes
configuration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311780719.7A
Other languages
Chinese (zh)
Inventor
罗强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Eisoo Information Technology Co Ltd
Original Assignee
Shanghai Eisoo Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Eisoo Information Technology Co Ltd
Priority to CN202311780719.7A
Publication of CN117743033A

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a disaster recovery plan management method, a disaster recovery plan management system, electronic equipment and a storage medium, and relates to the technical field of computers. The disaster recovery plan management method comprises the following steps: generating a corresponding disaster recovery plan according to service demands of a service system to be recovered, and determining workflow nodes corresponding to the disaster recovery plan, wherein the disaster recovery plan includes at least workflow description information and disaster recovery configuration; generating a disaster recovery workflow according to the configuration attribute parameters corresponding to the workflow nodes in the disaster recovery plan and the disaster recovery configuration; and managing the disaster recovery workflow according to the hierarchical structure of the service system to be recovered. In the embodiments of the invention, the disaster recovery plan is managed with the disaster recovery workflow as its carrier, and the disaster recovery workflow is managed according to the hierarchical structure of the service system to be recovered, so that a suitable disaster recovery plan can be selected quickly to respond to a disaster or fault.

Description

Disaster recovery plan management method, system, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a disaster recovery plan management method, system, electronic device, and storage medium.
Background
In the era of big data, the information technology business systems of industries worldwide have been established and are steadily maturing. However, business systems still face threats and disasters, such as rampant ransomware infections, system crashes, and failures of the infrastructure on which the business systems depend. The direct economic loss and indirect reputational loss that downtime of the supported business causes to an organization are enormous.
Traditional solutions include recovery from data backups. When a certain type of fault occurs in a business system, personnel must detect the fault and determine that data recovery is needed, coordinate servers and storage, build an application host, install an operating system, deploy application software, configure the network, recover the data, verify that the application can use the data normally, switch the business traffic, and so on. Even when skilled personnel carry out the recovery, downtime is inevitably long. The business continuity problem of the business system is therefore not well solved, and a suitable disaster recovery method cannot be determined in time.
Disclosure of Invention
The invention provides a disaster recovery plan management method, a disaster recovery plan management system, electronic equipment and a storage medium, which are used for solving the problem that a disaster recovery plan cannot be determined in time when a business system fails.
According to an aspect of the present invention, there is provided a disaster recovery plan management method, wherein the method includes:
generating a corresponding disaster recovery plan according to service demands of a service system to be recovered, and determining workflow nodes corresponding to the disaster recovery plan; wherein the disaster recovery plan includes at least workflow description information and disaster recovery configuration;
generating a disaster recovery workflow according to the configuration attribute parameters corresponding to the workflow nodes in the disaster recovery plan and the disaster recovery configuration;
and managing the disaster recovery workflow according to the hierarchical structure of the service system to be recovered.
According to another aspect of the present invention, there is provided a disaster recovery plan management system, wherein the system comprises:
a plan generation module, configured to generate a corresponding disaster recovery plan according to service demands of a service system to be recovered, and to determine workflow nodes corresponding to the disaster recovery plan; wherein the disaster recovery plan includes at least workflow description information and disaster recovery configuration;
a workflow generation module, configured to generate a disaster recovery workflow according to the configuration attribute parameters corresponding to the workflow nodes in the disaster recovery plan and the disaster recovery configuration;
and a workflow management module, configured to manage the disaster recovery workflow according to the hierarchical structure of the service system to be recovered.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the disaster recovery plan management method according to any one of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to implement the disaster recovery plan management method according to any one of the embodiments of the present invention when executed.
According to the technical solution of the embodiments of the invention, a corresponding disaster recovery plan is generated according to the service requirement of the service system to be recovered, and the workflow nodes corresponding to the disaster recovery plan are determined; a disaster recovery workflow is generated according to the configuration attribute parameters of the corresponding workflow nodes in the disaster recovery plan and the disaster recovery configuration; and the disaster recovery workflow is managed according to the hierarchical structure of the service system to be recovered. The disaster recovery plan is thus managed with the disaster recovery workflow as its carrier, and adapted disaster recovery plans are provided for different service systems to be recovered, different fault scenarios, and faults of different granularity, so that when a given fault occurs, the matching disaster recovery plan can be selected quickly to respond.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a disaster recovery plan management method according to a first embodiment of the present invention;
FIG. 2 is a flow chart of another disaster recovery plan management method provided in accordance with a second embodiment of the present invention;
FIG. 3 is a schematic diagram of a disaster recovery workflow generation provided in accordance with a third embodiment of the present invention;
FIG. 4 is a schematic diagram of a workflow node configuration provided in accordance with a third embodiment of the invention;
FIG. 5 is a schematic diagram of a disaster recovery workflow generation provided in accordance with a third embodiment of the present invention;
FIG. 6 is a schematic diagram of managing a disaster recovery plan according to a hierarchy provided in accordance with a third embodiment of the present invention;
FIG. 7 is a schematic diagram of a workflow node binding provided in accordance with a third embodiment of the invention;
FIG. 8 is a schematic diagram of a disaster recovery workflow replication provided in accordance with a third embodiment of the invention;
FIG. 9 is a schematic diagram of a disaster recovery plan management device according to a fourth embodiment of the present invention;
FIG. 10 is a schematic structural diagram of an electronic device implementing a disaster recovery plan management method according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Fig. 1 is a flowchart of a disaster recovery plan management method according to a first embodiment of the present invention, which is applicable to managing a disaster recovery plan, and is convenient for querying a corresponding disaster recovery plan according to a requirement, and the method may be performed by a disaster recovery plan management device, which may be implemented in the form of hardware and/or software, and the disaster recovery plan management device may be configured in an electronic device. As shown in fig. 1, the method includes:
S110, generating a corresponding disaster recovery plan according to service demands of a service system to be recovered, and determining workflow nodes corresponding to the disaster recovery plan; wherein the disaster recovery plan includes at least workflow description information and disaster recovery configuration.
The service system to be recovered may be understood as a service system for which disaster recovery plans are to be managed. In actual operation, the service system to be recovered may be any of various application systems, and its service type is not limited. A disaster recovery plan can be understood as the emergency measures and handling plans established in advance for the various disaster or fault scenarios of the service system to be recovered. Workflow description information may be included in the disaster recovery plan. The service requirement may refer to a requirement arising from a fault that may occur in the service system to be recovered; the functions that the service system to be recovered can implement may be taken as its service requirements.
Workflow description information may be understood as information describing flow logic, execution order, execution task information, node type, etc. between different workflow nodes corresponding to a disaster recovery plan. During actual operation, the workflow description information may be used to generate a corresponding disaster recovery workflow.
The workflow nodes refer to the individual flow links in the workflow; each is responsible for scheduling and executing the independent function configured for that flow link. According to the service requirements, an executable business process for handling those requirements can be generated, that is, the disaster recovery plan is managed with a workflow as its carrier. In actual operation, each workflow node can serve as a recovery step of a system function point to be recovered that corresponds to the service requirement; alternatively, each workflow node may correspond to one function point of the service requirement.
The disaster recovery configuration may be understood as the disaster recovery configuration capabilities possessed by a disaster recovery workflow. Illustratively, the disaster recovery configuration may include, but is not limited to, policy configuration, disaster backup configuration, application configuration, verification configuration, cleaning configuration, network configuration, virus killing configuration, reporting configuration, and script configuration.
Policy configuration can be understood as providing disaster recovery policies, including automatic triggering after backup/replication completes and periodic automatic triggering. Periodic automatic triggering includes periodically selecting the latest backup/replication data for triggering, and periodically selecting all backup/replication data within a certain period of time for batch automatic triggering. For periodic triggering, sensitive time periods that need to be excluded can also be configured, so that disaster recovery policies meeting different requirements are triggered automatically.
Disaster backup configuration can be understood as providing data recovery/mount recovery of backup data/replication data for the application components of a service system, such as files, databases, middleware, and hosts, and recovering the data into the disaster recovery resources.
Application configuration can be understood as providing modification of the contents of the configuration files of the service system/service system components, in formats including ini, xml, json and the like, and restarting the processes/services/network of the service system/application components so that the configuration changes take effect.
Verification configuration can be understood as providing, after service takeover or a drill, post-recovery data consistency verification and application availability verification for the service system/service system components in the disaster recovery resources. Data consistency verification checks whether the data fingerprints of the production system data match those of the recovered/replicated data in the disaster recovery resources. Application availability verification checks the recovered/replicated service system/application components in the disaster recovery resources, including the service state, the port listening state, the process state, the host network, the response state of Hypertext Transfer Protocol (HTTP) application programming interfaces (APIs), and whether the data can be opened normally by the application software.
Cleaning configuration can be understood as providing deletion of the physical data of the service system/service system components and restoration of the state of the service system/service system components.
Network configuration can be understood as providing configuration of host network cards, switches, and routers, and of the mapping from domain names on a domain name server (DNS) to Internet Protocol (IP) addresses.
Virus killing configuration can be understood as providing virus scanning, infected file repair, virus deletion, and infected file quarantine for recovered/replicated data, and controlling parallel virus killing by configuring the number of instances of the virus killing engine.
Reporting configuration can be understood as providing customization of the content of the disaster recovery report template. The recorded content includes data recovery, post-recovery data consistency verification/application availability verification, recovery resource cleanup, virus killing, the recovery time objective (RTO), the recovery point objective (RPO), service switching/drill results, and the like; the report can be generated as an email and sent.
Script configuration can be understood as providing standard script execution and response return formats, and metadata management for script-related parameters: initialization parameters can be set, and parameters can be added and deleted as required. Functions that the above configurations do not cover can be implemented as custom functions by writing scripts. In this way, the disaster recovery workflow can be generated according to the configurations in the disaster recovery plan.
In the embodiment of the invention, the service requirement of the service system to be recovered can be acquired and the functions corresponding to the service requirement determined. According to the function points of the system to be recovered, information describing the flow logic, execution order, execution task information, node type and the like among the different workflow nodes is generated as the disaster recovery plan, and the disaster recovery configuration required by the user is determined. The recovery steps corresponding to the function points of the system to be recovered in the disaster recovery plan are then determined as the workflow nodes.
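As an illustration only, step S110 might be sketched as follows. The class names `NodeDescription` and `DisasterRecoveryPlan`, the helper `build_plan`, and the sample system, steps and configuration values are all hypothetical; the patent does not prescribe a data model, and the linear chaining of steps is a simplifying assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class NodeDescription:
    """Describes one workflow node: a recovery step of a function point."""
    name: str
    node_type: str                                      # node type, e.g. "task"
    task: str                                           # execution task information
    upstream: List[str] = field(default_factory=list)   # flow logic / execution order


@dataclass
class DisasterRecoveryPlan:
    """A plan holds workflow description information plus recovery configuration."""
    system: str
    scenario: str
    nodes: List[NodeDescription]        # workflow description information
    recovery_config: Dict[str, dict]    # disaster recovery configuration


def build_plan(system: str, scenario: str,
               steps: List[Tuple[str, str]],
               recovery_config: Dict[str, dict]) -> DisasterRecoveryPlan:
    """Chain (function point, recovery step) pairs into a linear plan (assumption)."""
    nodes: List[NodeDescription] = []
    previous = None
    for point, step in steps:
        name = f"{point}:{step}"
        upstream = [previous] if previous else []
        nodes.append(NodeDescription(name, "task", step, upstream))
        previous = name
    return DisasterRecoveryPlan(system, scenario, nodes, recovery_config)


plan = build_plan(
    "order-system",
    "service switching",
    [("database", "restore data"),
     ("database", "verify consistency"),
     ("web", "switch traffic")],
    {"policy": {"trigger": "after_backup"},
     "verification": {"check_ports": [5432]}},
)
# Each recovery step of a function point becomes one workflow node.
workflow_nodes = [node.name for node in plan.nodes]
```

Here every recovery step is recorded with its upstream node, so the same structure can later be used to connect the nodes into a workflow.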
S120, generating a disaster recovery workflow according to the configuration attribute parameters of the corresponding workflow nodes in the disaster recovery plan and the disaster recovery configuration.
The configuration attribute parameters refer to the attribute parameters configured for the workflow nodes. By way of example, the configuration attribute parameters may include, but are not limited to, the flow logic between workflow nodes, the execution order, and the execution task information, and may be regarded as the attribute information defined for the workflow nodes. In actual operation, the upstream nodes of a workflow node, the upstream node input parameters, service parameters, execution conditions, a capability list and the like can be defined, and custom functions, exception handling for function output, output parameters, flow rules and the like can be implemented according to the workflow node input. In an embodiment, configuration scripts, node input parameters, and node output parameters may be set for the workflow node to enable the workflow node to perform the corresponding recovery step or recovery function. In an embodiment, the configuration script may be in, but is not limited to, JSON or YAML format.
Disaster recovery workflow refers to an executable business process, comprising a series of functional processing steps, that is orchestrated according to business needs. A disaster recovery plan may be executed automatically in accordance with a disaster recovery workflow, and the disaster recovery workflow may be generated in accordance with the disaster recovery plan. In actual operation, the disaster recovery workflow may be orchestrated according to the configuration attribute parameters of the workflow nodes. For different fault scenarios of the service system to be recovered, disaster recovery workflows for service switching, simulation drills, live drills and service recovery can be generated correspondingly.
In actual operation, different disaster recovery workflows largely share the same flow. Orchestrating each similar disaster recovery workflow from scratch repeats work, is laborious, and is prone to errors. Therefore, an existing disaster recovery workflow can be duplicated and only part of the workflow node configuration modified, so that disaster recovery plans for other scenarios can be generated quickly. The system thus has the capability of automatically orchestrating and generating disaster recovery workflows, which improves orchestration efficiency.
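A minimal sketch of how a JSON configuration script could carry such configuration attribute parameters is given below. The field names (`upstream`, `inputs`, `outputs`, `execute_if`, `on_error`), the `WorkflowNode` class and the sample values are assumptions made for illustration; the source only states that the script may be in JSON or YAML format.

```python
import json
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List

# Hypothetical configuration script for one workflow node.
CONFIG_SCRIPT = """
{
  "name": "restore_database",
  "upstream": ["mount_backup"],
  "inputs": {"backup_id": "latest", "target_host": "dr-db-01"},
  "outputs": ["restored_path"],
  "execute_if": "upstream_succeeded",
  "on_error": "abort"
}
"""


@dataclass
class WorkflowNode:
    name: str
    upstream: List[str] = field(default_factory=list)      # upstream nodes
    inputs: Dict[str, Any] = field(default_factory=dict)   # node input parameters
    outputs: List[str] = field(default_factory=list)       # node output parameters
    execute_if: str = "always"                             # execution condition
    on_error: str = "abort"                                # exception handling rule


def node_from_script(script: str) -> WorkflowNode:
    """Parse a JSON configuration script and reject unknown attributes."""
    raw = json.loads(script)
    if "name" not in raw:
        raise ValueError("a workflow node needs a name")
    allowed = {f.name for f in fields(WorkflowNode)}
    unknown = set(raw) - allowed
    if unknown:
        raise ValueError(f"unknown attributes: {sorted(unknown)}")
    return WorkflowNode(**raw)


node = node_from_script(CONFIG_SCRIPT)
```

Rejecting unknown attributes early is a design choice of this sketch, intended to catch typos in a script before the node is scheduled.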
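The duplication described above can be sketched as a deep copy followed by targeted overrides. The dictionary layout, the workflow names and the function `clone_workflow` are hypothetical and serve only to illustrate the idea.

```python
import copy

# Hypothetical existing workflow for the service switching scenario.
service_switch = {
    "name": "order-system/service-switching",
    "nodes": {
        "restore_data": {"inputs": {"target": "dr-site", "mode": "takeover"}},
        "verify": {"inputs": {"checks": ["ports", "api"]}},
        "switch_traffic": {"inputs": {"dns_record": "orders.example.com"}},
    },
    "edges": [("restore_data", "verify"), ("verify", "switch_traffic")],
}


def clone_workflow(source, new_name, overrides):
    """Copy a workflow and change only the listed node input parameters."""
    clone = copy.deepcopy(source)          # the source workflow stays untouched
    clone["name"] = new_name
    for node_name, inputs in overrides.items():
        clone["nodes"][node_name]["inputs"].update(inputs)
    return clone


drill = clone_workflow(
    service_switch,
    "order-system/simulation-drill",
    {"restore_data": {"target": "sandbox", "mode": "drill"}},
)
```

A deep copy is used so that editing the drill workflow cannot silently alter the node configuration of the workflow it was copied from.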
In the embodiment of the invention, configuration attribute parameters of workflow nodes such as flow logic, execution sequence, execution task information and the like in a disaster recovery plan can be extracted, configuration scripts, node input parameters and node output parameters are extracted to configure each workflow node, and then each workflow node is connected according to the flow logic and the execution sequence among the workflow nodes to generate a disaster recovery workflow. During actual operation, the workflow nodes may be orchestrated by an orchestration engine to generate a disaster recovery workflow. In order to meet the business requirements of the disaster recovery plan, the execution sequence among the workflow nodes can be connected through arrows, the association relation between the upstream workflow nodes and the downstream workflow nodes is recorded, and disaster recovery configuration is set for the disaster recovery workflow.
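One way to picture the connection step is as building a directed graph from the recorded upstream/downstream relations and deriving an execution order from it. The sketch below uses the standard library `graphlib` module (Python 3.9+); the node names and the function `build_workflow` are assumptions, and a real orchestration engine would do considerably more.

```python
from graphlib import CycleError, TopologicalSorter


def build_workflow(upstream_of):
    """Connect nodes from a mapping of node -> set of upstream nodes."""
    try:
        order = list(TopologicalSorter(upstream_of).static_order())
    except CycleError as exc:
        # args[1] holds the list of nodes forming the detected cycle
        raise ValueError(f"workflow contains a cycle: {exc.args[1]}") from exc
    # Each edge is the "arrow" from an upstream node to a downstream node.
    edges = [(up, node) for node, ups in upstream_of.items() for up in sorted(ups)]
    return {"order": order, "edges": edges}


upstream_of = {
    "mount_backup": set(),
    "restore_database": {"mount_backup"},
    "restore_files": {"mount_backup"},
    "verify": {"restore_database", "restore_files"},
    "switch_traffic": {"verify"},
}
workflow = build_workflow(upstream_of)
```

The resulting order always places a node after all of its upstream nodes; the relative order of independent nodes such as `restore_database` and `restore_files` is not significant.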
In application, however, to ensure scheduling efficiency and correct scheduling decision logic, the parallel orchestration and conditional branch orchestration of the disaster recovery workflow need to be controlled by gateway nodes:
Parallel gateway node: the nodes connected to the gateway outlet can be executed in parallel.
Parallel synchronous gateway node: the flow can advance only after all nodes connected to the gateway inlet have finished executing.
Conditional gateway node: each node connected to the gateway outlet has an associated execution condition, and only the node that satisfies its condition is executed. These conditions are mutually exclusive and cannot be satisfied at the same time.
Multiple selection gateway node: each node connected to the gateway outlet has an associated execution condition, and only nodes that satisfy their conditions are executed. These conditions are not mutually exclusive, so several conditions may be satisfied at the same time and several nodes may be executed.
Multi-path merging gateway node: used together with the multiple selection gateway node; the flow can advance only after all the nodes that satisfied their conditions have finished executing.
Event gateway: the flow can advance only after a specified event or message is received.
According to the flow logic between the workflow nodes, the gateway nodes corresponding to the workflow nodes can be determined, so that the disaster recovery workflow can be scheduled as required.
In actual operation, the execution scheduling modes of an orchestrated disaster recovery workflow can include serial scheduling, parallel scheduling and conditional branch scheduling. In serial scheduling, the workflow nodes are scheduled and executed one after another in order. In parallel scheduling, a parallel split gateway is present and multiple links are scheduled and executed simultaneously. In conditional branch scheduling, the link to execute is determined by matching the execution result of the preceding node against the execution conditions of the branch routes. A disaster recovery workflow corresponding to the service requirement can be generated with the corresponding orchestration mode. In addition, by setting a disaster recovery policy, the disaster recovery workflow can be triggered automatically, the disaster recovery resources can be cleaned up automatically after disaster recovery or a drill, and a disaster recovery drill report can be generated automatically from a report template instance. The whole process is thereby scheduled automatically and cyclically, achieving automated disaster recovery and drills.
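The difference between the conditional gateway, the multiple selection gateway and the merging gateways can be sketched as three small functions. The function names, the context keys and the choice to raise an error when a conditional gateway does not match exactly one branch are assumptions of this sketch, not behavior stated in the source.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Context = Dict[str, object]
Branch = Tuple[str, Callable[[Context], bool]]   # (node name, execution condition)


def conditional_gateway(branches: List[Branch], context: Context) -> str:
    """Mutually exclusive conditions: exactly one outgoing node is executed."""
    matched = [node for node, condition in branches if condition(context)]
    if len(matched) != 1:
        raise ValueError(f"expected exactly one matching branch, got {matched}")
    return matched[0]


def multiple_selection_gateway(branches: List[Branch], context: Context) -> List[str]:
    """Non-exclusive conditions: every node whose condition holds is executed."""
    return [node for node, condition in branches if condition(context)]


def merge_ready(expected: Iterable[str], finished: Iterable[str]) -> bool:
    """Merging gateway: advance only once all expected nodes have finished."""
    return set(expected) <= set(finished)


context = {"data_consistent": True, "virus_found": True}

next_step = conditional_gateway(
    [("switch_traffic", lambda c: c["data_consistent"]),
     ("rollback", lambda c: not c["data_consistent"])],
    context,
)

selected = multiple_selection_gateway(
    [("virus_scan_report", lambda c: c["virus_found"]),
     ("send_mail", lambda c: True),
     ("rollback_report", lambda c: not c["data_consistent"])],
    context,
)
```

With this context, the conditional gateway routes to `switch_traffic`, while the multiple selection gateway starts both `virus_scan_report` and `send_mail`; `merge_ready(selected, finished)` then stays false until both have completed.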
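The three scheduling modes can be illustrated with a small sketch in which each workflow node is a function that takes and returns a context dictionary. The node functions, the context keys and the use of a thread pool for parallel links are assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_serial(steps, context):
    """Serial scheduling: execute nodes one after another in order."""
    for step in steps:
        context = step(context)
    return context


def run_parallel(branches, context):
    """Parallel scheduling: run all links at once, then merge their outputs."""
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        futures = [pool.submit(branch, dict(context)) for branch in branches]
        results = [future.result() for future in futures]   # wait for every link
    merged = dict(context)
    for result in results:
        merged.update(result)
    return merged


def run_conditional(routes, context):
    """Conditional branch scheduling: run the first route whose condition matches."""
    for condition, step in routes:
        if condition(context):
            return step(context)
    return context


# Hypothetical workflow nodes.
def restore_db(ctx):
    return {**ctx, "db": "restored"}

def restore_files(ctx):
    return {**ctx, "files": "restored"}

def verify(ctx):
    ok = ctx.get("db") == "restored" and ctx.get("files") == "restored"
    return {**ctx, "verified": ok}

def switch_traffic(ctx):
    return {**ctx, "result": "traffic switched"}

def clean_up(ctx):
    return {**ctx, "result": "resources cleaned"}


ctx = run_parallel([restore_db, restore_files], {})
ctx = run_serial([verify], ctx)
ctx = run_conditional(
    [(lambda c: c["verified"], switch_traffic),
     (lambda c: not c["verified"], clean_up)],
    ctx,
)
```

Each parallel link receives its own copy of the context, so links cannot interfere with one another before their outputs are merged.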
S130, managing the disaster recovery workflow according to the hierarchical structure of the service system to be recovered.
In the embodiment of the invention, a service system to be recovered that supports large-scale informatized business has a complex and varied system architecture, deployment, and set of application components. The disaster recovery workflows can be stored and managed according to the hierarchical structure of the service system to be recovered. When the whole service system to be recovered fails, the whole system can then be switched conveniently; when a single service system component or only some components fail, only those components are switched as needed.
According to the embodiment of the invention, a corresponding disaster recovery plan is generated according to the service requirement of the service system to be recovered, and the workflow nodes corresponding to the disaster recovery plan are determined; a disaster recovery workflow is generated according to the configuration attribute parameters of the corresponding workflow nodes in the disaster recovery plan and the disaster recovery configuration; and the disaster recovery workflow is managed according to the hierarchical structure of the service system to be recovered. The disaster recovery plan is thus managed with the disaster recovery workflow as its carrier, and adapted disaster recovery plans are provided for different service systems to be recovered, different fault scenarios, and faults of different granularity, so that when a given fault occurs, the matching disaster recovery plan can be selected quickly to respond.
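Hierarchical management might be pictured as a tree that mirrors the service system, with workflows registered at the system level or at component level. The classes `HierarchyNode` and `WorkflowCatalog`, the component names, and in particular the rule of falling back to the nearest ancestor that has a workflow for the scenario are assumptions of this sketch rather than details given in the source.

```python
class HierarchyNode:
    """One level of the service system: the system itself or a component."""

    def __init__(self, name):
        self.name = name
        self.children = {}     # component name -> HierarchyNode
        self.workflows = {}    # fault scenario -> workflow identifier


class WorkflowCatalog:
    def __init__(self, system):
        self.root = HierarchyNode(system)

    def register(self, path, scenario, workflow):
        """Store a workflow at the level addressed by path (() is the whole system)."""
        node = self.root
        for part in path:
            node = node.children.setdefault(part, HierarchyNode(part))
        node.workflows[scenario] = workflow

    def select(self, path, scenario):
        """Return the most specific workflow on the path (fallback rule is assumed)."""
        node = self.root
        best = node.workflows.get(scenario)
        for part in path:
            node = node.children.get(part)
            if node is None:
                break
            best = node.workflows.get(scenario, best)
        return best


catalog = WorkflowCatalog("order-system")
catalog.register((), "service switching", "switch-whole-system")
catalog.register(("database",), "service switching", "switch-database-only")

for_database = catalog.select(("database",), "service switching")
for_web = catalog.select(("web",), "service switching")
```

A fault confined to the database component selects the component-level workflow, while a fault in a component without its own workflow falls back to the system-level one.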
In one embodiment, the disaster recovery plan management method further comprises:
storing each configured workflow node, and binding the workflow node to each corresponding disaster recovery workflow.
In the embodiment of the invention, a workflow node can be saved after its configuration is completed. For the same service system to be recovered, the disaster recovery workflows corresponding to the disaster recovery plans for different fault scenarios, such as service recovery, service switching, simulation drills, live drills and service return, differ in their overall flow and in the functions of the individual links, but the workflow nodes that implement a given link function are the same across the different disaster recovery workflows. Each configured workflow node can therefore be stored and bound to each disaster recovery workflow that uses it, which avoids repeated orchestration work and makes orchestration more convenient.
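Storing configured nodes once and binding them to several workflows can be sketched with a small registry. The class `NodeStore`, its methods and the sample node and workflow names are hypothetical.

```python
from collections import defaultdict


class NodeStore:
    """Keeps configured workflow nodes and records which workflows use them."""

    def __init__(self):
        self.nodes = {}                    # node id -> saved configuration
        self.bindings = defaultdict(set)   # node id -> names of bound workflows

    def save(self, node_id, config):
        self.nodes[node_id] = config

    def bind(self, workflow, node_ids):
        """Bind saved nodes to a workflow and return their configurations."""
        missing = [node_id for node_id in node_ids if node_id not in self.nodes]
        if missing:
            raise KeyError(f"nodes not saved yet: {missing}")
        for node_id in node_ids:
            self.bindings[node_id].add(workflow)
        return [self.nodes[node_id] for node_id in node_ids]


store = NodeStore()
store.save("restore_database", {"inputs": {"backup_id": "latest"}})
store.save("verify", {"inputs": {"checks": ["ports", "api"]}})
store.save("switch_traffic", {"inputs": {"dns_record": "orders.example.com"}})
store.save("clean_up", {"inputs": {"scope": "dr-resources"}})

store.bind("service-switching", ["restore_database", "verify", "switch_traffic"])
store.bind("simulation-drill", ["restore_database", "verify", "clean_up"])

# The same saved node is reused by both workflows.
shared = sorted(store.bindings["restore_database"])
```

Because both workflows refer to the same saved node, its configuration is written once and only the workflow-specific nodes need to be added.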
Example two
Fig. 2 is a flowchart of another disaster recovery plan management method according to a second embodiment of the present invention, which is further optimized and expanded based on the foregoing embodiment, and can be combined with various alternative solutions in the foregoing embodiment. As shown in fig. 2, the method includes:
S210, receiving service demands of the service system to be recovered, and generating workflow description information and disaster recovery configuration according to the function points of the service demands of the service system to be recovered.
The workflow description information comprises at least: the circulation logic, execution sequence and execution task information among workflow nodes; the disaster recovery configuration is the disaster recovery configuration capability configured for the disaster recovery plan.
The circulation logic between the workflow nodes refers to the flow relation between the workflow nodes; the execution sequence refers to the order in which the workflow nodes are executed; the execution task information may be understood as the execution rules of a workflow node, and may include configuration scripts, node input parameters, and node output parameters, so that the workflow node executes according to the corresponding rules. The function point of the system to be recovered can be understood as a function to be recovered that corresponds to the user demand.
In the embodiment of the invention, the service requirement of the service system to be recovered can be received, and the function points of the system to be recovered are abstracted from the service requirement to obtain an executable service flow consisting of a series of functional processing steps. The circulation logic, execution sequence and execution task information among the workflow nodes corresponding to the function points of the system to be recovered are generated as the workflow description information, and the disaster recovery configuration is determined according to the service requirement.
S220, determining a recovery step corresponding to the system function point to be recovered in the workflow description information as a workflow node.
In the embodiment of the invention, each recovery step of the corresponding system function point to be recovered in the workflow description information can be determined as a workflow node. Each workflow node may execute independently.
S230, extracting configuration attribute parameters and disaster recovery configuration of workflow nodes in a disaster recovery plan, wherein the configuration attribute parameters at least comprise circulation logic, execution sequence and execution task information among the workflow nodes.
In the embodiment of the invention, the circulation logic among the workflow nodes, the execution sequence, and the execution task information of each workflow node, together with the disaster recovery configuration, can be extracted so as to configure each workflow node and generate a disaster recovery workflow.
S240, configuring each workflow node according to the configuration script, the node input parameters and the node output parameters in the execution task information.
In the embodiment of the invention, the configuration script, the node input parameters and the node output parameters in the execution task information can be extracted, and each workflow node is configured according to the configuration script, the node input parameters and the node output parameters, so that the workflow node realizes the self-defined recovery step function. In the actual operation process, the configuration script is in JSON or YAML format, and cross-language is supported, so that the configuration script is compatible with any service requirement.
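As a minimal sketch of step S240, the following Python fragment configures a workflow node from execution task information supplied as JSON. The field names (`name`, `script`, `inputs`, `outputs`) and the `WorkflowNode` class are hypothetical and chosen only for illustration; the embodiment does not prescribe a schema.

```python
import json
from dataclasses import dataclass, field


@dataclass
class WorkflowNode:
    """Hypothetical container for one configured workflow node."""
    name: str
    script: str                                   # configuration script reference
    inputs: dict = field(default_factory=dict)    # node input parameters
    outputs: list = field(default_factory=list)   # names of node output parameters


def configure_node(task_info_json: str) -> WorkflowNode:
    """Build a node from execution task information given as a JSON string."""
    info = json.loads(task_info_json)
    return WorkflowNode(
        name=info["name"],
        script=info["script"],
        inputs=info.get("inputs", {}),
        outputs=info.get("outputs", []),
    )


# Illustrative task information (all values are assumptions).
task_info = json.dumps({
    "name": "restore_mysql",
    "script": "restore_mysql.sh",
    "inputs": {"target_host": "dr-host-1"},
    "outputs": ["restore_status"],
})
node = configure_node(task_info)
```

Because the task information is plain JSON, the same description could be produced or consumed by tooling written in any language, which is the cross-language property the paragraph above refers to.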
S250, connecting the workflow nodes according to the circulation logic and the execution sequence among the workflow nodes to generate a disaster recovery workflow, and setting disaster recovery configuration for the disaster recovery workflow.
In the embodiment of the invention, after each workflow node is configured, each workflow node can be connected according to the circulation logic and the execution sequence among the workflow nodes to generate a corresponding disaster recovery workflow, and disaster recovery configuration is set for the disaster recovery workflow, so that the disaster recovery workflow realizes a corresponding disaster recovery function.
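Step S250 can be sketched as follows, assuming the execution sequence is recorded as a mapping from each node to its upstream nodes. The node names are hypothetical, and the standard-library `graphlib.TopologicalSorter` is used here only as one convenient way to derive an execution order; the embodiment does not name a specific algorithm.

```python
from graphlib import TopologicalSorter


def connect_nodes(upstream: dict) -> list:
    """Return one valid execution order for the connected workflow.

    `upstream` maps each node name to the set of nodes that must
    finish before it starts.
    """
    return list(TopologicalSorter(upstream).static_order())


# Hypothetical three-step serial workflow.
upstream = {
    "restore_data": set(),
    "start_services": {"restore_data"},
    "verify_services": {"start_services"},
}
order = connect_nodes(upstream)
```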
In one embodiment, connecting the workflow nodes in accordance with the flow logic and execution order between the workflow nodes generates a disaster recovery workflow comprising:
Determining gateway nodes corresponding to the workflow nodes according to the circulation logic among the workflow nodes; wherein the gateway node comprises at least one of: parallel gateway node, parallel synchronous gateway node, conditional gateway node, multiple selection gateway node and event gateway;
and connecting each workflow node with the gateway node corresponding to the workflow node according to the execution sequence to generate a disaster recovery workflow.
A parallel gateway node may be understood to mean that the several nodes connected to the gateway outlet can execute in parallel. A parallel synchronous gateway node may be understood to mean that the flow advances only after all of the nodes connected to the gateway inlet have finished executing. A conditional gateway node may be understood to mean that each node connected to the gateway outlet has an associated execution condition and only the node meeting its condition is executed; these conditions are mutually exclusive and cannot be met at the same time. A multiple selection gateway node may be understood to mean that each node connected to the gateway outlet has an associated execution condition and only nodes meeting their conditions are executed; the conditions are not mutually exclusive, so several conditions may be met at the same time and several nodes may be executed. A multi-path merging gateway node can be used together with the multiple selection gateway node, so that the flow advances only after all of the nodes that met their conditions have finished executing. An event gateway may be understood to mean that the flow advances only when a specified event or message is received.
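The difference between a conditional gateway node and a multiple selection gateway node can be illustrated with a short sketch. The function name, the string labels for the gateway types, and the representation of conditions as Python predicates are all assumptions made for illustration.

```python
def select_branches(gateway_type, branches, context):
    """Return the names of the outgoing nodes whose conditions are met.

    `branches` is a list of (node_name, predicate) pairs and `context`
    is the data the predicates are evaluated against.
    """
    matched = [name for name, predicate in branches if predicate(context)]
    if gateway_type == "conditional":
        # Conditions must be mutually exclusive: at most one may match.
        if len(matched) > 1:
            raise ValueError("conditional gateway conditions must be mutually exclusive")
        return matched
    if gateway_type == "multi_select":
        # Conditions may overlap: every matching node is executed.
        return matched
    raise ValueError(f"unsupported gateway type: {gateway_type}")


# Hypothetical branches keyed on the size of the data to recover.
branches = [
    ("verify_data", lambda ctx: ctx["size_gb"] > 0),
    ("scan_virus", lambda ctx: ctx["size_gb"] > 10),
]
selected = select_branches("multi_select", branches, {"size_gb": 50})
```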
In the embodiment of the invention, the gateway nodes corresponding to the workflow nodes can be determined according to the circulation logic among the workflow nodes, and the workflow nodes and the gateway nodes corresponding to the workflow nodes can be connected according to the execution sequence to generate the disaster recovery workflow. In the actual operation process, the execution sequence among the workflow nodes can be connected through arrows, and the association relation of the upstream node and the downstream node is recorded. When the corresponding gateway node is needed, the workflow node and the corresponding gateway node are connected through the arrow, so that the scheduling efficiency and scheduling judgment logic are ensured, and the parallel arrangement and conditional branch arrangement of the disaster recovery workflow are controlled through the gateway node.
In an embodiment, determining a gateway node corresponding to each workflow node according to a flow logic between the workflow nodes includes:
when the circulation logic between at least two workflow nodes is parallel, determining the gateway node corresponding to the parallel workflow node as a parallel gateway node;
when the circulation logic between at least two workflow nodes is parallel and the workflow nodes enter the next workflow node after the execution of the workflow nodes is completed, determining the gateway node corresponding to the workflow nodes as a parallel synchronous gateway node;
When the circulation logic between at least two workflow nodes is the workflow nodes with the associated first execution conditions and the workflow nodes meeting the first execution conditions are executed, determining the gateway node corresponding to the workflow nodes as a conditional gateway node; wherein each first execution condition is a mutual exclusion condition;
when the circulation logic between at least two workflow nodes is the workflow nodes with the associated second execution conditions and the workflow nodes meeting the second execution conditions are executed, determining the gateway node corresponding to the workflow nodes as a multipath selection gateway node; wherein each second execution condition is a non-exclusive condition;
when the workflow node receives a preset instruction and executes the instruction, determining the gateway node corresponding to the workflow node as an event node.
The first execution conditions are mutually exclusive conditions, and the first execution conditions corresponding to the workflow nodes can be considered to be mutually exclusive and cannot be met simultaneously. The second execution condition is a non-exclusive condition, and it is understood that at the same time, it is possible to satisfy a plurality of second execution conditions, so that a plurality of workflow nodes will be executed.
In the embodiment of the invention, when the circulation logic between two workflow nodes is only parallel, the gateway node corresponding to the parallel workflow node can be determined to be the parallel gateway node. When the flow logic between at least two workflow nodes is parallel and the workflow nodes enter the next workflow node after the execution of the workflow nodes is completed, the gateway node corresponding to the workflow nodes can be determined to be a parallel synchronous gateway node. When the circulation logic among the workflow nodes is the workflow nodes with the associated first execution conditions and the workflow nodes meeting the first execution conditions are executed, the gateway node corresponding to the workflow nodes is determined to be a conditional gateway node. When the circulation logic among the workflow nodes is the workflow nodes with the associated second execution conditions, and the workflow nodes meeting the second execution conditions execute, determining the gateway node corresponding to the workflow nodes as a multi-path selection gateway node, and when the workflow nodes receive a preset instruction and execute, determining the gateway node corresponding to the workflow nodes as an event node. And selecting gateway nodes corresponding to the workflow nodes according to the circulation logic among the workflow nodes, and realizing three main workflow scheduling modes of serial, parallel and conditional branches according to the corresponding gateway nodes.
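The selection rules above can be summarized in a small sketch. The `Gateway` enumeration and the boolean flags describing the circulation logic are hypothetical; they only encode the five cases listed in this embodiment, with a plain serial connection needing no gateway.

```python
from enum import Enum
from typing import Optional


class Gateway(Enum):
    PARALLEL = "parallel"
    PARALLEL_SYNC = "parallel_sync"
    CONDITIONAL = "conditional"
    MULTI_SELECT = "multi_select"
    EVENT = "event"


def choose_gateway(parallel: bool = False, wait_for_all: bool = False,
                   has_conditions: bool = False, exclusive: bool = True,
                   event_triggered: bool = False) -> Optional[Gateway]:
    """Map the circulation logic between workflow nodes to a gateway type."""
    if event_triggered:
        return Gateway.EVENT
    if has_conditions:
        return Gateway.CONDITIONAL if exclusive else Gateway.MULTI_SELECT
    if parallel:
        return Gateway.PARALLEL_SYNC if wait_for_all else Gateway.PARALLEL
    return None  # serial flow: nodes are connected directly


gateway = choose_gateway(parallel=True, wait_for_all=True)
```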
S260, determining the hierarchical structure of the service system to be recovered, and extracting disaster recovery workflows belonging to the same service system to be recovered.
In the embodiment of the invention, the hierarchical structure can be determined according to the system architecture of the service system to be recovered, and the disaster recovery workflows belonging to the same service system to be recovered can be extracted. For example, the same service system to be recovered may include a plurality of application components, each of which may further correspond to sub-application components; accordingly, the service system to be recovered, each application component, and each sub-application component are located at different levels of the hierarchy.
S270, storing and managing disaster recovery workflows belonging to the same service system to be recovered according to the hierarchical structure of the corresponding service system to be recovered.
In the embodiment of the invention, the disaster recovery workflow belonging to the same service system to be recovered can be stored and managed according to the hierarchical structure of the corresponding service system to be recovered, so that the disaster recovery plan can be managed by taking the disaster recovery workflow as a carrier.
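A minimal sketch of steps S260 and S270 follows, assuming each level of the hierarchy is identified by a path such as (system, component) and each workflow by a fault-scenario name. The `WorkflowCatalog` class and all names used with it are hypothetical.

```python
class WorkflowCatalog:
    """Stores disaster recovery workflows keyed by hierarchy path and scenario."""

    def __init__(self):
        self._store = {}  # path tuple -> {scenario name: workflow}

    def save(self, path, scenario, workflow):
        self._store.setdefault(tuple(path), {})[scenario] = workflow

    def find(self, path, scenario):
        """Return the workflow stored for this level and scenario, or None."""
        return self._store.get(tuple(path), {}).get(scenario)

    def children(self, path):
        """List the names directly below `path` in the hierarchy."""
        prefix = tuple(path)
        depth = len(prefix)
        return sorted({p[depth] for p in self._store
                       if len(p) > depth and p[:depth] == prefix})


catalog = WorkflowCatalog()
catalog.save(["order_system"], "takeover", "whole-system takeover workflow")
catalog.save(["order_system", "database"], "takeover", "database takeover workflow")
catalog.save(["order_system", "middleware"], "drill", "middleware drill workflow")
```

With such a layout, a whole-system failure would look up the workflow stored at the top-level path, while a component failure would look up only the path of the affected component.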
According to the embodiment of the invention, the service requirement of the service system to be recovered is received, and the workflow description information and the disaster recovery configuration are generated according to the function points of the system to be recovered in the service requirement; the recovery steps corresponding to those function points in the workflow description information are determined as workflow nodes; the configuration attribute parameters and the disaster recovery configuration of the workflow nodes in the disaster recovery plan are extracted; and each workflow node is configured according to the configuration script, the node input parameters, and the node output parameters in the execution task information, so that each workflow node is configured individually. The workflow nodes are connected according to the circulation logic and the execution sequence among them to generate the disaster recovery workflow, and the disaster recovery configuration is set for the disaster recovery workflow, so that the disaster recovery workflow is generated automatically. The hierarchical structure of the service system to be recovered is determined, the disaster recovery workflows belonging to the same service system to be recovered are extracted, and they are stored and managed according to that hierarchical structure. This makes it convenient to switch the whole service system when the whole service system fails, and to switch only the affected components as required when a single or some service system components fail, which improves the user experience.
Example III
In this embodiment, on the basis of the foregoing embodiment, a workflow is taken as a disaster recovery workflow, a node is taken as a workflow node, an automation orchestration engine is configured to generate the disaster recovery workflow, and an application system is taken as a service system to be recovered as an example, so as to further describe a disaster recovery plan management method.
A disaster recovery plan is an emergency plan formulated for the various fault scenarios of an application system. In an emergency it is used for service takeover to ensure service continuity; in daily operation it is used for simulation drills that exercise the recovery capability of the disaster recovery plan and improve the response capability of personnel; and after a service takeover or an actual combat drill, data and services need to be migrated back to the production system. The flows of the disaster recovery plans in these scenarios therefore differ; specifically, the link functions that carry the service requirements in each flow link, the execution timing, and the scheduling model differ. However, it is not advisable to hard-code the flow of the disaster recovery plan for each scenario, because many of the link functions involved in the different flows are identical.
The link functions involved in the flows of the disaster recovery plans need to be abstracted as complete functions, in order to avoid a large amount of custom development work and minor changes. The existing mature functions of the existing disaster recovery system and of other business continuity assurance systems are reused, new functions are developed as required, and together they provide the functions required by the disaster recovery plan.
Generally, link functions abstracted from the flow of disaster recovery planning include: recovery policies, data recovery, application configuration modifications, application service restart, validation (including application availability validation, data consistency validation), recovery resource clean-up, traffic flow switchover, generation of disaster recovery reports, and the like. In the method, the functions are subjected to strategy configuration according to the business requirements of disaster recovery plans of different fault scenes.
The policy configuration provides disaster recovery policies, including automatic triggering after a backup/replication completes and periodic automatic triggering. Periodic automatic triggering includes periodically selecting the latest backup/replication data for triggering and periodically batch-triggering all backup/replication data within a certain time period; if a periodic trigger falls within a configured sensitive period that must be excluded, the trigger is automatically delayed. Disaster recovery policies meeting different requirements are thereby provided.
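The delayed-trigger behavior can be sketched as below. This is one possible reading, assuming the excluded period is a daily time window that does not cross midnight and that a trigger falling inside it is postponed to the end of the window; the function and parameter names are hypothetical.

```python
from datetime import datetime, time


def next_trigger(scheduled: datetime, excluded_start: time, excluded_end: time) -> datetime:
    """Return the actual trigger time for a periodic disaster recovery policy.

    Assumes excluded_start < excluded_end on the same day. A trigger that
    falls inside the excluded window is delayed to the end of that window.
    """
    if excluded_start <= scheduled.time() < excluded_end:
        return datetime.combine(scheduled.date(), excluded_end)
    return scheduled


# Hypothetical policy: do not trigger during business hours (09:00 to 18:00).
delayed = next_trigger(datetime(2024, 1, 1, 9, 30), time(9, 0), time(18, 0))
```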
The recovery configuration provides data recovery/mount recovery of backup/replication data for the application components of an application system, such as files, databases, middleware, and hosts, including recovering the data into disaster recovery resources.
The application configuration provides modification of the configuration files of the application system or its components, in formats including ini/xml/json, and restart of the process/service/network of the application system or application component so that the configuration modification takes effect.
The verification configuration provides post-recovery data consistency verification and application availability verification for the application systems/application system components in the disaster recovery resources after a service takeover or drill. Data consistency verification checks whether the data fingerprints of the production system data match those of the recovered/replicated data in the disaster recovery resources. Application availability verification checks that the application systems/application components in the recovered/replicated disaster recovery resources work normally, for example that the application software opens normally, using the service state, port listening state, process state, host network, HTTP API response state, and the like.
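Data consistency verification by fingerprint comparison can be sketched as follows. The embodiment does not name a fingerprint algorithm; SHA-256 from the Python standard library is used here purely as an assumption, and the function names are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Compute a data fingerprint (SHA-256 is an assumption, not from the source)."""
    return hashlib.sha256(data).hexdigest()


def is_consistent(production_data: bytes, recovered_data: bytes) -> bool:
    """Return True when the production and recovered data fingerprints match."""
    return fingerprint(production_data) == fingerprint(recovered_data)


consistent = is_consistent(b"order table snapshot", b"order table snapshot")
```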
The cleanup configuration provides for the deletion of physical data of the application system/application system component and the restoration of the state of the application system/application system component.
The network configuration provides for configuring the network configuration of the host network card, the switch and the router, and configuring the mapping relation between the domain name of the DNS server and the IP.
The virus killing configuration provides virus scanning, virus file repair, virus deletion, virus file isolation for the recovered/copied data, and parallel virus killing control for the configuration of the number of instances of the killing engine.
The report configuration provides the self-definition of the content of the disaster recovery report template, records the content including data recovery, data consistency verification/application availability verification after recovery, recovery resource cleaning, virus killing condition, RTO, RPO, service switching/drilling result and the like, and can generate and send mails.
The script configuration provides standard script execution and response return formats, provides metadata management of the parameters related to a script, sets initialization parameters, and supports adding and deleting parameters as needed; complete functions not covered by the above configurations are configured by writing scripts.
After the complete function abstraction is carried out on each link of the flow of the disaster recovery plan and the independent configuration is carried out, the independent function configuration is required to be automatically arranged according to the disaster recovery business requirement to generate orderly executable workflow, and the orderly executable workflow can be scheduled and executed according to the arranged workflow time sequence. In one embodiment, the process scheduling and execution of the disaster recovery plan may be performed by an automated orchestration engine.
In one embodiment, FIG. 3 is a schematic diagram of a disaster recovery workflow generation provided in accordance with a third embodiment of the present invention. The scheduling engine is an engine for scheduling workflow scheduling and executing, and the engine provides automatic scheduling capability in combination with the workflow, so that disaster recovery process automation is realized, timely service taking over when a disaster occurs is ensured, meanwhile, the working efficiency is improved, and the human resource investment is reduced.
The orchestration engine provides three main orchestration scheduling models of serial, parallel and conditional branches, which are respectively applicable to disaster recovery of different application systems.
According to the modeling model of the application system and the fault scene, a disaster recovery plan workflow conforming to the scheduling model of the application system is automatically arranged, a disaster recovery strategy is set for the workflow to automatically trigger, disaster recovery resources after disaster recovery and exercise are automatically cleaned, a disaster recovery exercise report is automatically generated according to a report template example, and automatic and cyclic scheduling of the whole process is realized, so that the effect of automatic disaster recovery and exercise is achieved.
Each link of the flow of the disaster recovery plan is called a node. A node uses one of the independent complete functions of the disaster recovery configuration described above, including the existing mature functions of the existing disaster recovery system and of other business continuity assurance systems, as well as new functions developed as required. A workflow also requires scheduling rules for the execution conditions, data interaction, and flow direction among nodes. The attributes of each node therefore need to be defined.
In one embodiment, fig. 4 is a schematic diagram of a workflow node configuration according to a third embodiment of the present invention. The definition of a node may cover its upstream nodes, the input parameters from the upstream nodes, service parameters, execution conditions, a capability list, and the like; the node implements its custom function according to the node input, and the definition further covers exception handling, output parameters, and the circulation rules for the function output. By implementation, nodes can be divided into existing service nodes, which complete a certain mature function, and fully customized script nodes, which can meet any service requirement. The service parameters support JSON or YAML format and are cross-language, and are therefore compatible with the service parameters of any service requirement. As long as the node definition requirements are met, the specific function processed by a node is clear and the node is open to extension.
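A node definition of this kind can be sketched as follows. The `NodeDef` fields and the result dictionary layout are hypothetical; the sketch only illustrates an execution condition, a custom function driven by the node input, and exception handling that still produces output parameters.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class NodeDef:
    """Hypothetical attribute definition of one workflow node."""
    name: str
    run: Callable[[dict], dict]                           # custom function: inputs -> outputs
    upstream: list = field(default_factory=list)          # names of upstream nodes
    condition: Optional[Callable[[dict], bool]] = None    # execution condition
    on_error: Optional[Callable[[Exception], dict]] = None  # exception handling


def execute(node: NodeDef, params: dict) -> dict:
    """Run a node and return its output parameters together with a status."""
    if node.condition is not None and not node.condition(params):
        return {"status": "skipped"}
    try:
        return {"status": "ok", **node.run(params)}
    except Exception as exc:
        if node.on_error is None:
            raise
        return {"status": "failed", **node.on_error(exc)}


verify = NodeDef(name="verify_service", run=lambda p: {"checked_host": p["host"]})
result = execute(verify, {"host": "dr-host-1"})
```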
In order to meet the service requirements of the disaster recovery plan, the execution sequence among the nodes is controlled by arrow connection and recording the association relation of the upstream node and the downstream node. However, in order to ensure the scheduling efficiency and scheduling judgment logic, the parallel arrangement and conditional branch arrangement of the workflow need to be controlled by the gateway node.
In one embodiment, FIG. 5 is a schematic diagram of a disaster recovery workflow generation provided in accordance with a third embodiment of the present invention.
Wherein the gateway node definition is as shown in table 1:
TABLE 1
According to the service requirements of disaster recovery plan workflows of different fault scenes of the service system and the definition of nodes, any workflow of a directed acyclic graph with one inlet and one outlet can be arranged in a drag mode in a programming canvas, and the execution sequence is determined through unidirectional arrow connection. And through the capabilities of the node gateways of various types in the graph, a plurality of scheduling models of sequential serial and parallel initiation and synchronous waiting for execution completion, conditional branch scheduling and event triggering scheduling are provided, and the performance of scheduling execution is improved on the premise of meeting the requirements of different service scenes.
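The structural constraint described above (a directed acyclic graph with one inlet and one outlet) can be checked with a short sketch. The edge-list representation and function name are assumptions; cycle detection relies on the standard-library `graphlib` module.

```python
from graphlib import CycleError, TopologicalSorter


def is_valid_workflow(edges) -> bool:
    """Check that the arrows form a DAG with exactly one inlet and one outlet.

    `edges` is a list of (upstream, downstream) node-name pairs, one per arrow.
    """
    nodes = {n for edge in edges for n in edge}
    inlets = nodes - {dst for _, dst in edges}    # nodes with no incoming arrow
    outlets = nodes - {src for src, _ in edges}   # nodes with no outgoing arrow
    predecessors = {n: set() for n in nodes}
    for src, dst in edges:
        predecessors[dst].add(src)
    try:
        TopologicalSorter(predecessors).prepare()  # raises CycleError on a cycle
    except CycleError:
        return False
    return len(inlets) == 1 and len(outlets) == 1


# Hypothetical workflow: one start node, two parallel branches, one end node.
valid = is_valid_workflow([
    ("start", "restore_db"), ("start", "restore_files"),
    ("restore_db", "verify"), ("restore_files", "verify"),
])
```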
In one embodiment, for a service system supporting large-scale informatization services, the system architecture, the deployment, and the application components used to build the service system are very complex and varied. System architectures are currently being upgraded from single-node or dual-machine architectures to distributed and clustered architectures with better performance and safety; at the deployment level, service systems differ greatly, including physical server deployment, virtualized deployment, cloud-native deployment, and the like; the application components that build the service system typically include application software, load balancers, middleware, unstructured and structured databases for persisting data, operating systems, and the like.
In one embodiment, to save disaster recovery costs, the disaster recovery resources are not provisioned 1:1 with the production side.
In one embodiment, FIG. 6 is a schematic diagram of managing disaster recovery plans according to a hierarchy provided in accordance with a third embodiment of the invention. If the whole service system fails, the whole service system is switched; if a single or some service system components fail, only those components are switched as required. Different service systems, different fault scenarios, and faults of different granularity therefore each have an adapted disaster recovery plan, so the number of disaster recovery plan entries is very large for service systems supporting large-scale informatization services.
Because the disaster recovery plans are numerous, they need to be grouped and managed in an orderly manner by service system, by hierarchy, and by fault scenario, which reduces management complexity and keeps management clear. When a certain fault occurs, an adapted disaster recovery plan can be quickly selected to cope with it.
In one embodiment, fig. 7 is a schematic diagram of a workflow node binding according to a third embodiment of the present invention. In an embodiment, node function configurations may be managed independently and reused across different workflows: among the workflows corresponding to the disaster recovery plans for service switching, simulation drill, actual combat drill, and service return in different fault scenarios of the same application system, the workflow flows and the combination of link functions in them differ, but many of the nodes implementing the link functions are the same across workflows. Therefore, to avoid repeated workflow orchestration work and to make orchestration more convenient, node configurations need to be managed independently and bound during workflow orchestration.
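Independent node management with binding can be sketched as a small registry from which workflows reference shared node configurations. The `NodeLibrary` class, node identifiers, and configuration fields are hypothetical.

```python
class NodeLibrary:
    """Manages node configurations independently of any workflow."""

    def __init__(self):
        self._configs = {}

    def register(self, node_id: str, config: dict) -> None:
        self._configs[node_id] = config

    def bind(self, node_ids: list) -> list:
        """Bind existing node configurations into a workflow by reference."""
        return [self._configs[node_id] for node_id in node_ids]


library = NodeLibrary()
library.register("restore_mysql", {"script": "restore_mysql.sh"})
library.register("start_services", {"script": "start_services.sh"})
library.register("cleanup", {"script": "cleanup.sh"})

# Two workflows reuse the same restore node instead of redefining it.
takeover_workflow = library.bind(["restore_mysql", "start_services"])
drill_workflow = library.bind(["restore_mysql", "start_services", "cleanup"])
```

Because both workflows hold a reference to the same configuration object in this sketch, a later change to the shared node configuration is visible in every workflow bound to it.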
In one embodiment, FIG. 8 is a schematic diagram of a disaster recovery workflow replication provided in accordance with a third embodiment of the present invention. In general, among the workflows corresponding to the disaster recovery plans for service switching, simulation drill, actual combat drill, and service return in different fault scenarios of the same application system, the flows are approximately the same; re-orchestrating similar workflows from scratch repeats work, which is laborious and error-prone. Therefore, an existing workflow can be copied and only some node configurations modified, quickly generating the disaster recovery plans for other scenarios. This provides the capability of automatically orchestrating and generating workflows and improves orchestration efficiency.
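Workflow replication with partial modification can be sketched as a deep copy followed by per-node overrides. The dictionary layout of a workflow and all names below are assumptions for illustration.

```python
import copy


def clone_workflow(workflow: dict, new_name: str, overrides: dict) -> dict:
    """Copy an existing workflow and modify only the listed node configurations.

    `overrides` maps node names to the configuration fields to change.
    The original workflow is left untouched.
    """
    clone = copy.deepcopy(workflow)
    clone["name"] = new_name
    for node_name, changes in overrides.items():
        clone["nodes"][node_name].update(changes)
    return clone


takeover = {
    "name": "service_takeover",
    "nodes": {
        "select_resource": {"resource": "dr-production-pool"},
        "restore_data": {"script": "restore.sh"},
    },
}
drill = clone_workflow(
    takeover,
    "simulation_drill",
    {"select_resource": {"resource": "dr-drill-pool"}},
)
```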
In one embodiment, the present invention is described in detail below by taking a warehouse management system (Warehouse Management System, WMS) in the manufacturing industry as an example, but the scope of the present invention is not limited to the following embodiments.
The WMS system comprises the following components: application software-wms, middleware-nginx, elasticsearch, redis, file server, relational database-mysql.
In disaster recovery plan grouping management, the primary catalog may be defined as the DRP-WMS system, and its secondary catalogs are defined as DRP-application software-WMS, DRP-middleware-nginx, DRP-elasticsearch, DRP-redis, DRP-file server, and DRP-relational database-mysql, respectively.
In the primary and secondary catalogs of the grouping management, workflows named for service takeover, simulation drill, actual combat drill, and service return are defined under each DRP-x, so as to correspond to the disaster recovery plans of the different fault scenarios of the WMS system.
The data protection technologies of the WMS system are determined according to its RTO and RPO requirements: a continuous data protection (Continuous Data Protection, CDP) technology based on log analysis is adopted for the database, which meets the strict RTO requirement in addition to the data consistency requirement; a whole-machine CDP technology is adopted for redis; and a virtualized data backup technology is adopted for the other application system components, according to their data change volume.
The platform required for disaster recovery, also called the disaster recovery resource, is determined; here a VMWare virtualization platform is used as the disaster recovery resource. Using the data sandbox technology that is popular in the industry, an isolated environment is deployed on the VMWare virtualization platform for service takeover and drills. This mainly avoids IP conflicts between the disaster recovery machine and the production machine during drills, and avoids an overly long takeover time: without the sandbox, the IP, gateway, and subnet mask of the disaster recovery machine would not match the gateway and subnet mask of the virtual switch on the VMWare virtualization platform during takeover, so the IP, gateway, and subnet mask of the disaster recovery machine, as well as the IPs in the configuration files of the application components deployed on it, would have to be modified. Data protection tasks are then created in the disaster recovery system for each application component of the WMS system.
The steps are described below taking the disaster recovery plans of the whole WMS system for service takeover, simulation drill, actual combat drill, and service return as examples; the disaster recovery plans of individual application components can follow the same approach.
For the service takeover of the whole WMS system, the orchestrated workflow has the following node functions and scheduling sequence: disaster recovery resources are selected and attached to a data sandbox, and the data of application software-wms, middleware-nginx, elasticsearch, redis, the file server, and relational database-mysql is recovered. The application component services are pulled up in turn, and the service availability of each application component is verified. The IP of the Nginx host that serves the WMS system externally from inside the data sandbox is mapped to an externally accessible IP through network address translation (Network Address Translation, NAT). The DNS-to-IP mapping of the WMS system is switched from DNS -> original production machine Nginx IP to DNS -> mapped IP in the data sandbox. A disaster recovery report is generated to record the method, execution, and results of the disaster recovery plan for later traceability.
For the simulation drill of the whole WMS system, the orchestrated workflow has the following node functions and scheduling sequence: disaster recovery drill resources are selected and attached to a data sandbox, and the data of application software-wms, middleware-nginx, elasticsearch, redis, the file server, and relational database-mysql is recovered. The application component services are pulled up in turn. The IP of the Nginx host that serves the WMS system externally from inside the data sandbox is mapped to an externally accessible IP through NAT. The DNS-to-IP mapping of the WMS system is switched from DNS -> original production machine Nginx IP to DNS -> mapped IP in the data sandbox. The https response status of the externally provided access is verified. The drill resources are cleaned up. A disaster recovery drill report is generated to record the method, execution, and results of the drill for later traceability. A policy is added so that the workflow executes automatically.
For the service return of the whole WMS system, the orchestrated workflow has the following node functions and scheduling sequence: the data is migrated back. The application component services are pulled up in turn. The https response status of the externally provided access is verified. The DNS-to-IP mapping of the WMS system is switched from DNS -> mapped IP in the data sandbox to DNS -> original production machine Nginx IP. A return migration report is generated for later traceability.
When different disaster scenarios occur, disaster recovery plans of different granularity can be selected for emergency service takeover. By analyzing the fault scenarios of the service system and using the flexible capability of the automated orchestration engine, matching disaster recovery plans are orchestrated in advance for the various fault scenarios of the service system. The parts common to different disaster recovery plans are handled once during orchestration, which improves orchestration efficiency, and the large number of disaster recovery plans is classified and managed hierarchically, which reduces the difficulty of managing the plans after orchestration.
Example IV
Fig. 9 is a schematic structural diagram of a disaster recovery plan management device according to a fourth embodiment of the present invention. As shown in fig. 9, the apparatus includes: a plan generation module 91, a workflow generation module 92 and a workflow management module 93.
The plan generating module 91 is configured to generate a corresponding disaster recovery plan according to a service requirement of a service system to be recovered, and determine a workflow node corresponding to the disaster recovery plan; wherein the disaster recovery plan includes at least workflow description information and disaster recovery configuration.
The workflow generation module 92 is configured to generate a disaster recovery workflow according to the configuration attribute parameters of the corresponding workflow nodes in the disaster recovery plan and the disaster recovery configuration.
The workflow management module 93 is configured to manage the disaster recovery workflow according to a hierarchical structure of the service system to be recovered.
In this embodiment of the invention, the plan generation module generates a corresponding disaster recovery plan according to the service requirement of the service system to be recovered and determines the workflow nodes corresponding to the disaster recovery plan; the workflow generation module generates a disaster recovery workflow according to the configuration attribute parameters of the corresponding workflow nodes in the disaster recovery plan and the disaster recovery configuration; and the workflow management module manages the disaster recovery workflow according to the hierarchical structure of the service system to be recovered. The disaster recovery plan is thus managed with the disaster recovery workflow as its carrier, and the disaster recovery workflows are managed according to the hierarchical structure of the service system to be recovered. As a result, suitable disaster recovery plans are available for different service systems to be recovered, different fault scenarios, and faults of different granularities, so that when a fault occurs a suitable disaster recovery plan can be selected quickly to deal with it.
In one embodiment, the plan generation module 91 includes:
the requirement receiving unit is used for receiving the service requirement of the service system to be recovered, and generating the workflow description information and the disaster recovery configuration according to the system function points to be recovered in the service requirement;
wherein the workflow description information comprises at least: the flow logic, execution order, and execution task information among the workflow nodes; and the disaster recovery configuration is the disaster recovery capability configured for the disaster recovery plan;
and the node determining unit is used for determining the recovery step corresponding to each system function point to be recovered in the workflow description information as a workflow node.
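The two units above can be pictured with a small sketch in which each function point to be recovered yields one recovery step, and each recovery step becomes a workflow node. The class `DisasterRecoveryPlan`, the function `generate_plan`, the dictionary keys, and the `rpo_minutes` setting are all hypothetical names chosen for illustration, not structures defined by the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DisasterRecoveryPlan:
    workflow_description: List[Dict[str, str]]  # one entry per recovery step
    dr_configuration: Dict[str, str]            # disaster recovery configuration
    workflow_nodes: List[str] = field(default_factory=list)


def generate_plan(
    function_points: List[str], dr_configuration: Dict[str, str]
) -> DisasterRecoveryPlan:
    """Build a plan with one recovery step per function point, in input order."""
    description = [
        {"step": f"recover_{point}", "order": str(i), "task": f"restore {point}"}
        for i, point in enumerate(function_points, start=1)
    ]
    # Each recovery step in the description is taken as a workflow node.
    nodes = [entry["step"] for entry in description]
    return DisasterRecoveryPlan(description, dict(dr_configuration), nodes)


plan = generate_plan(["mysql", "redis", "nginx"], {"rpo_minutes": "15"})
```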
In one embodiment, the workflow generation module 92 includes:
the information extraction unit is used for extracting the configuration attribute parameters of the workflow nodes in the disaster recovery plan and the disaster recovery configuration, wherein the configuration attribute parameters comprise at least the flow logic, execution order, and execution task information among the workflow nodes;
the node configuration unit is used for configuring each workflow node according to the configuration script, the node input parameters, and the node output parameters in the execution task information;
and the workflow generating unit is used for connecting the workflow nodes according to the flow logic and the execution order among the workflow nodes to generate a disaster recovery workflow, and setting the disaster recovery configuration for the disaster recovery workflow.
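A minimal sketch of these three units, assuming a purely sequential flow, might look as follows. The names `WorkflowNode` and `build_workflow` and the dictionary keys (`name`, `order`, `script`, `inputs`, `outputs`) are hypothetical; gateway handling for non-sequential flow logic is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class WorkflowNode:
    name: str
    script: str                                   # configuration script to run
    inputs: Dict[str, str] = field(default_factory=dict)
    outputs: List[str] = field(default_factory=list)


def build_workflow(task_info: List[dict], dr_configuration: dict) -> dict:
    """Configure nodes from task information and chain them by execution order."""
    ordered = sorted(task_info, key=lambda task: task["order"])
    nodes = [
        WorkflowNode(
            task["name"],
            task["script"],
            dict(task.get("inputs", {})),
            list(task.get("outputs", [])),
        )
        for task in ordered
    ]
    # Sequential flow only: each node is connected to its successor.
    edges = [(a.name, b.name) for a, b in zip(nodes, nodes[1:])]
    return {"nodes": nodes, "edges": edges, "dr_configuration": dict(dr_configuration)}
```

For example, two tasks supplied out of order are sorted by their `order` value before being connected, so the resulting edge list reflects the execution order rather than the input order.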
In an embodiment, the workflow generating unit is specifically configured to:
determining the gateway node corresponding to each workflow node according to the flow logic among the workflow nodes; wherein the gateway node comprises at least one of: a parallel gateway node, a parallel synchronous gateway node, a conditional gateway node, a multi-path selection gateway node, and an event gateway node;
and connecting each workflow node with the gateway node corresponding to the workflow node according to the execution sequence to generate a disaster recovery workflow.
In an embodiment, the workflow generating unit is further configured to:
when the flow logic between at least two workflow nodes is parallel, determining the gateway node corresponding to the parallel workflow nodes as a parallel gateway node;
when the flow logic between at least two workflow nodes is parallel and the next workflow node is entered after the parallel workflow nodes have all finished executing, determining the gateway node corresponding to the workflow nodes as a parallel synchronous gateway node;
when at least two workflow nodes are each associated with a first execution condition and the workflow node that satisfies its first execution condition is executed, determining the gateway node corresponding to the workflow nodes as a conditional gateway node; wherein the first execution conditions are mutually exclusive;
when at least two workflow nodes are each associated with a second execution condition and the workflow nodes that satisfy their second execution conditions are executed, determining the gateway node corresponding to the workflow nodes as a multi-path selection gateway node; wherein the second execution conditions are not mutually exclusive;
when a workflow node is executed upon receiving a preset instruction, determining the gateway node corresponding to the workflow node as an event gateway node.
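The five cases above amount to a small decision function. The sketch below is one possible reading, assuming the flow logic has already been reduced to a few flags; `GatewayType`, `choose_gateway`, and the parameter names are hypothetical, and the precedence given to the event case is a choice made for this example only.

```python
from enum import Enum
from typing import Optional


class GatewayType(Enum):
    PARALLEL = "parallel"
    PARALLEL_SYNC = "parallel_synchronous"
    CONDITIONAL = "conditional"
    MULTI_SELECT = "multi_path_selection"
    EVENT = "event"


def choose_gateway(
    parallel: bool = False,
    wait_for_all: bool = False,
    conditions_exclusive: Optional[bool] = None,
    event_triggered: bool = False,
) -> GatewayType:
    """Pick a gateway type from a simplified description of the flow logic.

    `conditions_exclusive` is None when the branches carry no conditions.
    """
    if event_triggered:
        return GatewayType.EVENT
    if parallel:
        # A synchronizing gateway is needed when the next node must wait
        # for every parallel branch to finish.
        return GatewayType.PARALLEL_SYNC if wait_for_all else GatewayType.PARALLEL
    if conditions_exclusive is True:
        return GatewayType.CONDITIONAL
    if conditions_exclusive is False:
        return GatewayType.MULTI_SELECT
    raise ValueError("flow logic does not map to a known gateway type")
```

Raising an error for unmatched flow logic, rather than defaulting to some gateway, makes an incomplete plan description visible at workflow generation time.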
In one embodiment, the workflow management module 93 includes:
the workflow extraction unit is used for determining the hierarchical structure of the service system to be recovered and extracting disaster recovery workflows belonging to the same service system to be recovered;
and the workflow management unit is used for storing and managing the disaster recovery workflows belonging to the same service system to be recovered according to the hierarchical structure of the corresponding service system to be recovered.
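One way to picture hierarchical storage is a catalog keyed by the path of a component within the service system, so that workflows can be looked up at any granularity. This is a minimal sketch under that assumption; `WorkflowCatalog`, its methods, and the example paths are hypothetical and not taken from the patent.

```python
from typing import Dict, List, Tuple

Path = Tuple[str, ...]  # e.g. ("WMS", "middleware", "nginx")


class WorkflowCatalog:
    """Stores workflow names under the hierarchy path they belong to."""

    def __init__(self) -> None:
        self._by_path: Dict[Path, List[str]] = {}

    def store(self, path: Path, workflow: str) -> None:
        self._by_path.setdefault(path, []).append(workflow)

    def find(self, prefix: Path) -> List[str]:
        """Return workflows stored at `prefix` or anywhere beneath it, sorted."""
        return sorted(
            name
            for path, names in self._by_path.items()
            if path[: len(prefix)] == prefix
            for name in names
        )
```

Querying with a short prefix such as `("WMS",)` returns every workflow of that system, while a longer prefix narrows the result to a single layer or component.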
In one embodiment, the disaster recovery plan management device further comprises:
and the node binding module is used for storing each configured workflow node and binding the workflow node to each corresponding disaster recovery workflow.
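Storing configured nodes separately from the workflows that use them allows one node to be bound to several workflows. The following sketch shows one hypothetical shape for such a registry; `NodeRegistry` and its method names are assumptions made for illustration.

```python
from typing import Dict, List


class NodeRegistry:
    """Keeps configured workflow nodes and their bindings to workflows."""

    def __init__(self) -> None:
        self._nodes: Dict[str, dict] = {}
        self._bindings: Dict[str, List[str]] = {}

    def save_node(self, name: str, config: dict) -> None:
        # Store a copy so later changes by the caller do not leak in.
        self._nodes[name] = dict(config)

    def bind(self, workflow: str, node_name: str) -> None:
        if node_name not in self._nodes:
            raise KeyError(f"node {node_name!r} has not been stored")
        self._bindings.setdefault(workflow, []).append(node_name)

    def nodes_for(self, workflow: str) -> List[dict]:
        """Return the stored configurations bound to `workflow`, in bind order."""
        return [self._nodes[name] for name in self._bindings.get(workflow, [])]
```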
The disaster recovery plan management device provided by the embodiment of the invention can execute the disaster recovery plan management method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example five
Fig. 10 is a schematic structural diagram of an electronic device implementing a disaster recovery plan management method according to an embodiment of the present invention. The electronic device is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 10, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as the disaster recovery plan management method.
In some embodiments, the disaster recovery plan management method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the disaster recovery plan management method described above may be performed. Alternatively, in other embodiments, processor 11 may be configured to perform the disaster recovery plan management method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Computer programs for carrying out the methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. A computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of the flows shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method of disaster recovery plan management, comprising:
generating a corresponding disaster recovery plan according to service demands of a service system to be recovered, and determining workflow nodes corresponding to the disaster recovery plan; wherein the disaster recovery plan includes at least workflow description information and disaster recovery configuration;
generating a disaster recovery workflow according to the configuration attribute parameters corresponding to the workflow nodes in the disaster recovery plan and the disaster recovery configuration;
and managing the disaster recovery workflow according to the hierarchical structure of the service system to be recovered.
2. The method of claim 1, wherein the generating a corresponding disaster recovery plan according to the service requirements of the service system to be recovered, and determining the workflow node corresponding to the disaster recovery plan, comprises:
receiving the service requirement of the service system to be recovered, and generating the workflow description information and the disaster recovery configuration according to the system function points to be recovered in the service requirement;
wherein the workflow description information at least comprises: the flow logic, execution order, and execution task information among the workflow nodes; and the disaster recovery configuration is the disaster recovery capability configured for the disaster recovery plan;
and determining a recovery step corresponding to the system function point to be recovered in the workflow description information as the workflow node.
3. The method of claim 1, wherein the generating a disaster recovery workflow in accordance with the configuration attribute parameters for the workflow nodes within the disaster recovery plan and the disaster recovery configuration comprises:
extracting the configuration attribute parameters of the workflow nodes in the disaster recovery plan and the disaster recovery configuration, wherein the configuration attribute parameters comprise at least the flow logic, execution order, and execution task information among the workflow nodes;
configuring each workflow node according to the configuration script, the node input parameters, and the node output parameters in the execution task information;
and connecting the workflow nodes according to the flow logic among the workflow nodes and the execution order to generate a disaster recovery workflow, and setting the disaster recovery configuration for the disaster recovery workflow.
4. The method of claim 3, wherein the connecting the workflow nodes according to the flow logic among the workflow nodes and the execution order to generate a disaster recovery workflow comprises:
determining the gateway node corresponding to each workflow node according to the flow logic among the workflow nodes; wherein the gateway node comprises at least one of: a parallel gateway node, a parallel synchronous gateway node, a conditional gateway node, a multi-path selection gateway node, and an event gateway node;
and connecting each workflow node with the gateway node corresponding to the workflow node according to the execution order to generate the disaster recovery workflow.
5. The method of claim 4, wherein determining the gateway node corresponding to each of the workflow nodes according to the flow logic between the workflow nodes comprises:
when the flow logic between at least two workflow nodes is parallel, determining the gateway node corresponding to the parallel workflow nodes as a parallel gateway node;
when the flow logic between at least two workflow nodes is parallel and the next workflow node is entered after the parallel workflow nodes have all finished executing, determining the gateway node corresponding to the workflow nodes as a parallel synchronous gateway node;
when at least two workflow nodes are each associated with a first execution condition and the workflow node that satisfies its first execution condition is executed, determining the gateway node corresponding to the workflow nodes as a conditional gateway node; wherein the first execution conditions are mutually exclusive;
when at least two workflow nodes are each associated with a second execution condition and the workflow nodes that satisfy their second execution conditions are executed, determining the gateway node corresponding to the workflow nodes as a multi-path selection gateway node; wherein the second execution conditions are not mutually exclusive;
and when a workflow node is executed upon receiving a preset instruction, determining the gateway node corresponding to the workflow node as an event gateway node.
6. The method of claim 1, wherein managing the disaster recovery workflow in accordance with a hierarchy of the business system to be recovered comprises:
determining the hierarchical structure of the service system to be recovered, and extracting the disaster recovery workflow belonging to the same service system to be recovered;
and storing and managing the disaster recovery workflows belonging to the same service system to be recovered according to the hierarchical structure of the corresponding service system to be recovered.
7. The method according to claim 1, characterized in that the method further comprises:
storing each configured workflow node, and binding the workflow node with each corresponding disaster recovery workflow.
8. A disaster recovery plan management system, comprising:
a plan generation module, configured to generate a corresponding disaster recovery plan according to the service requirement of a service system to be recovered and to determine the workflow nodes corresponding to the disaster recovery plan; wherein the disaster recovery plan includes at least workflow description information and disaster recovery configuration;
a workflow generation module, configured to generate a disaster recovery workflow according to the configuration attribute parameters corresponding to the workflow nodes in the disaster recovery plan and the disaster recovery configuration;
and the workflow management module is used for managing the disaster recovery workflow according to the hierarchical structure of the service system to be recovered.
9. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the disaster recovery plan management method of any of claims 1-7.
10. A computer readable storage medium storing computer instructions for causing a processor to implement the disaster recovery plan management method of any one of claims 1-7 when executed.
CN202311780719.7A 2023-12-22 2023-12-22 Disaster recovery plan management method, system, electronic equipment and storage medium Pending CN117743033A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311780719.7A CN117743033A (en) 2023-12-22 2023-12-22 Disaster recovery plan management method, system, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311780719.7A CN117743033A (en) 2023-12-22 2023-12-22 Disaster recovery plan management method, system, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117743033A true CN117743033A (en) 2024-03-22

Family

ID=90254377

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311780719.7A Pending CN117743033A (en) 2023-12-22 2023-12-22 Disaster recovery plan management method, system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117743033A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination