CN116193384A - Disaster recovery switching method, system, electronic equipment and storage medium - Google Patents


Info

Publication number
CN116193384A
Authority
CN
China
Prior art keywords
disaster recovery
network element
switching
workflow
main network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111421670.7A
Other languages
Chinese (zh)
Inventor
孙勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to CN202111421670.7A
Priority to PCT/CN2022/126000 (WO2023093379A1)
Publication of CN116193384A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1448Management of the data involved in backup or backup restore
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00Routing or path finding of packets in data switching networks
    • H04L45/24Multipath
    • H04L45/247Multipath using M:N active or standby paths
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W24/00Supervisory, monitoring or testing arrangements
    • H04W24/04Arrangements for maintaining operational condition
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W28/00Network traffic management; Network resource management
    • H04W28/02Traffic management, e.g. flow control or congestion control
    • H04W28/04Error control
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W4/00Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/20Services signaling; Auxiliary data signalling, i.e. transmitting data via a non-traffic channel
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W4/00Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/50Service provisioning or reconfiguring
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W4/00Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/90Services for handling of emergency or hazardous situations, e.g. earthquake and tsunami warning systems [ETWS]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W8/00Network data management
    • H04W8/18Processing of user or subscriber data, e.g. subscribed services, user preferences or user profiles; Transfer of user or subscriber data
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A10/00TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
    • Y02A10/40Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Emergency Management (AREA)
  • Environmental & Geological Engineering (AREA)
  • Public Health (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention relates to the field of communications technologies, and in particular to a disaster recovery switching method, a system, an electronic device, and a storage medium. The disaster recovery switching method comprises the following steps: acquiring disaster recovery monitoring data of a main network element; processing the disaster recovery monitoring data by using a preset disaster recovery decision model established based on a decision tree to generate a disaster recovery decision instruction; when the disaster recovery decision instruction is a disaster recovery switching trigger, acquiring a disaster recovery switching workflow of the main network element from a preset workflow library; and running the disaster recovery switching workflow to complete the disaster recovery switching of the main network element. The method enables automatic disaster recovery switching and improves the speed and accuracy of judging whether the main network element should be switched. It thereby addresses the technical problem in the prior art that manual disaster recovery switching, given the complexity of the switching commands and human factors, often takes a long time and is error-prone.

Description

Disaster recovery switching method, system, electronic equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of communication, in particular to a disaster recovery switching method, a system, electronic equipment and a storage medium.
Background
With the development of 5G technology, users' requirements for communication quality keep rising, and operators are expected to migrate services quickly when abnormal conditions occur. Disaster recovery switching of a network element means switching the services of an object to its standby object when the object becomes abnormal; the goal is to transfer the services smoothly and successfully so that the impact on users is as small as possible.
However, the backup strategies currently used when deploying network elements for disaster recovery switching include 1+1 master/standby, 1+1 mutual backup, pool mutual backup, and N+1 master/standby. When a network element becomes abnormal, the switching process has to be carried out manually and requires frequent human-machine interaction throughout; because of the complexity of the disaster recovery switching commands and human factors, the operation often takes a long time and is error-prone.
Disclosure of Invention
The main objective of the embodiments of the present application is to provide a disaster recovery switching method, a system, an electronic device, and a storage medium. The method aims to realize automatic disaster recovery switching and improve the speed and accuracy of judging the disaster recovery switching of the main network element.
In order to achieve the above objective, an embodiment of the present application provides a disaster recovery switching method, including: acquiring disaster recovery monitoring data of a main network element; processing the disaster recovery monitoring data by using a preset disaster recovery decision model established based on a decision tree to generate a disaster recovery decision instruction; when the disaster recovery decision instruction is disaster recovery switching trigger, acquiring a disaster recovery switching workflow of the main network element from a preset workflow library; and operating the disaster recovery switching workflow to finish the disaster recovery switching of the main network element.
In order to achieve the above objective, an embodiment of the present application further provides a disaster recovery switching system, including: the first acquisition module is used for acquiring disaster recovery monitoring data of the main network element; the decision module is used for processing the disaster recovery monitoring data by utilizing a preset disaster recovery decision model established based on a decision tree to generate a disaster recovery decision instruction; the second obtaining module is used for obtaining the disaster recovery switching workflow of the main network element from a preset workflow library when the disaster recovery decision instruction is disaster recovery switching trigger; and the switching module is used for operating the disaster recovery switching workflow and finishing the disaster recovery switching of the main network element.
To achieve the above object, an embodiment of the present application further provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor, where the instructions are executed by the at least one processor, so that the at least one processor can execute the disaster recovery switching method.
In order to achieve the above objective, an embodiment of the present application further provides a computer readable storage medium storing a computer program, where the computer program implements the disaster recovery switching method described above when executed by a processor.
In the disaster recovery switching method provided by the application, when disaster recovery switching is performed on a main network element, disaster recovery monitoring data of the main network element are acquired; the disaster recovery monitoring data are processed by a preset disaster recovery decision model established based on a decision tree to generate a disaster recovery decision instruction; when the disaster recovery decision instruction is a disaster recovery switching trigger, a disaster recovery switching workflow of the main network element is acquired from a preset workflow library; and the disaster recovery switching workflow is run to complete the disaster recovery switching of the main network element. Because the decision model built on a decision tree judges the disaster recovery monitoring data of the main network element, and decision tree analysis can produce feasible, effective results from a data source in a relatively short time, the speed and accuracy of the disaster recovery switching judgment are improved. Meanwhile, the disaster recovery flow of the main network element is abstracted into steps according to the workflow principle; once the decision tree analysis triggers a disaster recovery operation, the workflow controls the whole process, so no human intervention is needed during switching and automatic disaster recovery switching is achieved. This addresses the technical problem in the prior art that manual disaster recovery switching, given the complexity of the switching commands and human factors, often takes a long time and is error-prone.
Drawings
FIG. 1 is a schematic diagram of an application environment according to an embodiment of the present application;
FIG. 2 is a flowchart of a disaster recovery switching method provided in an embodiment of the present application;
FIG. 3 is a flowchart of a method for generating a disaster recovery decision model in a disaster recovery switching method according to an embodiment of the present application;
FIG. 4 is a flowchart of a method for generating a disaster recovery switching workflow in the disaster recovery switching method according to the embodiment of the present application;
FIG. 5 is a flowchart of a disaster recovery switching method provided in an embodiment of the present application;
FIG. 6 is a flowchart of a disaster recovery switching method provided in an embodiment of the present application;
FIG. 7 is a schematic flow diagram of a disaster recovery switching system according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present application clearer, the embodiments are described in detail below with reference to the accompanying drawings. As those of ordinary skill in the art will appreciate, numerous technical details are set forth in the various embodiments to provide a better understanding of the present application; nevertheless, the technical solutions claimed in the present application can be implemented without these technical details and with various changes and modifications based on the following embodiments. The following embodiments are divided for convenience of description and should not be construed as limiting the specific implementation of the present application; the embodiments may be combined with and refer to one another where no contradiction arises.
The structure of the application environment of the present application is shown in FIG. 1, and specifically includes a disaster recovery management center, data centers, a main network element, a standby network element, and the like. The disaster recovery management center (Disaster Recovery Management Center, abbreviated as DRMC) supports disaster recovery management for all types of network elements and can cover various disaster recovery scenarios, such as 1+1 mutual backup, N+1 master backup, and POOL networking; its functions include, but are not limited to, automatic disaster recovery monitoring and discovery for the main network element and workflow-driven disaster recovery flow management. Data center (Data Center, abbreviated as DC): the operator divides the equipment in a management area and establishes a data center there so that the related equipment can be managed centrally; DCs that are regionally isolated from one another are the objects of disaster recovery. DC-A represents the data center of region A and DC-B represents the data center of region B. Both the main network element and the standby network element can be of two types, namely a virtual network element device instance (Virtual Network Function Instance, abbreviated as VNF) and a physical network element device instance (Physical Network Function Instance, abbreviated as PNF).
As shown in the figure, in a commercial environment the main and standby network elements, or the mutually backed-up network elements, are usually deployed in the DCs of two different regions, and the DRMC performs disaster recovery switching management on them. The DRMC is deployed independently, performs disaster recovery management on the main and standby network elements of the two DCs, and carries out the main/standby switch in an emergency so that users are affected as little as possible.
One embodiment of the present application relates to a disaster recovery switching method applied to a disaster recovery management center DRMC, as shown in FIG. 2, including:
Step 101, acquiring disaster recovery monitoring data of a main network element.
In an example implementation, because network element types differ, the key service indexes in the service index system differ from one type of main network element to another, and so does the basis for judging whether disaster recovery switching is needed.
In an example implementation, to acquire the disaster recovery monitoring data of a main network element, the identity of the main network element is obtained first, and the monitoring system corresponding to that identity is selected from a preset monitoring system library. The selected monitoring system, which covers the disaster recovery monitoring indexes for that type of main network element, then monitors each disaster recovery monitoring index of the network element, and the monitored index values together form the disaster recovery monitoring data of the main network element.
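The selection-then-monitoring procedure in step 101 can be sketched as follows. This is only an illustration under stated assumptions: the names `MONITORING_LIBRARY` and `collect_monitoring_data`, the mapping from identity to network element type, and the fixed probe values are all hypothetical and do not come from the application.

```python
from typing import Callable, Dict

# Hypothetical monitoring-system library: NE type -> {index name -> probe}.
# The probes return fixed values here; a real probe would query the network element.
MONITORING_LIBRARY: Dict[str, Dict[str, Callable[[], float]]] = {
    "CSCF": {
        "initial_registration_success_rate": lambda: 0.998,
        "refresh_registration_success_rate": lambda: 0.997,
        "network_connection_rate": lambda: 0.999,
        "cpu_utilization": lambda: 0.42,
    },
}


def collect_monitoring_data(ne_id: str, ne_types: Dict[str, str]) -> Dict[str, float]:
    """Select the monitoring system for the NE and sample every index in it."""
    ne_type = ne_types[ne_id]             # identity -> NE type (assumed lookup)
    probes = MONITORING_LIBRARY[ne_type]  # NE type -> monitoring system
    # The sampled index values together form the disaster recovery monitoring data.
    return {name: probe() for name, probe in probes.items()}


data = collect_monitoring_data("cscf-01", {"cscf-01": "CSCF"})
```

Keeping one set of probes per network element type mirrors the statement that each type has its own key service indexes.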
In an example implementation, taking a call session control function (Call Session Control Function, abbreviated as CSCF) network element as an example, the disaster recovery monitoring indexes of the CSCF network element include the initial registration success rate, the refresh registration success rate, the network connection rate, the Cx/Dx interface success rate, the bandwidth utilization rate, the central processing unit (Central Processing Unit, abbreviated as CPU) utilization rate, the memory utilization rate, the container database (Container DataBase, abbreviated as CDB) memory utilization rate, and the like. Each of these disaster recovery monitoring indexes is a key service index in the service index system of the CSCF network element.
Step 102, processing the disaster recovery monitoring data by using a preset disaster recovery decision model established based on the decision tree to generate a disaster recovery decision instruction.
In an example implementation, each type of main network element has its own disaster recovery decision model, which can be obtained directly from the network element identification of the main network element. The disaster recovery decision model is a decision analysis model established based on a decision tree, and each node in the model corresponds to one disaster recovery monitoring index of the main network element together with the disaster recovery switching condition of that index. The disaster recovery monitoring indexes of the nodes can also be used to build the monitoring system mentioned in step 101; that is, the types of disaster recovery monitoring indexes in the monitoring system of each main network element are consistent with those of the nodes in the disaster recovery decision model of that main network element.
In an example implementation, taking a CSCF network element as an example, the disaster recovery decision model of the CSCF network element is composed of the initial registration success rate (first decision node), the refresh registration success rate (second decision node), the network connection rate (third decision node), and so on, and the disaster recovery monitoring data of the CSCF network element obtained in step 101 include the same indexes. When the disaster recovery decision model processes the data, it first determines whether the initial registration success rate at the first decision node meets the preset registration success rate condition; if it does, processing moves to the second decision node, and so on until all decision nodes have been processed.
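The node-by-node evaluation can be sketched as a chain of checks. This is a minimal sketch, not the application's implementation: `DecisionNode`, `decide`, the instruction strings, and all thresholds are hypothetical. It follows the description that a node whose condition is met hands over to the next node, and that an unmet condition yields a switching trigger.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DecisionNode:
    index: str                           # disaster recovery monitoring index
    is_healthy: Callable[[float], bool]  # preset condition for that index


# Node order and thresholds are illustrative assumptions.
CSCF_MODEL: List[DecisionNode] = [
    DecisionNode("initial_registration_success_rate", lambda v: v >= 0.95),
    DecisionNode("refresh_registration_success_rate", lambda v: v >= 0.95),
    DecisionNode("network_connection_rate", lambda v: v >= 0.99),
]


def decide(model: List[DecisionNode], data: Dict[str, float]) -> str:
    """Walk the nodes in order; the first unmet condition triggers switching."""
    for node in model:
        if not node.is_healthy(data[node.index]):
            return "TRIGGER_SWITCH"
    return "NO_SWITCH"  # every node's condition was met


healthy = {
    "initial_registration_success_rate": 0.999,
    "refresh_registration_success_rate": 0.998,
    "network_connection_rate": 0.999,
}
degraded = dict(healthy, network_connection_rate=0.90)
```

With these inputs, `decide(CSCF_MODEL, healthy)` passes all three nodes, while `decide(CSCF_MODEL, degraded)` stops at the third node.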
Step 103, when the disaster recovery decision instruction is a disaster recovery switching trigger, acquiring a disaster recovery switching workflow of the main network element from a preset workflow library.
In an example implementation, when every disaster recovery monitoring index in the disaster recovery decision model meets its preset processing condition, the disaster recovery decision instruction output by the model is not to trigger disaster recovery switching, which indicates that the main network element is working well and no disaster recovery switching is needed. When any disaster recovery monitoring index in the model fails to meet its preset processing condition, the disaster recovery decision instruction output by the model is a disaster recovery switching trigger, which indicates that the main network element is working poorly or has failed and that disaster recovery switching is needed; in that case, the disaster recovery switching workflow of the main network element can be obtained from the preset workflow library according to the network element identification of the main network element.
Step 104, running the disaster recovery switching workflow to complete the disaster recovery switching of the main network element.
In an example implementation, after the disaster recovery switching workflow of the main network element is obtained, the workflow is run. The disaster recovery switching workflow consists of the following steps: checking before switching, releasing the call, switching, and stopping releasing the call. Checking before switching means performing state detection or switching confirmation on the main network element and its standby network element before the switch; the switching operation starts only after the state detection or switching confirmation passes. Releasing the call means that the main network element releases the services that are deployed and executing on it and initiates a scheduling request to the standby network element. Switching means that the standby network element, after receiving the scheduling request of the main network element, schedules the services of the main network element onto the standby network element. Stopping releasing the call means that the main network element, once it confirms that its own services have been scheduled to the standby network element, sends a stop-scheduling request to the standby network element. During disaster recovery switching, service scheduling to the main network element needs to be stopped.
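Steps 103 and 104 can be sketched as a lookup in a workflow library followed by running the four steps in order. Everything named here (`WORKFLOW_LIBRARY`, `make_step`, `run_switch`, the context dictionary) is hypothetical; the steps only record their names instead of talking to network elements, and stopping at the first failed step is an assumed policy that the application does not specify.

```python
from typing import Callable, Dict, List

Step = Callable[[dict], bool]  # a step returns True on success


def make_step(name: str) -> Step:
    """Build a placeholder step that only records that it ran."""
    def step(ctx: dict) -> bool:
        ctx["log"].append(name)
        return True
    return step


# Hypothetical workflow library keyed by network element identification.
WORKFLOW_LIBRARY: Dict[str, List[Step]] = {
    "cscf-01": [
        make_step(name)
        for name in ("pre-switch check", "release calls", "switch", "stop releasing calls")
    ],
}


def run_switch(ne_id: str, library: Dict[str, List[Step]]) -> dict:
    """Fetch the NE's switching workflow and run its steps in order."""
    ctx = {"ne_id": ne_id, "log": [], "completed": False}
    for step in library[ne_id]:
        if not step(ctx):
            return ctx  # assumed policy: stop at the first failed step
    ctx["completed"] = True
    return ctx


result = run_switch("cscf-01", WORKFLOW_LIBRARY)
```

Representing the workflow as an ordered list of callables keeps the whole switch under program control, which is the point of removing manual intervention.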
In the embodiment of the application, when disaster recovery switching is performed on a main network element, disaster recovery monitoring data of the main network element are acquired; the disaster recovery monitoring data are processed by a preset disaster recovery decision model established based on a decision tree to generate a disaster recovery decision instruction; when the disaster recovery decision instruction is a disaster recovery switching trigger, a disaster recovery switching workflow of the main network element is acquired from a preset workflow library; and the disaster recovery switching workflow is run to complete the disaster recovery switching of the main network element. Because the decision model built on a decision tree judges the disaster recovery monitoring data of the main network element, and decision tree analysis can produce feasible, effective results from a data source in a relatively short time, the speed and accuracy of the disaster recovery switching judgment are improved. Meanwhile, the disaster recovery flow of the main network element is abstracted into steps according to the workflow principle; once the decision tree analysis triggers a disaster recovery operation, the workflow controls the whole process, so no human intervention is needed during switching and automatic disaster recovery switching is achieved. This addresses the technical problem in the prior art that manual disaster recovery switching, given the complexity of the switching commands and human factors, often takes a long time and is error-prone.
One embodiment of the present application relates to a method for generating the disaster recovery decision model used in the disaster recovery switching method, which is applied to a disaster recovery management center DRMC, as shown in FIG. 3, and includes:
Step 201, obtaining a disaster recovery monitoring data sample of a main network element, wherein the disaster recovery monitoring data sample comprises each disaster recovery monitoring index.
In an example implementation, the disaster recovery monitoring data sample consists of at least one piece of historical disaster recovery monitoring data, and each piece of historical disaster recovery monitoring data includes all disaster recovery monitoring indexes of the main network element; that is, the historical disaster recovery monitoring data are composed of the key service indexes (i.e. the disaster recovery monitoring indexes) in the service index system of the main network element.
Step 202, calculating the basic entropy of the disaster recovery monitoring data sample and the characteristic entropy of each disaster recovery monitoring index.
In an example implementation, the basic entropy $H(X)$ of the disaster recovery monitoring data sample represents the degree of disorder of the sample and is calculated as $H(X) = -\sum_{i} P(x_i)\log_2 P(x_i)$, where $P(x_i)$ denotes the probability of occurrence of the i-th disaster recovery monitoring index in the plurality of historical disaster recovery monitoring data.
In an example implementation, taking CSCF network elements as an example, one way of calculating the basic entropy $H(X)$ is as follows: in a CSCF disaster recovery monitoring data sample containing 500 pieces of historical disaster recovery monitoring data, disaster recovery was performed in 5 pieces and not performed in 495 pieces, so the basic entropy is $H(X) = -\frac{5}{500}\log_2\frac{5}{500} - \frac{495}{500}\log_2\frac{495}{500}$.
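As a numeric check of the CSCF example (5 pieces in which disaster recovery was performed and 495 in which it was not, out of 500), the basic entropy can be evaluated directly. The helper name `entropy` is an illustrative choice, not something defined in the application.

```python
import math
from typing import Sequence


def entropy(counts: Sequence[int]) -> float:
    """Shannon entropy in bits of a distribution given as outcome counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


# 5 pieces with disaster recovery performed, 495 without.
base_entropy = entropy([5, 495])
print(round(base_entropy, 4))  # 0.0808
```

The value is close to zero because the sample is heavily skewed towards records in which no disaster recovery took place.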
In an example implementation, the characteristic entropy of each disaster recovery monitoring index refers to the uncertainty of the occurrence of a disaster recovery switching event X when the disaster recovery monitoring index A is known. It is denoted $H(X|A)$ and is defined as the mathematical expectation, taken over the disaster recovery monitoring index A, of the entropy of the conditional probability distribution of X given A. The calculation formula is as follows:
(formula image BDA0003377623250000051, not reproduced in this text)
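The formula itself is only available as an image in the publication. Assuming it is the standard definition of conditional entropy, which matches the verbal description (the expectation over A of the entropy of X given A), it would read as follows; the summation indices $a$ and $i$ are notation chosen here for illustration.

```latex
H(X \mid A) = \sum_{a} P(A = a)\, H(X \mid A = a)
            = -\sum_{a} P(A = a) \sum_{i} P(x_i \mid A = a)\, \log_2 P(x_i \mid A = a)
```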
In an example implementation, taking a CSCF network element as an example, a CSCF disaster recovery monitoring data sample contains 500 pieces of historical disaster recovery monitoring data, of which 100 pieces have a CPU occupancy rate above 99%; among these 100 pieces, disaster recovery was performed in 30 and not performed in 70. The characteristic entropy with respect to the CPU occupancy rate is then calculated as:
(formula image BDA0003377623250000052, not reproduced in this text)
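Since the formula image is not available, only the part of the characteristic entropy that the stated numbers determine can be evaluated. The sketch below computes the entropy within the 100 pieces whose CPU occupancy rate is above 99% (30 with disaster recovery, 70 without) and its weight of 100/500. The contribution of the remaining 400 pieces is not given in the text, so it is deliberately left out; the helper `entropy` and the variable names are illustrative.

```python
import math
from typing import Sequence


def entropy(counts: Sequence[int]) -> float:
    """Shannon entropy in bits of a distribution given as outcome counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


# Pieces with CPU occupancy above 99%: 30 with disaster recovery, 70 without.
h_high_cpu = entropy([30, 70])
weight_high_cpu = 100 / 500  # share of such pieces in the whole sample

# Contribution of this branch only; the branch for the other 400 pieces
# cannot be computed from the figures given in the text.
partial_term = weight_high_cpu * h_high_cpu
```

Here `h_high_cpu` is about 0.881 bits, so this branch contributes about 0.176 bits to the characteristic entropy.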
Step 203, obtaining the information gain of each disaster recovery monitoring index according to the basic entropy and the characteristic entropy of each disaster recovery monitoring index.
In an example implementation, after the characteristic entropy $H(X|A)$ of each disaster recovery monitoring index and the basic entropy $H(X)$ of the disaster recovery monitoring data sample are obtained, the information gain of each disaster recovery monitoring index is calculated as the difference between the basic entropy and the characteristic entropy: $\mathrm{Gain}(X, A) = H(X) - H(X|A)$.
Step 204, sorting the disaster recovery monitoring indexes according to the information gains, and determining the positions of the decision tree nodes of the disaster recovery monitoring indexes according to the sorting results.
In an example implementation, after the information gain of each disaster recovery monitoring index is obtained, the disaster recovery monitoring indexes are ordered according to the magnitude of the information gain in order from large to small, the disaster recovery monitoring index with the largest information gain value is used as the first decision node of the decision tree, and the disaster recovery monitoring index with the smallest information gain value is used as the last decision node of the decision tree.
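Steps 203 and 204 can be illustrated on a toy sample. The four records below are made up for the illustration, and each index is assumed to be already discretised into "ok" or "bad"; the function names and the `switched` label are hypothetical as well.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List


def entropy(labels: List[bool]) -> float:
    """Shannon entropy in bits of a list of outcome labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(records: List[Dict], index: str, label: str = "switched") -> float:
    """Gain(X, A) = H(X) - H(X|A) for one monitoring index A."""
    base = entropy([r[label] for r in records])
    groups = defaultdict(list)
    for r in records:
        groups[r[index]].append(r[label])  # split the sample by the index value
    conditional = sum(len(g) / len(records) * entropy(g) for g in groups.values())
    return base - conditional


# Made-up sample: the network index fully explains the outcome, the CPU index does not.
records = [
    {"network": "bad", "cpu": "bad", "switched": True},
    {"network": "bad", "cpu": "ok", "switched": True},
    {"network": "ok", "cpu": "bad", "switched": False},
    {"network": "ok", "cpu": "ok", "switched": False},
]

# Largest gain first: this order gives the decision node positions.
ranked = sorted(["cpu", "network"], key=lambda i: information_gain(records, i), reverse=True)
```

In this sample the network index has a gain of 1 bit and the CPU index a gain of 0, so the network index would become the first decision node.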
Step 205, generating a disaster recovery decision model according to the decision tree node positions of the disaster recovery monitoring indexes and the disaster recovery monitoring indexes.
In an example implementation, after the position of the decision node of each disaster recovery monitoring index is determined, the disaster recovery decision model can be generated from the disaster recovery monitoring index of each decision node and the processing condition corresponding to that index. For example, if the disaster recovery monitoring index is the network connection rate and its value is less than 99%, the disaster recovery decision model generates a disaster recovery decision instruction that triggers disaster recovery switching; if the network connection rate is greater than or equal to 99%, the next decision node is processed.
With the method for generating the disaster recovery decision model, building on the other embodiments, a disaster recovery decision model can be generated for each main network element from its historical disaster recovery monitoring data samples, so that disaster recovery decision models and main network elements are in one-to-one correspondence; various types of main network elements can therefore be handled, which improves the generality of the application.
One embodiment of the present application relates to a method for generating a disaster recovery switching workflow used in a disaster recovery switching method, which is applied to a disaster recovery management center DRMC, as shown in fig. 3, and includes:
Step 301, obtaining network element configuration information and a network element backup policy of a main network element.
In an example implementation, when disaster recovery switching is performed on the main network element for the first time, or when the disaster recovery switching workflow of the main network element cannot be obtained from the workflow library, the network element configuration information and the network element backup policy of the main network element also need to be obtained. The network element configuration information refers to the basic information generated when the network element is configured. The network element backup policy refers to the backup mode of the main network element and the network element identifier of the standby network element; backup modes include 1+1 mutual backup, N+1 main/standby backup, POOL networking and the like.
Step 302, generating a disaster recovery switching workflow and a disaster recovery workflow according to the network element configuration information and the network element backup strategy, and storing the disaster recovery switching workflow and the disaster recovery workflow in a workflow library.
In an example implementation, a preset disaster recovery switching workflow template is filled in according to the network element configuration information and the network element backup policy, so as to generate the disaster recovery switching workflow of the network element.
In an example implementation, a preset disaster recovery workflow template may likewise be filled in according to the network element configuration information and the network element backup policy, so as to generate the disaster recovery workflow of the network element.
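Template completion of this kind can be sketched as simple placeholder substitution. The template steps, the field names `ne_id` and `standby_ne_id`, and the dictionary used as a workflow library are all hypothetical; the application does not specify the template format.

```python
# Hypothetical templates: each step names an action and a placeholder target.
SWITCH_TEMPLATE = [
    {"action": "stop_scheduling_to", "target": "{main}"},
    {"action": "release_services_on", "target": "{main}"},
    {"action": "schedule_services_to", "target": "{standby}"},
]
RECOVERY_TEMPLATE = [
    {"action": "schedule_services_to", "target": "{main}"},
]

def fill_template(template, values):
    """Return a concrete workflow with every placeholder replaced."""
    return [
        {"action": step["action"], "target": step["target"].format(**values)}
        for step in template
    ]

def generate_workflows(library, ne_config, backup_policy):
    """Generate both workflows for one main network element and store them."""
    values = {"main": ne_config["ne_id"], "standby": backup_policy["standby_ne_id"]}
    library[(ne_config["ne_id"], "switching")] = fill_template(SWITCH_TEMPLATE, values)
    library[(ne_config["ne_id"], "recovery")] = fill_template(RECOVERY_TEMPLATE, values)
    return library

workflow_library = generate_workflows(
    {},
    ne_config={"ne_id": "NE-A"},                               # assumed configuration shape
    backup_policy={"mode": "1+1", "standby_ne_id": "NE-B"},    # assumed policy shape
)
```

Keying the library by network element identifier and workflow kind lets later steps look up a stored workflow instead of regenerating it.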
In this embodiment, on the basis of other embodiments, the disaster recovery switching workflow and the disaster recovery workflow of each main network element can be generated according to its network element configuration information and network element backup policy, so that various types of main network elements can be processed, which improves the universality of the present application.
One embodiment of the present application relates to a disaster recovery switching method applied to a disaster recovery management center DRMC, as shown in fig. 5, including:
Step 401, obtaining disaster recovery monitoring data of a main network element.
In an example implementation, the step is substantially the same as step 101 in the embodiment of the present application, and is not described here in detail.
Step 402, processing the disaster recovery monitoring data by using a preset disaster recovery decision model established based on the decision tree, and generating a disaster recovery decision instruction.
In an example implementation, the step is substantially the same as step 102 in the embodiment of the present application, and is not described here in detail.
Step 403, when the disaster recovery decision instruction is a disaster recovery switching trigger, acquiring network element state data of the standby network element corresponding to the main network element.
In an example implementation, when the disaster recovery decision instruction output by the disaster recovery decision model is a disaster recovery switching trigger, the network element state data of every standby network element corresponding to the main network element is obtained according to the backup policy of the main network element and the network element identifiers of the standby network elements.
Step 404, detecting the network element state data by using a preset state detection model to obtain the network element state of the standby network element.
In an example implementation, the preset state detection model may be a decision tree model established according to the network element state indexes in the network element state data (constructed in the same way as the disaster recovery decision model), or any other model capable of judging state. The network element state data is detected by using the state detection model: when all the network element state indexes in the network element state data are normal, the network element state of the standby network element is normal; otherwise, the network element state of the standby network element is abnormal.
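The "all indexes normal" rule can be sketched as follows. This is a plain rule check rather than a decision tree, and the state index names and thresholds are hypothetical.

```python
def detect_ne_state(state_data, checks):
    """Return ("normal" | "abnormal", list of abnormal index names).

    The standby network element is normal only if every state index passes its check.
    """
    abnormal = [name for name, is_normal in checks.items() if not is_normal(state_data[name])]
    return ("normal" if not abnormal else "abnormal"), abnormal

# Hypothetical state indexes and thresholds for a standby network element.
checks = {
    "cpu_usage": lambda v: v < 0.80,
    "memory_usage": lambda v: v < 0.80,
    "link_up": lambda v: v is True,
}

state, failed = detect_ne_state(
    {"cpu_usage": 0.35, "memory_usage": 0.91, "link_up": True}, checks
)
```

Returning the list of abnormal indexes alongside the state is a design choice of this sketch; it gives an alarm something concrete to report when switching is refused.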
Step 405, when the network element state of the standby network element is that the network element is normal, acquiring the disaster recovery switching workflow of the main network element from a preset workflow library.
In an example implementation, disaster recovery switching can be performed on the main network element only when the network element state of the standby network element is normal, in which case the disaster recovery switching workflow corresponding to the main network element is obtained from the workflow library. When the network element state of the standby network element is abnormal, an alarm may be sent to a manager through the monitoring panel.
Step 406, running the disaster recovery switching workflow to complete the disaster recovery switching of the main network element.
In an example implementation, the step is substantially the same as step 104 in the embodiment of the present application, and is not described here in detail.
In this embodiment, on the basis of other embodiments, the state of the standby network element can be detected before disaster recovery switching of the main network element, and the disaster recovery switching is performed only when the standby network element is in a normal state. This avoids a secondary failure caused by switching services to a standby network element that is in a poor state.
One embodiment of the present application relates to a disaster recovery switching method applied to a disaster recovery management center DRMC, as shown in fig. 6, including:
Step 501, obtaining disaster recovery monitoring data of a main network element.
In an example implementation, the step is substantially the same as step 101 in the embodiment of the present application, and is not described here in detail.
Step 502, processing disaster recovery monitoring data by using a preset disaster recovery decision model established based on a decision tree, and generating a disaster recovery decision instruction.
In an example implementation, the step is substantially the same as step 102 in the embodiment of the present application, and is not described here in detail.
Step 503, when the disaster recovery decision instruction is disaster recovery switching trigger, acquiring a disaster recovery switching workflow of the main network element from a preset workflow library.
In an example implementation, the step is substantially the same as step 103 in the embodiment of the present application, and is not described here in detail.
Step 504, running the disaster recovery switching workflow to complete the disaster recovery switching of the main network element.
In an example implementation, the step is substantially the same as step 104 in the embodiment of the present application, and is not described here in detail.
Step 505, when the disaster recovery monitoring data of the main network element is updated, processing the updated disaster recovery monitoring data by using the disaster recovery decision model to generate an updated disaster recovery decision instruction.
In an example implementation, when the monitoring system detects that the disaster recovery monitoring data of the main network element has been updated, the updated disaster recovery monitoring data is input into the disaster recovery decision model for processing, and the disaster recovery decision instruction of the main network element is updated.
Step 506, when the updated disaster recovery decision instruction is a disaster recovery trigger, obtaining a disaster recovery workflow of the main network element from the workflow library.
In an example implementation, when the updated disaster recovery decision instruction is a disaster recovery trigger, the disaster recovery workflow of the main network element is obtained from the workflow library in order to perform the disaster recovery operation on the main network element. When the updated disaster recovery decision instruction is still a disaster recovery switching trigger, the switched state is kept unchanged, no disaster recovery operation is performed on the main network element, and the next update of the disaster recovery monitoring data of the main network element is awaited.
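The handling of an update after switching can be sketched as a single function. The instruction constants, the toy decision model, the `"switched"` and `"recovered"` state names and the workflow library layout are assumptions made for illustration.

```python
SWITCH_TRIGGER = "disaster_recovery_switching_trigger"  # hypothetical constants
RECOVERY_TRIGGER = "disaster_recovery_trigger"

def on_monitoring_update(updated_data, decision_model, workflow_library, ne_id, run_workflow):
    """Process one monitoring update for a main network element that has been switched."""
    instruction = decision_model(updated_data)
    if instruction == RECOVERY_TRIGGER:
        # Fetch and run the disaster recovery workflow of the main network element.
        run_workflow(workflow_library[(ne_id, "recovery")])
        return "recovered"
    # Still a switching trigger: keep the switched state and wait for the next update.
    return "switched"

# Toy decision model (assumption): recovery once the communication rate is healthy again.
def toy_model(data):
    return RECOVERY_TRIGGER if data["network_communication_rate"] >= 0.99 else SWITCH_TRIGGER

library = {("NE-A", "recovery"): [{"action": "schedule_services_to", "target": "NE-A"}]}
executed = []

first = on_monitoring_update({"network_communication_rate": 0.90}, toy_model, library, "NE-A", executed.append)
second = on_monitoring_update({"network_communication_rate": 0.995}, toy_model, library, "NE-A", executed.append)
```

Here the first update leaves the switched state untouched and the second one runs the recovery workflow exactly once.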
Step 507, running the disaster recovery workflow to complete disaster recovery of the main network element.
In an example implementation, the disaster recovery process is actually the reverse of the disaster recovery switching, and the service on the standby network element can be rescheduled to the main network element by executing the disaster recovery workflow.
In this embodiment, on the basis of other embodiments, the disaster recovery process of the main network element can be abstracted into steps. After a disaster recovery operation is triggered, these steps are taken over by a workflow, which controls the whole process, so that disaster recovery is performed automatically.
The steps of the above methods are divided for clarity of description. When implemented, they may be combined into one step, or a step may be split into multiple steps; as long as the same logical relationship is included, they fall within the protection scope of this patent. Adding insignificant modifications to the algorithm or flow, or introducing insignificant designs, without altering the core design of the algorithm and flow, also falls within the protection scope of this patent.
Another embodiment of the present application relates to a disaster recovery switching system. Details of the disaster recovery switching system of this embodiment are described below; they are implementation details provided only to aid understanding and are not necessary for implementing this embodiment. Fig. 7 is a schematic diagram of the disaster recovery switching system of this embodiment, which includes: a first acquisition module 601, a decision module 602, a second acquisition module 603, and a switching module 604.
The first acquisition module 601 is configured to obtain disaster recovery monitoring data of a main network element.
The decision module 602 is configured to process the disaster recovery monitoring data by using a preset disaster recovery decision model established based on a decision tree, and generate a disaster recovery decision instruction.
The second acquisition module 603 is configured to obtain, when the disaster recovery decision instruction is a disaster recovery switching trigger, a disaster recovery switching workflow of the main network element from a preset workflow library.
The switching module 604 is configured to run the disaster recovery switching workflow to complete disaster recovery switching of the main network element.
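How the four modules could cooperate is sketched below. The class name, the callables passed in and the dictionary workflow library are hypothetical; only the division into modules 601 to 604 comes from the description above.

```python
SWITCH_TRIGGER = "disaster_recovery_switching_trigger"  # hypothetical constant

class DisasterRecoverySwitchingSystem:
    """Wires together the four logical modules of the system embodiment."""

    def __init__(self, acquire_monitoring_data, decision_model, workflow_library, run_workflow):
        self.acquire_monitoring_data = acquire_monitoring_data  # first acquisition module 601
        self.decision_model = decision_model                    # decision module 602
        self.workflow_library = workflow_library                # used by second acquisition module 603
        self.run_workflow = run_workflow                        # switching module 604

    def process(self, ne_id):
        """Run one monitoring cycle; return True if a switching workflow was run."""
        data = self.acquire_monitoring_data(ne_id)
        instruction = self.decision_model(data)
        if instruction != SWITCH_TRIGGER:
            return False
        workflow = self.workflow_library[ne_id]
        self.run_workflow(workflow)
        return True

executed = []
system = DisasterRecoverySwitchingSystem(
    acquire_monitoring_data=lambda ne_id: {"network_communication_rate": 0.97},
    decision_model=lambda d: SWITCH_TRIGGER if d["network_communication_rate"] < 0.99 else "no_action",
    workflow_library={"NE-A": ["stop_scheduling", "release_services", "schedule_to_standby"]},
    run_workflow=executed.append,
)
switched = system.process("NE-A")
```

Passing the modules in as callables keeps them as logical units, which matches the note that a logical module may map to one physical unit, part of one, or several.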
In an example implementation, the disaster recovery switching system may further be provided with a monitoring panel, so that a manager may intuitively obtain real-time situations and change trends of each disaster recovery monitoring index of the network element.
In an example implementation, the disaster recovery switching system may further be provided with a task orchestration interface, used to create and orchestrate the monitoring system for each type of main network element, to create and run disaster recovery decision models, and to support orchestration and modification of the disaster recovery switching workflow and the disaster recovery workflow based on the preset disaster recovery switching workflow template and disaster recovery workflow template.
In an example implementation, the disaster recovery switching system may further be provided with a task execution management interface, configured to perform man-machine interaction with a manager, so that the manager may participate in a process of disaster recovery switching and disaster recovery.
It is to be noted that this embodiment is a system embodiment corresponding to the above-described method embodiment, and this embodiment may be implemented in cooperation with the above-described method embodiment. The related technical details and technical effects mentioned in the above embodiments are still valid in this embodiment, and in order to reduce repetition, they are not described here again. Accordingly, the related technical details mentioned in the present embodiment can also be applied to the above-described embodiments.
It should be noted that each module involved in this embodiment is a logic module. In practical application, one logic unit may be one physical unit, a part of one physical unit, or a combination of multiple physical units. In addition, in order to highlight the innovative part of the present application, units that are not closely related to solving the technical problem presented in the present application are not introduced in this embodiment, but this does not mean that no other units are present in this embodiment.
Another embodiment of the present application relates to an electronic device, as shown in fig. 8, comprising: at least one processor 701; and a memory 702 communicatively coupled to the at least one processor 701; the memory 702 stores instructions executable by the at least one processor 701, where the instructions are executed by the at least one processor 701, so that the at least one processor 701 can execute the disaster recovery switching method in the foregoing embodiments.
Where the memory and the processor are connected by a bus, the bus may comprise any number of interconnected buses and bridges, the buses connecting the various circuits of the one or more processors and the memory together. The bus may also connect various other circuits such as peripherals, voltage regulators, and power management circuits, which are well known in the art, and therefore, will not be described any further herein. The bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or may be a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. The data processed by the processor is transmitted over the wireless medium via the antenna, which further receives the data and transmits the data to the processor.
The processor is responsible for managing the bus and general processing and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. And memory may be used to store data used by the processor in performing operations.
Another embodiment of the present application relates to a computer-readable storage medium storing a computer program. The computer program implements the above-described method embodiments when executed by a processor.
That is, it will be understood by those skilled in the art that all or part of the steps of the methods in the embodiments described above may be implemented by a program stored in a storage medium, where the program includes several instructions for causing a device (which may be a single-chip microcomputer, a chip or the like) or a processor to perform all or part of the steps of the methods in the embodiments described herein. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples of implementing the present application and that various changes in form and details may be made therein without departing from the spirit and scope of the present application.

Claims (10)

1. The disaster recovery switching method is characterized by comprising the following steps:
acquiring disaster recovery monitoring data of a main network element;
processing the disaster recovery monitoring data by using a preset disaster recovery decision model established based on a decision tree to generate a disaster recovery decision instruction;
when the disaster recovery decision instruction is disaster recovery switching trigger, acquiring a disaster recovery switching workflow of the main network element from a preset workflow library;
and operating the disaster recovery switching workflow to finish the disaster recovery switching of the main network element.
2. The disaster recovery switching method according to claim 1, wherein before the obtaining disaster recovery monitoring data of the main network element, the method further comprises:
obtaining a disaster recovery monitoring data sample of the main network element, wherein the disaster recovery monitoring data sample comprises various disaster recovery monitoring indexes;
calculating the basic entropy of the disaster recovery monitoring data sample and the characteristic entropy of each disaster recovery monitoring index;
acquiring information gain of each disaster recovery monitoring index according to the basic entropy and the characteristic entropy of each disaster recovery monitoring index;
sequencing the disaster recovery monitoring indexes according to the information gains, and determining the decision tree node positions of the disaster recovery monitoring indexes according to sequencing results;
and generating the disaster recovery decision model according to the decision tree node positions of the disaster recovery monitoring indexes and the disaster recovery monitoring indexes.
3. The disaster recovery switching method according to claim 2, wherein the obtaining disaster recovery monitoring data of a main network element includes: acquiring the disaster recovery monitoring data of the main network element according to the disaster recovery monitoring index of each decision tree node of the disaster recovery decision model.
4. The disaster recovery switching method according to claim 1, wherein the obtaining the disaster recovery switching workflow of the main network element from a preset workflow library includes:
acquiring network element state data of a standby network element corresponding to the main network element;
detecting the network element state data by using a preset state detection model to obtain the network element state of the standby network element;
and when the network element state of the standby network element is normal, acquiring the disaster recovery switching workflow of the main network element from the workflow library.
5. The disaster recovery switching method of claim 1, wherein the method further comprises:
acquiring network element configuration information and a network element backup strategy of the main network element;
and generating the disaster recovery switching workflow and the disaster recovery workflow according to the network element configuration information and the network element backup strategy, and storing the disaster recovery switching workflow and the disaster recovery workflow into the workflow library.
6. The disaster recovery switching method according to claim 5, wherein after the running of the disaster recovery switching workflow to complete the disaster recovery switching of the main network element, the method further comprises:
when the disaster recovery monitoring data of the main network element is updated, the disaster recovery decision model is utilized to process the updated disaster recovery monitoring data, and an updated disaster recovery decision instruction is generated;
when the updated disaster recovery decision instruction is a disaster recovery trigger, acquiring the disaster recovery workflow of the main network element from the workflow library;
and operating the disaster recovery workflow to complete disaster recovery of the main network element.
7. The disaster recovery switching method according to claim 1, wherein the operation of the disaster recovery switching workflow to complete the disaster recovery switching of the main network element comprises:
when the main network element and the standby network element both meet the disaster recovery switching condition, releasing the service on the main network element and stopping scheduling the service to the main network element;
and dispatching the service on the main network element to the standby network element to complete disaster recovery switching of the main network element.
8. The disaster recovery switching system is characterized by comprising:
the first acquisition module is used for acquiring disaster recovery monitoring data of the main network element;
the decision module is used for processing the disaster recovery monitoring data by utilizing a preset disaster recovery decision model established based on a decision tree to generate a disaster recovery decision instruction;
the second acquisition module is used for obtaining the disaster recovery switching workflow of the main network element from a preset workflow library when the disaster recovery decision instruction is disaster recovery switching trigger;
and the switching module is used for operating the disaster recovery switching workflow and finishing the disaster recovery switching of the main network element.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the disaster recovery switching method of any one of claims 1 to 7.
10. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the disaster recovery switching method of any one of claims 1 to 7.
CN202111421670.7A 2021-11-26 2021-11-26 Disaster recovery switching method, system, electronic equipment and storage medium Pending CN116193384A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111421670.7A CN116193384A (en) 2021-11-26 2021-11-26 Disaster recovery switching method, system, electronic equipment and storage medium
PCT/CN2022/126000 WO2023093379A1 (en) 2021-11-26 2022-10-18 Disaster recovery switching method and system, electronic device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111421670.7A CN116193384A (en) 2021-11-26 2021-11-26 Disaster recovery switching method, system, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116193384A true CN116193384A (en) 2023-05-30

Family

ID=86438812

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111421670.7A Pending CN116193384A (en) 2021-11-26 2021-11-26 Disaster recovery switching method, system, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN116193384A (en)
WO (1) WO2023093379A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116566805B (en) * 2023-07-10 2023-09-26 中国人民解放军国防科技大学 System disaster-tolerant and anti-destruction oriented node cross-domain scheduling method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102056207B (en) * 2009-10-29 2015-04-01 中兴通讯股份有限公司 Method and system for realizing disaster recovery and switching
CN107070684A (en) * 2016-12-12 2017-08-18 国网北京市电力公司 Disaster tolerance reverse method and device
CN108932180A (en) * 2018-06-21 2018-12-04 郑州云海信息技术有限公司 A kind of disaster tolerance management method, device, storage medium and computer equipment matter
US11093354B2 (en) * 2018-09-19 2021-08-17 International Business Machines Corporation Cognitively triggering recovery actions during a component disruption in a production environment
CN110569149B (en) * 2019-09-16 2023-07-25 上海新炬网络技术有限公司 Method for triggering Oracle disaster recovery automatic emergency switching based on fault detection
CN110635950A (en) * 2019-09-30 2019-12-31 深圳供电局有限公司 Double-data-center disaster recovery system

Also Published As

Publication number Publication date
WO2023093379A1 (en) 2023-06-01

Similar Documents

Publication Publication Date Title
CN108632365B (en) Service resource adjusting method, related device and equipment
CN110289965B (en) Application program service management method and device
CN110912972B (en) Service processing method, system, electronic equipment and readable storage medium
CN108616424B (en) Resource scheduling method, computer equipment and system
CN109802986B (en) Equipment management method, system, device and server
CN103684878A (en) Operating command parameter control method and device
CN113986478A (en) Resource migration strategy determination method and device
CN111339194B (en) Automatic scheduling method and device for database access layer middleware
CN116193384A (en) Disaster recovery switching method, system, electronic equipment and storage medium
CN112073555A (en) Method for configuring IP address, electronic device and computer readable storage medium
CN113220459B (en) Task processing method and device
CN113342499B (en) Distributed task calling method, device, equipment, storage medium and program product
CN113992509B (en) SDN network service configuration issuing method, device and storage medium
CN115687019A (en) Database cluster fault processing method, intelligent monitoring platform, equipment and medium
US20230246911A1 (en) Control device, control method, control program and control system
EP3467655A1 (en) System and method for mpi implementation in an embedded operating system
CN112804087B (en) Method, device, equipment and storage medium for realizing operation of alliance network
CN115617478A (en) Task processing method, device, system, equipment and storage medium
CN114490000A (en) Task processing method, device, equipment and storage medium
CN110018906B (en) Scheduling method, server and scheduling system
CN113010290A (en) Task management method, device, equipment and storage medium
CN111327663A (en) Bastion machine distribution method and equipment
CN110543363A (en) Virtual machine management method in cloud computing environment
US20240244411A1 (en) Provision of virtual infrastructure information through r1 interface
CN111245938B (en) Robot cluster management method, robot cluster, robot and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination