WO2017050130A1 - Fault recovery method and apparatus - Google Patents

Fault recovery method and apparatus

Info

Publication number
WO2017050130A1
Authority
WO
WIPO (PCT)
Prior art keywords
service
network element
determining
processing unit
fault
Prior art date
Application number
PCT/CN2016/098344
Other languages
English (en)
French (fr)
Inventor
张文革
徐日东
陈勇
刘清明
陈太洲
熊福祥
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP16848012.7A (patent EP3340535B1)
Priority to JP2018514977A (patent JP6556346B2)
Publication of WO2017050130A1
Priority to US15/928,367 (patent US10601643B2)

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/0677: Localisation of faults
    • H04L41/0654: Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0668: Management of faults, events, alarms or notifications using network fault recovery by dynamic selection of recovery network elements, e.g. replacement by the most appropriate element after failure
    • H04L41/0686: Additional information in the notification, e.g. enhancement of specific meta-data
    • H04L41/08: Configuration management of networks or network elements
    • H04L41/0893: Assignment of logical groups to network elements
    • H04L41/0894: Policy-based network configuration management
    • H04L41/0895: Configuration of virtualised networks or elements, e.g. virtualised network function or OpenFlow elements
    • H04L41/40: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using virtualisation of network functions or resources, e.g. SDN or NFV entities
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/08: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0817: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
    • H04L43/10: Active monitoring, e.g. heartbeat, ping or trace-route
    • H04L43/16: Threshold monitoring
    • H04L43/20: Arrangements for monitoring or testing data switching networks the monitoring system or the monitored elements being virtualised, abstracted or software-defined entities, e.g. SDN or NFV

Definitions

  • the present application relates to the field of network data processing, and in particular, to a fault recovery method and apparatus.
  • the recovery method for the failure can be performed manually.
  • the time and labor cost of manually detecting a fault and then recovering from it is usually high. Therefore, the industry increasingly relies on the devices in the communication system to recover from faults of the communication system automatically, thereby improving fault recovery efficiency and reducing labor cost.
  • the fault recovery method in the prior art mainly determines whether a device is faulty according to the heartbeat message of the device. Specifically, the monitoring device may periodically send a heartbeat message to the monitored device, and after receiving the heartbeat message, the monitored device may return a response message to the monitoring device. If the monitoring device does not receive the response message within the specified time after sending the heartbeat message, it determines that the monitored device is faulty, and then either resets the monitored device as a whole or switches the functions hosted by the monitored device to another device to recover from the fault.
  • however, the monitoring device may fail to receive the response message within the specified time for more than one reason. For example, the interface unit used by the monitored device to send the response message may have failed. In that case, another interface unit of the monitored device can replace the failed interface unit, without resetting or switching over the entire monitored device. Resetting or switching over the monitored device as a whole carries a high risk and affects more services.
  • the purpose of the present application is to provide a fault recovery method and apparatus that can locate a fault through key performance indicator information, and solve the problem of low accuracy when a fault is located according to the heartbeat message of the device.
  • the application provides a fault recovery method, including:
  • the determining the fault object includes:
  • the network element level fault recovery policy is used to perform a fault recovery operation inside the monitored network element.
  • the determining the fault object includes:
  • the network level fault recovery policy is used to perform a fault recovery operation on one or more network elements in the network in which the monitored network element is located.
  • the determining that the fault object is the service processing unit in the monitored network element includes:
  • determining that the service processing unit whose service success rate is lower than the first reference value is the fault object.
  • the comparing the service success rate with the first reference value specifically includes:
  • the homogenization service processing unit is a service processing unit that carries services with the same service logic as the service processing unit, and to which services are discretely allocated.
  • before the determining that the service processing unit whose service success rate is lower than the first reference value is the fault object, the method further includes:
  • the homogenization service processing unit is a service processing unit that carries services with the same service logic as the service processing unit, and to which services are discretely allocated.
  • the determining that the fault object is the communication path between the service processing units includes:
  • determining that the communication path whose service success rate is lower than the third reference value is the fault object.
  • the determining that the fault object is the monitored network element includes:
  • the comparing the service success rate with the second reference value includes:
  • the homogenized network element is a monitored network element that carries services with the same service logic as the monitored network element, and to which services are discretely allocated.
  • the sending the fault recovery policy to the management unit in the network function virtualization system includes:
  • the sending the fault recovery policy to the management unit in the network function virtualization system includes:
  • sending the fault recovery policy to the management and orchestration (MANO) unit in the network function virtualization system.
  • the method further includes:
  • the network level fault recovery policy is configured to perform a fault recovery operation on one or more network elements in the network in which the monitored network element is located.
  • the determining a network level fault recovery policy includes:
  • fault recovery indication information is used to instruct the management unit to replace the monitored network element determined to be a fault object by using the redundant network element in a normal working state
  • the determining a network level fault recovery policy specifically includes: acquiring state information of the redundant network element of the back-end network element in the communication path that is determined to be the fault object;
  • the application provides a fault recovery apparatus, including:
  • An obtaining unit configured to acquire key performance indicator information of each service processing unit in the monitored network element
  • a determining unit configured to determine a fault object according to the key performance indicator information
  • a sending unit configured to send the fault recovery policy to a management unit in the network function virtualization system, so that the management unit uses the fault recovery policy to perform fault recovery.
  • the determining unit is specifically configured to:
  • the network element level fault recovery policy is used to perform a fault recovery operation inside the monitored network element.
  • the determining unit is specifically configured to:
  • the network level fault recovery policy is used to perform a fault recovery operation on one or more network elements in the network in which the monitored network element is located.
  • the determining unit is specifically configured to:
  • determine that the service processing unit whose service success rate is lower than the first reference value is the fault object.
  • the determining unit is specifically configured to:
  • the homogenization service processing unit is a service processing unit that carries services with the same service logic as the service processing unit, and to which services are discretely allocated.
  • the determining unit is further configured to:
  • the homogenization service processing unit is a service processing unit that carries services with the same service logic as the service processing unit, and to which services are discretely allocated.
  • the determining unit is specifically configured to:
  • determine that the communication path whose service success rate is lower than the third reference value is the fault object.
  • the determining unit is specifically configured to:
  • the determining unit is specifically configured to:
  • the homogenized network element is a monitored network element that carries services with the same service logic as the monitored network element, and to which services are discretely allocated.
  • the sending unit is specifically configured to:
  • the sending unit is specifically configured to:
  • the determining unit is further configured to:
  • the network level fault recovery policy is configured to perform a fault recovery operation on one or more network elements in the network in which the monitored network element is located.
  • the acquiring unit is further configured to:
  • the determining unit is further configured to determine, according to the state information, a redundant network element that is in a normal working state;
  • fault recovery indication information is used to instruct the management unit to replace the monitored network element determined to be a fault object by using the redundant network element in a normal working state
  • the acquiring unit is further configured to: acquire state information of the redundant network element of the back-end network element in the communication path that is determined to be the fault object;
  • the determining unit is further configured to: determine, according to the state information, a redundant network element that is in a normal working state;
  • the present application discloses the following technical effects:
  • the fault recovery method or apparatus disclosed in the present application obtains key performance indicator information of each service processing unit in the monitored network element, determines a fault object according to the key performance indicator information, determines a fault recovery policy according to the fault object, and sends the fault recovery policy to the management unit in the network function virtualization system. Because the fault is located through the key performance indicator information, the problem of low accuracy when a fault is located according to the heartbeat message of the network element is solved.
  • because the fault recovery policy is determined according to the fault object and then sent to the management unit in the network function virtualization system, an appropriate fault recovery policy can be adopted, which reduces the risk brought by the fault recovery process and the impact of the fault recovery process on services.
  • NFV Network Function Virtualization
  • FIG. 2 is a flowchart of Embodiment 1 of a fault recovery method according to the present application.
  • FIG. 3 is a flowchart of Embodiment 2 of a fault recovery method according to the present application.
  • FIG. 4 is a flowchart of Embodiment 3 of a fault recovery method according to the present application.
  • FIG. 5 is a structural diagram of an embodiment of a fault recovery device of the present application.
  • FIG. 6 is a structural diagram of a computing node of the present application.
  • FIG. 1 is an architectural diagram of a Network Function Virtualization (NFV) system of the present application.
  • the fault recovery method of the present application is mainly applied to an NFV system.
  • the NFV system mainly includes the following network elements:
  • the Operations Support System (OSS)/Business Support System (BSS) is used to send service requests, together with the resources required for the service, to the Network Function Virtualization Orchestrator (NFV Orchestrator), and is responsible for troubleshooting.
  • OSS Operations Support System
  • BSS Business Support System
  • the Orchestrator is responsible for implementing NFV services according to OSS/BSS service requests, for network service (NS) lifecycle management, for orchestrating and managing resources, and for real-time monitoring of virtualized network function (VNF) and Network Function Virtualization Infrastructure (NFVI) resources and operational status information.
  • NS network service
  • VNF virtualized network functions
  • NFVI Network Function Virtualization Infrastructure
  • the VNF Manager is responsible for VNF lifecycle management, such as startup and time to live, and for VNF operational status information.
  • VIM Virtualized Infrastructure Manager
  • EMS Element Management System
  • NFVI resources include the status of all NFVI resources and the available, reserved, and allocated NFVI resources.
  • the execution body of the fault recovery method of the present application may be a network element key performance indicator (KPI) monitoring and recovery decision module or a network KPI monitoring and recovery decision module.
  • KPI Key Performance Indicator
  • the network element KPI monitoring and recovery decision module or the network KPI monitoring and recovery decision module may be deployed on a VNF, the EMS, the Management and Orchestration (MANO) unit, or an independent network node in the NFV system. The two modules can be physically deployed together or separately.
  • FIG. 2 is a flowchart of Embodiment 1 of a fault recovery method according to the present application.
  • the execution body of the method in this embodiment may be a network element KPI monitoring and recovery decision module or a network KPI monitoring and recovery decision module. As shown in FIG. 2, the method may include:
  • Step 101 Obtain Key Performance Indicator (KPI) information of each service processing unit in the monitored network element.
  • KPI Key Performance Indicator
  • the monitored network element may be a network element in a Network Function Virtualization (NFV) system, such as a VNF.
  • NFV Network Function Virtualization
  • the monitored network element may have one or more service processing units.
  • the key performance indicator information may include information such as the number of service requests received by the service processing unit, the number of failures of the service corresponding to the number of service requests, and/or the reason for each service failure.
  • the types of information included in the key performance indicator information can be set according to requirements.
  • the key performance indicator information may further include service delay information and the like.
  • the monitored network element may periodically report the key performance indicator information.
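  • as an illustration only, one periodic KPI report could be modelled as the record below. The class name, field names, and sample values are assumptions made for this sketch and do not come from the application; only the kinds of information (request count, failure count, failure causes, service delay) follow the description above.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class KpiReport:
    """One periodic KPI report for a service processing unit (hypothetical schema)."""
    unit_id: str
    request_count: int          # number of service requests received
    failure_count: int          # failures among those requests
    failure_causes: Dict[str, int] = field(default_factory=dict)  # cause -> count
    avg_delay_ms: float = 0.0   # optional service delay information


# Sample values are invented for illustration.
report = KpiReport(
    unit_id="spu-1",
    request_count=1000,
    failure_count=30,
    failure_causes={"internal_error": 20, "peer_timeout": 10},
    avg_delay_ms=12.5,
)
```

  • keeping the failure causes as a per-cause counter makes it possible to later separate failures caused by the unit itself from failures caused elsewhere.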
  • the network element to be monitored may also be determined according to the information of the EMS and/or the MANO.
  • specifically, the information recorded by the EMS and/or the MANO about the network elements deployed on the network and about the service processing units deployed in each network element may be obtained; the network elements corresponding to the recorded network element information are determined to be the monitored network elements.
  • the service processing units corresponding to the recorded service processing unit information are determined to be the service processing units that need to be monitored.
  • Step 102 Determine a fault object according to the key performance indicator information.
  • the success rate of the service processing unit performing the service may be calculated according to the key performance indicator information.
  • when the success rate is lower than a certain ratio, it may be determined that the fault object is the service processing unit.
  • when the number of service processing units with a low success rate is large (for example, 80% of the total number of service processing units of the monitored network element), it may be determined that the fault object is a network element outside the monitored network element.
  • it may be determined that the communication path from the monitored network element to the next-level network element is faulty, or that the next-level network element is faulty.
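  • the coarse decision in Step 102 can be sketched as follows. This is a minimal sketch under stated assumptions: the function name, the return labels, and the default thresholds (a 95% reference value and the 80% share taken from the example above) are illustrative choices, not values fixed by the application.

```python
def locate_fault(success_rates, reference=0.95, external_ratio=0.8):
    """Classify the fault object from per-unit success rates (fractions 0..1).

    success_rates: dict mapping unit id -> service success rate.
    Returns a (label, units) tuple; the labels are hypothetical names.
    """
    low = [unit for unit, rate in success_rates.items() if rate < reference]
    if not low:
        return ("none", [])
    # If most units look unhealthy, the cause is probably outside the
    # monitored network element (next-level element or the path to it).
    if len(low) / len(success_rates) >= external_ratio:
        return ("external", [])
    return ("service_processing_unit", low)


single = locate_fault({"a": 0.99, "b": 0.50, "c": 0.98, "d": 0.97, "e": 0.99})
widespread = locate_fault({"a": 0.50, "b": 0.40, "c": 0.60, "d": 0.30, "e": 0.99})
```

  • in the first call only unit "b" is below the reference value, so it is reported as the fault object; in the second call four of five units are below it, so the fault is attributed to something outside the monitored network element.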
  • Step 103 Determine a fault recovery strategy according to the fault object.
  • the fault recovery policy at the network element level may be determined; the fault recovery policy at the network element level is used to perform a fault recovery operation inside the monitored network element.
  • a network level fault recovery policy may be determined;
  • the network level fault recovery policy is used to perform a fault recovery operation on one or more network elements in the network where the monitored network element is located.
  • Step 104 Send the fault recovery policy to a management unit in the network function virtualization system, so that the management unit uses the fault recovery policy to perform fault recovery.
  • the management unit may be a system management module in the monitored network element in the network function virtualization system, or may be a management and orchestration MANO unit in the network function virtualization system.
  • the service processing unit that fails and the standby unit may be isolated.
  • the standby network element of the network element at one end of the standby path may also be determined;
  • the service carried by the network element at one end of the standby path is switched to the standby network element.
  • in summary, in this embodiment the key performance indicator information of each service processing unit in the monitored network element is obtained, the fault object is determined according to the key performance indicator information, the fault recovery policy is determined according to the fault object, and the fault recovery policy is sent to the management unit in the network function virtualization system. Because the fault is located through the key performance indicator information, the problem of low accuracy when a fault is located according to the heartbeat message of the network element is solved.
  • because the fault recovery policy is determined according to the fault object and then sent to the management unit in the network function virtualization system, an appropriate fault recovery policy can be adopted, which reduces the risk brought by the fault recovery process and the impact of the fault recovery process on services.
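  • Steps 103 and 104 can be pictured as a small lookup from fault object to policy level and destination. The sketch below is an assumption-laden illustration: the string labels are invented, and pairing the network element level policy with the system management module and the network level policy with the MANO unit is one reading of the description, not a rule stated as mandatory.

```python
# Hypothetical labels for the four kinds of fault object described in the text.
NE_LEVEL_OBJECTS = {"service_processing_unit", "internal_path"}
NETWORK_LEVEL_OBJECTS = {"network_element", "external_path"}


def choose_policy(fault_object):
    """Return the policy level and the management unit that should receive it."""
    if fault_object in NE_LEVEL_OBJECTS:
        # Recovery happens inside the monitored network element.
        return {"level": "network_element", "send_to": "system_management_module"}
    if fault_object in NETWORK_LEVEL_OBJECTS:
        # Recovery touches one or more network elements in the network.
        return {"level": "network", "send_to": "MANO"}
    raise ValueError(f"unknown fault object: {fault_object!r}")


policy = choose_policy("service_processing_unit")
```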
  • the determining the fault object may specifically include:
  • the determining the fault recovery policy according to the fault object may specifically include:
  • when the fault object is a service processing unit in the monitored network element or a communication path between the service processing units, determining a network element level fault recovery policy, where the network element level fault recovery policy is used to perform a fault recovery operation inside the monitored network element.
  • the determining the fault object may further include:
  • the fault object is determined to be the monitored network element.
  • the fault object is determined to be a communication path between the monitored network element and another network element.
  • the determining the fault recovery policy according to the fault object may specifically include:
  • the network level fault recovery policy is used to perform a fault recovery operation on one or more network elements in the network in which the monitored network element is located.
  • a network element level fault recovery policy may be used first for fault recovery; if the recovery fails, the network level fault recovery policy may then be used to perform fault recovery.
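  • the escalation order described here (network element level first, network level only if that fails) can be sketched as below. The two callables stand in for whatever actually performs each recovery; they and the returned labels are hypothetical.

```python
def recover(try_ne_level, try_network_level):
    """Attempt NE-level recovery first and escalate only on failure.

    Both arguments are zero-argument callables returning True on success.
    """
    if try_ne_level():
        return "network_element"
    if try_network_level():
        return "network"
    return "failed"


outcome = recover(lambda: False, lambda: True)
```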
  • the determining that the fault object is the service processing unit in the monitored network element may specifically adopt the following steps:
  • determining that the service processing unit whose service success rate is lower than the reference value is the fault object.
  • the number of service failures may be the number of service failures caused by the service processing unit itself.
  • the key performance indicator information may record the cause of the service failure, and may count the number of service failures caused by the service processing unit itself according to the reason of the service failure.
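  • counting only the failures attributable to the unit itself might look like the sketch below. The set of cause names treated as self-caused is an assumption for illustration; the application does not enumerate failure causes.

```python
# Hypothetical cause names; which causes count as "caused by the unit itself"
# would be defined by the deployment, not by this sketch.
SELF_CAUSES = {"internal_error", "resource_exhausted"}


def self_caused_failures(failure_causes):
    """Sum the failures whose recorded cause points at the unit itself."""
    return sum(count for cause, count in failure_causes.items()
               if cause in SELF_CAUSES)


own_failures = self_caused_failures({"internal_error": 20, "peer_timeout": 10})
```

  • here the 10 failures recorded as "peer_timeout" are excluded, so only 20 failures would count against the unit when its success rate is computed.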
  • the reference value may be a preset value, or may be a homogenization reference value obtained according to an average service success rate of the homogenization service processing unit. Therefore, the comparing the service success rate with the reference value may specifically include:
  • the homogenization service processing unit is a service processing unit that carries services with the same service logic as the service processing unit, and to which services are discretely allocated.
  • for homogenization service processing units, the following may occur: for some reason, the service success rates of multiple homogenization service processing units fall below the preset reference value, but this does not necessarily mean that those units have failed. The drop in the service success rate of most homogenization service processing units may instead be caused by the failure of other equipment. To avoid erroneously determining that the homogenization service processing units are faulty in this case, the following steps may also be performed before determining that the service processing unit whose service success rate is lower than the reference value is the fault object:
  • the preset ratio may be set according to actual needs, for example, to 90%. That is, when the service success rate of 90% or more of the homogenization service processing units is higher than the preset reference value, and the service success rate of 10% or fewer of them is lower than the preset reference value, it may be determined that the homogenization service processing units whose service success rate is lower than the reference value are the fault objects.
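  • the preset-ratio guard can be sketched as follows: units are blamed only when a large enough share of their homogenization peers is healthy. The function name and default values (95% reference value, 90% preset ratio) are assumptions for illustration.

```python
def faulty_units(success_rates, reference=0.95, preset_ratio=0.9):
    """Return the units below the reference value, but only when at least
    preset_ratio of all homogenization units are above it.

    success_rates: dict mapping unit id -> success rate as a fraction 0..1.
    """
    if not success_rates:
        return []
    healthy = [u for u, r in success_rates.items() if r > reference]
    if len(healthy) / len(success_rates) >= preset_ratio:
        return [u for u, r in success_rates.items() if r < reference]
    # Too many units are degraded at once: likely an external cause,
    # so no individual unit is declared faulty.
    return []


rates = {f"u{i}": 0.99 for i in range(19)}
rates["u19"] = 0.50
blamed = faulty_units(rates)
```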
  • the determining that the fault object is the communication path between the service processing units may include:
  • determining that the communication path whose service success rate is lower than the reference value is the fault object.
  • the determining that the fault object is a network element in the network to which the monitored network element belongs may include:
  • determining that the monitored network element whose service success rate is lower than a reference value is the fault object.
  • a network element may include multiple service processing units. Therefore, the key performance indicator information of each service processing unit in a network element may be obtained. From the number of service requests received by each service processing unit and the number of failures of the services corresponding to those requests, both included in the key performance indicator information, the total number of service requests received by the network element and the total number of corresponding service failures are counted, and the service success rate of the monitored network element is then calculated.
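  • the aggregation from service processing units up to the network element can be sketched as below. The input shape (a list of request/failure pairs) and the choice to return None when there is no traffic are assumptions of this sketch.

```python
def network_element_success_rate(unit_kpis):
    """Aggregate per-unit (request_count, failure_count) pairs into one
    success rate, in percent, for the whole network element."""
    total_requests = sum(requests for requests, _ in unit_kpis)
    total_failures = sum(failures for _, failures in unit_kpis)
    if total_requests == 0:
        return None  # no traffic, so no meaningful rate (sketch assumption)
    return 100.0 * (total_requests - total_failures) / total_requests


ne_rate = network_element_success_rate([(1000, 50), (1000, 150)])
```

  • summing the raw counts before dividing weights each unit by its traffic, which differs from averaging the per-unit percentages when the units carry unequal load.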
  • the comparing the service success rate with the reference value may include:
  • the homogenized network element is a monitored network element that carries services with the same service logic as the monitored network element, and to which services are discretely allocated.
  • FIG. 3 is a flowchart of Embodiment 2 of a fault recovery method according to the present application.
  • the execution body of the method of this embodiment may be a network element KPI monitoring and recovery decision module. As shown in FIG. 3, the method may include:
  • Step 201 Acquire key performance indicator information of each service processing unit in the monitored network element.
  • the service processing unit may include a thread, a process, a virtual machine (VM), and the like.
  • the key performance indicator information may include at least the following information: the number of service requests received by the service processing unit and the number of failures of the service corresponding to the number of service requests.
  • Step 202 Calculate a service success rate of the service performed by the service processing unit according to the number of service requests received by the service processing unit and the number of failures of the service corresponding to the service request number in the key performance indicator information.
  • the service success rate may be obtained by subtracting the number of failures from the number of service requests, dividing by the number of service requests, and multiplying by 100%.
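  • written out as code, the calculation in Step 202 is a one-line formula. The function name and the decision to reject a non-positive request count are assumptions of this sketch.

```python
def service_success_rate(request_count, failure_count):
    """Success rate in percent: (requests - failures) / requests * 100."""
    if request_count <= 0:
        raise ValueError("request_count must be positive")
    return 100.0 * (request_count - failure_count) / request_count


rate = service_success_rate(200, 10)
```

  • with 200 requests and 10 failures the result is 95%, which sits exactly at the example reference value mentioned for a normal service processing unit.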
  • Step 203 Compare the service success rate with a reference value.
  • the reference value can be set according to actual needs. For example, when the service success rate of a normal service processing unit is above 95%, the reference value may be set to 95%.
  • the reference value may be calculated according to an average service success rate of the homogenization service processing unit.
  • the homogenization service processing unit is a service processing unit that has the same service logic and the same external service network as the service processing unit corresponding to the service success rate.
  • the service request messages received (distributed) by the plurality of homogenous service processing units are randomly discrete. Therefore, the service success rates of multiple homogeneous service processing units should be substantially similar. Therefore, the homogenization reference value can be calculated according to the average service success rate of the homogenization service processing unit.
  • a preset value may be subtracted from the average service success rate to obtain the homogenization reference value.
  • the preset value can be set according to actual needs. For example, it can be 20%, 10%, and so on.
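  • a minimal sketch of the homogenization reference value, assuming it is the peers' average success rate minus the preset value (both in percentage points). The function and parameter names are illustrative.

```python
from statistics import mean


def homogenization_reference(peer_rates, margin=10.0):
    """Reference value = average success rate of the homogenization units
    minus a preset margin, all expressed in percent."""
    return mean(peer_rates) - margin


reference = homogenization_reference([96.0, 98.0, 97.0], margin=10.0)
is_faulty = 60.0 < reference  # a unit at 60% falls below the reference
```

  • because the reference follows the peer average, a drop that affects all homogenization units together lowers the reference as well, so it does not by itself mark any single unit as faulty.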
  • Step 204 Determine that the service processing unit whose service success rate is lower than the reference value is the fault object.
  • Step 205 Determine, when the fault object is a service processing unit, a fault recovery strategy at a network element level.
  • Step 206 Send the fault recovery policy to a system management module in the monitored network element in the network function virtualization system.
  • the network element level fault recovery policy in step 205 may be to instruct the system management module to reset the service processing unit that has failed. After receiving the fault recovery policy of the network element level, the system management module may reset the service processing unit that has failed.
  • It should be noted that if the service processing unit still fails after being reset, it may be isolated. Further, when the number of isolated service processing units reaches a second preset threshold, a network-level fault recovery policy may be executed; the network-level fault recovery policy is used to perform a fault recovery operation on one or more network elements in the network where the monitored network element is located. For example, the faulty next-hop network element or communication path of the monitored network element may be switched over. The target network element or communication path of the switchover can be selected according to the health status of each network element or communication path in the disaster recovery group.
  • It should also be noted that when the failed service processing unit is of the active/standby type, the fault recovery policy may be: determining a standby unit of the failed service processing unit, and switching the service carried by the failed service processing unit to the standby unit. Further, when the standby unit is also determined to be faulty, both the failed service processing unit and the standby unit may be isolated.
  • FIG. 4 is a flowchart of Embodiment 3 of a fault recovery method according to the present application.
  • the execution body of the method of this embodiment may be a network KPI monitoring and recovery decision module. As shown in FIG. 4, the method may include:
  • Step 301 Acquire key performance indicator information of each service processing unit in the monitored network element.
  • Step 302 Calculate a service success rate of a service performed by each service processing unit according to the number of service requests received by the service processing unit and the number of failures of the service corresponding to the service request number in the key performance indicator information.
  • Step 303 Compare the service success rate with a reference value.
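  • Step 304 Determine the number of service processing units whose service success rate is lower than the reference value.
  • Step 305 Determine, according to that number, the proportion of service processing units whose service success rate is lower than the reference value among all service processing units in the monitored network element.
  • For example, if 8 service processing units have a service success rate lower than the reference value and the monitored network element contains 10 service processing units in total, the ratio is 80%.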
  • Step 306 When the ratio is greater than a preset ratio, determine that the fault object is the monitored network element.
  • the preset ratio can be set according to actual needs.
  • the preset ratio can be set to 50%, 80%, and the like.
  • Step 307 Determine a network level fault recovery policy when the fault object is a network element in a network to which the monitored network element belongs.
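  • When the fault is located in a network element in the network to which the monitored network element belongs, a network-level fault recovery policy is required to repair the faulty network element.
  • In practice, the network-level fault recovery policy may be determined in multiple ways. For example, the following steps can be taken: obtain state information of the redundant network elements related to the monitored network element determined to be the fault object; determine, according to the state information, a redundant network element that is in a normal working state; and generate network-level fault recovery indication information.
  • The fault recovery indication information instructs the management unit to replace the monitored network element determined to be the fault object with the redundant network element that is in a normal working state.
  • These steps ensure that the redundant network element used to replace the failed monitored network element can work normally. If all redundant network elements of the monitored network element are abnormal, the preset redundant network elements need not be used; instead, another network element that can work normally may be found to replace the failed monitored network element.
  • As another example, the following steps can be taken: obtain state information of the redundant network elements of the back-end network element in the communication path determined to be the fault object; determine, according to the state information, a redundant network element that is in a normal working state; and generate network-level fault recovery indication information that instructs the management unit to switch the back-end network element corresponding to the front-end network element in the communication path to that redundant network element. These steps ensure that the redundant network element can work normally after the switchover. If all redundant network elements of the back-end network element are abnormal, the preset redundant network elements need not be used; instead, another network element that can work normally may be found for the switchover.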
  • Step 308 Send the fault recovery policy to the management and orchestration MANO unit in the network function virtualization system.
  • the network-level fault recovery policy may instruct the MANO unit to determine a backup network element of the network element that has failed; and switch the service carried by the network element that has failed to the standby network element.
  • the MANO receives the network level fault recovery policy, and can determine the standby network element of the network element that has failed. After determining the backup network element of the network element that has failed, the MANO may send an indication signaling to the VNFM, instructing the VNFM to switch the service carried by the network element that has failed to the standby network element. After receiving the indication signaling, the VNFM may switch the service carried by the failed network element to the standby network element.
  • It should also be noted that the key performance indicator information may further include service failure cause information and the number of service failures attributed to each cause.
  • The service failure causes may include: communication timeout to a downstream network element, insufficient resources, communication timeout between internal modules of the monitored network element, and internal software errors (for example, illegal internal data or code entering an abnormal branch). Therefore, determining the fault object according to the key performance indicator information may further include:
  • determining the fault object based on the service failure cause information included in the key performance indicator information.
  • For example, the proportion of failed services caused by service processing timeout can be determined from the number of service failures caused by service processing timeout recorded in the key performance indicator information and the number of service requests sent by the monitored network element to the downstream network element. When this proportion is greater than or equal to a preset threshold, the fault can be located at the monitored network element. The network elements in the network to which the monitored network element belongs may include the network elements external to the monitored network element as well as the monitored network element itself.
  • Accordingly, a network-level fault recovery policy can also be adopted in this case.
  • In addition, for the homogeneous service processing units mentioned above, service failures caused by insufficient resources may be excluded when service failures are counted, so that they are not included in the failure total. Such failures are mainly caused by excessive service volume, and the service processing unit itself usually has not failed.
  • the application also provides a fault recovery device.
  • FIG. 5 is a structural diagram of an embodiment of a fault recovery device of the present application. As shown in FIG. 5, the apparatus may include:
  • the obtaining unit 501 is configured to acquire key performance indicator information of each service processing unit in the monitored network element.
  • a determining unit 502, configured to determine a fault object according to the key performance indicator information, and to determine a fault recovery policy according to the fault object;
  • the sending unit 503 is configured to send the fault recovery policy to a management unit in the network function virtualization system, so that the management unit uses the fault recovery policy to perform fault recovery.
  • In this embodiment, the key performance indicator information of each service processing unit in the monitored network element is obtained; the fault object is determined according to the key performance indicator information; the fault recovery policy is determined according to the fault object; and the fault recovery policy is sent to the management unit in the network function virtualization system. The fault can thus be located through the key performance indicator information, which solves the problem of low fault-location accuracy when faults are located from network element heartbeat messages.
  • In addition, because the fault recovery policy is determined according to the fault object and then sent to the management unit in the network function virtualization system, an appropriate fault recovery policy can be adopted, reducing both the risk introduced by the fault recovery process and its impact on services.
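  • In practice, the determining unit 502 may be specifically configured to: determine that the fault object is a service processing unit in the monitored network element, or a communication path between the service processing units; and, in either case, determine a network-element-level fault recovery policy, which is used to perform a fault recovery operation inside the monitored network element.
  • The determining unit 502 may be specifically configured to: determine that the fault object is the monitored network element, or a communication path between the monitored network element and another network element; and, in either case, determine a network-level fault recovery policy, which is used to perform a fault recovery operation on one or more network elements in the network where the monitored network element is located.
  • The determining unit 502 may be specifically configured to: calculate the service success rate of the service executed by a service processing unit from the number of service requests received by the service processing unit and the number of failures of the corresponding services in the key performance indicator information; compare the service success rate with a first reference value; and determine that a service processing unit whose service success rate is lower than the first reference value is the fault object.
  • The determining unit 502 may be specifically configured to: compare the service success rate with a preset reference value; or determine the average service success rate of the homogeneous service processing units, subtract a preset value from the average service success rate to obtain a homogeneous reference value, and compare the service success rate with the homogeneous reference value. A homogeneous service processing unit is a service processing unit that carries services with the same service logic as the service processing unit and to which the services are distributed discretely.
  • The determining unit 502 may be further configured to: before determining that a service processing unit whose service success rate is lower than the first reference value is the fault object, determine a first unit set of homogeneous service processing units whose service success rate is greater than the first reference value; determine a second unit set of homogeneous service processing units whose service success rate is less than the first reference value; and determine that the proportion of units in the first unit set among all the homogeneous service processing units is greater than a first preset ratio.
  • The determining unit 502 may be specifically configured to: calculate the service success rate of a communication path from the number of service failures caused by communication path faults in the key performance indicator information; compare the service success rate with a third reference value; and determine that a communication path whose service success rate is lower than the third reference value is the fault object.
  • The determining unit 502 may be specifically configured to: calculate the service success rate of each service processing unit from the number of service requests received by each service processing unit and the number of failures of the corresponding services in its key performance indicator information; compare the service success rate with a second reference value; determine the number of service processing units whose service success rate is lower than the second reference value; determine, according to that number, the proportion of such service processing units among all service processing units in the monitored network element; and, when the proportion is greater than a second preset ratio, determine that the monitored network element is the fault object.
  • The determining unit 502 may be specifically configured to: compare the service success rate with a preset reference value; or determine the average service success rate of the homogeneous network elements, subtract a preset value from the average service success rate to obtain a homogeneous reference value, and compare the service success rate with the homogeneous reference value. A homogeneous network element is a monitored network element that carries services with the same service logic as the monitored network element and to which the services are distributed discretely.
  • The sending unit 503 may be specifically configured to: after the fault object is determined to be a service processing unit in the monitored network element or a communication path between the service processing units, send the fault recovery policy to the system management module in the monitored network element in the network function virtualization system.
  • The sending unit 503 may be specifically configured to: after the fault object is determined to be the monitored network element or a communication path between the monitored network element and another network element, send the fault recovery policy to the management and orchestration (MANO) unit in the network function virtualization system.
  • The determining unit 502 may be further configured to: after determining that the fault object is a service processing unit in the monitored network element, determine that the number of failed service processing units reaches a preset threshold, and determine a network-level fault recovery policy. The network-level fault recovery policy is used to perform a fault recovery operation on one or more network elements in the network where the monitored network element is located.
  • The obtaining unit 501 may be further configured to obtain state information of the redundant network elements related to the monitored network element determined to be the fault object. The determining unit 502 is then further configured to determine, according to the state information, a redundant network element that is in a normal working state, and to generate network-level fault recovery indication information, which instructs the management unit to replace the monitored network element determined to be the fault object with the redundant network element that is in a normal working state.
  • Alternatively, the obtaining unit 501 is further configured to obtain state information of the redundant network elements of the back-end network element in the communication path determined to be the fault object. The determining unit 502 is then further configured to determine, according to the state information, a redundant network element that is in a normal working state, and to generate network-level fault recovery indication information, which instructs the management unit to switch the back-end network element corresponding to the front-end network element in the communication path to the redundant network element that is in a normal working state.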
  • In addition, an embodiment of the present application further provides a computing node, which may be a host server with computing capability, a personal computer (PC), a portable computer or terminal, or the like. The specific embodiments of the present application do not limit the specific implementation of the computing node.
  • FIG. 6 is a structural diagram of the computing node of the present application. As shown in FIG. 6, the computing node 600 includes:
  • a processor 610, a communication interface 620, a memory 630, and a bus 640.
  • the processor 610, the communication interface 620, and the memory 630 complete communication with each other via the bus 640.
  • the processor 610 is configured to execute the program 632.
  • program 632 can include program code, the program code including computer operating instructions.
  • the processor 610 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
  • the memory 630 is configured to store the program 632.
  • Memory 630 may include high speed RAM memory and may also include non-volatile memory, such as at least one disk memory.
  • the program 632 may specifically include the corresponding modules or units in the embodiment shown in FIG. 5, and details are not described herein.


Abstract

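This application provides a fault recovery method and device. The fault recovery method includes: obtaining key performance indicator information of each service processing unit in a monitored network element; determining a fault object according to the key performance indicator information; determining a fault recovery policy according to the fault object; and sending the fault recovery policy to a management unit in a network function virtualization system, so that the management unit performs fault recovery using the fault recovery policy. The method or device of this application can solve the problem of low fault-location accuracy when faults are located from network element heartbeat messages.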

Description

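A fault recovery method and device
This application claims priority to Chinese Patent Application No. 201510608782.1, filed with the Chinese Patent Office on September 22, 2015 and entitled "Fault recovery method and device", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of network data processing, and in particular to a fault recovery method and device.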
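Background
In a communication system, when a device fails, some method must be used to recover from the fault, so that the fault does not persist and seriously degrade the performance of the communication system.
Fault recovery can be performed manually. However, detecting a fault manually and then recovering from it usually costs considerable time and labor. The industry is therefore moving toward having the devices in a communication system recover from the system's own faults automatically, which improves recovery efficiency and reduces labor costs.
Prior-art fault recovery methods mainly rely on device heartbeat messages to decide whether a device has failed. Specifically, a monitoring device periodically sends heartbeat messages to a monitored device, and the monitored device returns a response message after receiving each heartbeat message. If the monitoring device does not receive the response message within a specified time after sending a heartbeat message, it determines that the monitored device has failed, and then either resets the monitored device as a whole or switches the functions carried by the monitored device to another device in order to recover from the fault.
However, there may be many reasons why the monitoring device does not receive a response message within the specified time. For example, the interface unit that the monitored device uses to send the response message may have failed. In that case, another interface unit of the monitored device can be called to replace the failed interface unit, without resetting the monitored device as a whole or switching over its functions. Resetting the monitored device as a whole or switching over its functions carries a higher risk and affects more services.
In summary, because prior-art fault recovery methods analyze and recover from faults based on device heartbeat messages, their fault-location accuracy is low.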
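Summary
The purpose of this application is to provide a fault recovery method and device that can locate faults through key performance indicator information, solving the problem of low fault-location accuracy when faults are located from device heartbeat messages.
To achieve the above purpose, this application provides the following solutions: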
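According to a first possible implementation of the first aspect of this application, a fault recovery method is provided, including:
obtaining key performance indicator information of each service processing unit in a monitored network element;
determining a fault object according to the key performance indicator information;
determining a fault recovery policy according to the fault object; and
sending the fault recovery policy to a management unit in a network function virtualization system, so that the management unit performs fault recovery using the fault recovery policy.
With reference to a second possible implementation of the first aspect, the determining a fault object specifically includes:
determining that the fault object is a service processing unit in the monitored network element;
or determining that the fault object is a communication path between the service processing units;
and the determining a fault recovery policy according to the fault object specifically includes:
when the fault object is a service processing unit in the monitored network element or a communication path between the service processing units, determining a network-element-level fault recovery policy, where the network-element-level fault recovery policy is used to perform a fault recovery operation inside the monitored network element.
With reference to a third possible implementation of the first aspect, the determining a fault object specifically includes:
determining that the fault object is the monitored network element;
or determining that the fault object is a communication path between the monitored network element and another network element;
and the determining a fault recovery policy according to the fault object specifically includes:
when the fault object is the monitored network element or a communication path between the monitored network element and another network element, determining a network-level fault recovery policy, where the network-level fault recovery policy is used to perform a fault recovery operation on one or more network elements in the network where the monitored network element is located.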
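With reference to a first specific implementation of the second possible implementation of the first aspect, the determining that the fault object is a service processing unit in the monitored network element specifically includes:
calculating the service success rate of the service executed by a service processing unit from the number of service requests received by the service processing unit and the number of failures of the corresponding services in the key performance indicator information;
comparing the service success rate with a first reference value; and
determining that a service processing unit whose service success rate is lower than the first reference value is the fault object.
With reference to a first more specific implementation of the first specific implementation of the second possible implementation of the first aspect, the comparing the service success rate with a first reference value specifically includes:
comparing the service success rate with a preset reference value;
or determining the average service success rate of the homogeneous service processing units;
subtracting a preset value from the average service success rate to obtain a homogeneous reference value; and
comparing the service success rate with the homogeneous reference value;
where a homogeneous service processing unit is a service processing unit that carries services with the same service logic as the service processing unit and to which the services are distributed discretely.
With reference to a second more specific implementation of the first specific implementation of the second possible implementation of the first aspect, before the determining that a service processing unit whose service success rate is lower than the first reference value is the fault object, the method further includes:
determining a first unit set of homogeneous service processing units whose service success rate is greater than the first reference value;
determining a second unit set of homogeneous service processing units whose service success rate is less than the first reference value; and
determining that the proportion of units in the first unit set among all the homogeneous service processing units is greater than a first preset ratio;
where a homogeneous service processing unit is a service processing unit that carries services with the same service logic as the service processing unit and to which the services are distributed discretely.
With reference to a second specific implementation of the second possible implementation of the first aspect, the determining that the fault object is a communication path between the service processing units specifically includes:
calculating the service success rate of a communication path from the number of service failures caused by communication path faults in the key performance indicator information;
comparing the service success rate with a third reference value; and
determining that a communication path whose service success rate is lower than the third reference value is the fault object.
With reference to a first specific implementation of the third possible implementation of the first aspect, the determining that the fault object is the monitored network element specifically includes:
calculating the service success rate of each service processing unit from the number of service requests received by each service processing unit and the number of failures of the corresponding services in the key performance indicator information of each service processing unit;
comparing the service success rate with a second reference value;
determining the number of service processing units whose service success rate is lower than the second reference value;
determining, according to that number, the proportion of service processing units whose service success rate is lower than the second reference value among all service processing units in the monitored network element; and
when the proportion is greater than a second preset ratio, determining that the monitored network element is the fault object.
With reference to a first more specific implementation of the first specific implementation of the third possible implementation of the first aspect, the comparing the service success rate with a second reference value specifically includes:
comparing the service success rate with a preset reference value;
or determining the average service success rate of the homogeneous network elements;
subtracting a preset value from the average service success rate to obtain a homogeneous reference value; and
comparing the service success rate with the homogeneous reference value;
where a homogeneous network element is a monitored network element that carries services with the same service logic as the monitored network element and to which the services are distributed discretely.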
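With reference to a third specific implementation of the second possible implementation of the first aspect, after the fault object is determined to be a service processing unit in the monitored network element, or after the fault object is determined to be a communication path between the service processing units, the sending the fault recovery policy to a management unit in a network function virtualization system specifically includes:
sending the fault recovery policy to the system management module in the monitored network element in the network function virtualization system.
With reference to a second specific implementation of the third possible implementation of the first aspect, after the fault object is determined to be the monitored network element, or after the fault object is determined to be a communication path between the monitored network element and another network element, the sending the fault recovery policy to a management unit in a network function virtualization system specifically includes:
sending the fault recovery policy to the management and orchestration (MANO) unit in the network function virtualization system.
With reference to a fourth specific implementation of the second possible implementation of the first aspect, after the fault object is determined to be a service processing unit in the monitored network element, the method further includes:
determining that the number of failed service processing units reaches a preset threshold; and
determining a network-level fault recovery policy, where the network-level fault recovery policy is used to perform a fault recovery operation on one or more network elements in the network where the monitored network element is located.
With reference to a third specific implementation of the third possible implementation of the first aspect, the determining a network-level fault recovery policy specifically includes:
obtaining state information of the redundant network elements related to the monitored network element determined to be the fault object;
determining, according to the state information, a redundant network element that is in a normal working state; and
generating network-level fault recovery indication information, where the fault recovery indication information instructs the management unit to replace the monitored network element determined to be the fault object with the redundant network element that is in a normal working state;
or the determining a network-level fault recovery policy specifically includes: obtaining state information of the redundant network elements of the back-end network element in the communication path determined to be the fault object;
determining, according to the state information, a redundant network element that is in a normal working state; and
generating network-level fault recovery indication information, where the fault recovery indication information instructs the management unit to switch the back-end network element corresponding to the front-end network element in the communication path to the redundant network element that is in a normal working state.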
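According to a first possible implementation of the second aspect of this application, a fault recovery device is provided, including:
an obtaining unit, configured to obtain key performance indicator information of each service processing unit in a monitored network element;
a determining unit, configured to determine a fault object according to the key performance indicator information,
and to determine a fault recovery policy according to the fault object; and
a sending unit, configured to send the fault recovery policy to a management unit in a network function virtualization system, so that the management unit performs fault recovery using the fault recovery policy.
With reference to a second possible implementation of the second aspect, the determining unit is specifically configured to:
determine that the fault object is a service processing unit in the monitored network element;
or determine that the fault object is a communication path between the service processing units; and
when the fault object is a service processing unit in the monitored network element or a communication path between the service processing units, determine a network-element-level fault recovery policy, where the network-element-level fault recovery policy is used to perform a fault recovery operation inside the monitored network element.
With reference to a third possible implementation of the second aspect, the determining unit is specifically configured to:
determine that the fault object is the monitored network element;
or determine that the fault object is a communication path between the monitored network element and another network element; and
when the fault object is the monitored network element or a communication path between the monitored network element and another network element, determine a network-level fault recovery policy, where the network-level fault recovery policy is used to perform a fault recovery operation on one or more network elements in the network where the monitored network element is located.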
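With reference to a first specific implementation of the second possible implementation of the second aspect, the determining unit is specifically configured to:
calculate the service success rate of the service executed by a service processing unit from the number of service requests received by the service processing unit and the number of failures of the corresponding services in the key performance indicator information;
compare the service success rate with a first reference value; and
determine that a service processing unit whose service success rate is lower than the first reference value is the fault object.
With reference to a first more specific implementation of the first specific implementation of the second possible implementation of the second aspect, the determining unit is specifically configured to:
compare the service success rate with a preset reference value;
or determine the average service success rate of the homogeneous service processing units;
subtract a preset value from the average service success rate to obtain a homogeneous reference value; and
compare the service success rate with the homogeneous reference value;
where a homogeneous service processing unit is a service processing unit that carries services with the same service logic as the service processing unit and to which the services are distributed discretely.
With reference to a second more specific implementation of the first specific implementation of the second possible implementation of the second aspect, the determining unit is further configured to:
before determining that a service processing unit whose service success rate is lower than the first reference value is the fault object, determine a first unit set of homogeneous service processing units whose service success rate is greater than the first reference value;
determine a second unit set of homogeneous service processing units whose service success rate is less than the first reference value; and
determine that the proportion of units in the first unit set among all the homogeneous service processing units is greater than a first preset ratio;
where a homogeneous service processing unit is a service processing unit that carries services with the same service logic as the service processing unit and to which the services are distributed discretely.
With reference to a second specific implementation of the second possible implementation of the second aspect, the determining unit is specifically configured to:
calculate the service success rate of a communication path from the number of service failures caused by communication path faults in the key performance indicator information;
compare the service success rate with a third reference value; and
determine that a communication path whose service success rate is lower than the third reference value is the fault object.
With reference to a first specific implementation of the third possible implementation of the second aspect, the determining unit is specifically configured to:
calculate the service success rate of each service processing unit from the number of service requests received by each service processing unit and the number of failures of the corresponding services in the key performance indicator information of each service processing unit;
compare the service success rate with a second reference value;
determine the number of service processing units whose service success rate is lower than the second reference value;
determine, according to that number, the proportion of service processing units whose service success rate is lower than the second reference value among all service processing units in the monitored network element; and
when the proportion is greater than a second preset ratio, determine that the monitored network element is the fault object.
With reference to a first more specific implementation of the first specific implementation of the third possible implementation of the second aspect, the determining unit is specifically configured to:
compare the service success rate with a preset reference value;
or determine the average service success rate of the homogeneous network elements;
subtract a preset value from the average service success rate to obtain a homogeneous reference value; and
compare the service success rate with the homogeneous reference value;
where a homogeneous network element is a monitored network element that carries services with the same service logic as the monitored network element and to which the services are distributed discretely.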
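With reference to a third specific implementation of the second possible implementation of the second aspect, the sending unit is specifically configured to:
after the fault object is determined to be a service processing unit in the monitored network element, or after the fault object is determined to be a communication path between the service processing units, send the fault recovery policy to the system management module in the monitored network element in the network function virtualization system.
With reference to a second specific implementation of the third possible implementation of the second aspect, the sending unit is specifically configured to:
after the fault object is determined to be the monitored network element, or after the fault object is determined to be a communication path between the monitored network element and another network element, send the fault recovery policy to the management and orchestration (MANO) unit in the network function virtualization system.
With reference to a fourth specific implementation of the second possible implementation of the second aspect, the determining unit is further configured to:
after determining that the fault object is a service processing unit in the monitored network element, determine that the number of failed service processing units reaches a preset threshold; and
determine a network-level fault recovery policy, where the network-level fault recovery policy is used to perform a fault recovery operation on one or more network elements in the network where the monitored network element is located.
With reference to a third specific implementation of the third possible implementation of the second aspect, the obtaining unit is further configured to:
obtain state information of the redundant network elements related to the monitored network element determined to be the fault object;
the determining unit is further configured to determine, according to the state information, a redundant network element that is in a normal working state,
and to generate network-level fault recovery indication information, where the fault recovery indication information instructs the management unit to replace the monitored network element determined to be the fault object with the redundant network element that is in a normal working state;
or the obtaining unit is further configured to obtain state information of the redundant network elements of the back-end network element in the communication path determined to be the fault object;
the determining unit is further configured to determine, according to the state information, a redundant network element that is in a normal working state,
and to generate network-level fault recovery indication information, where the fault recovery indication information instructs the management unit to switch the back-end network element corresponding to the front-end network element in the communication path to the redundant network element that is in a normal working state.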
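According to the specific embodiments provided in this application, this application discloses the following technical effects:
The fault recovery method or device disclosed in this application obtains key performance indicator information of each service processing unit in a monitored network element, determines a fault object according to the key performance indicator information, determines a fault recovery policy according to the fault object, and sends the fault recovery policy to a management unit in a network function virtualization system. Faults can thus be located through key performance indicator information, which solves the problem of low fault-location accuracy when faults are located from network element heartbeat messages.
In addition, because the fault recovery policy is determined according to the fault object and then sent to the management unit in the network function virtualization system, an appropriate fault recovery policy can be adopted, reducing both the risk introduced by the fault recovery process and its impact on services.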
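Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the accompanying drawings required by the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of this application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is an architecture diagram of the network function virtualization (NFV) system of this application;
FIG. 2 is a flowchart of Embodiment 1 of the fault recovery method of this application;
FIG. 3 is a flowchart of Embodiment 2 of the fault recovery method of this application;
FIG. 4 is a flowchart of Embodiment 3 of the fault recovery method of this application;
FIG. 5 is a structural diagram of an embodiment of the fault recovery device of this application;
FIG. 6 is a structural diagram of the computing node of this application.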
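Detailed Description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
To make the above purposes, features, and advantages of this application easier to understand, this application is described in further detail below with reference to the accompanying drawings and specific implementations.
FIG. 1 is an architecture diagram of the network function virtualization (NFV) system of this application. The fault recovery method of this application is mainly applied in an NFV system. As shown in FIG. 1, the NFV system mainly includes the following network elements:
Operations Support System (OSS)/Business Support System (BSS): initiates service requests to the NFV Orchestrator, together with the resources required by the service, and is responsible for fault handling.
Orchestrator: implements NFV services according to the service requests of the OSS/BSS; is responsible for lifecycle management of network services (Network Service, NS); orchestrates and manages resources; and monitors in real time the resources and running-state information of virtualized network functions (Virtualized Network Function, VNF) and the network function virtualization infrastructure (Network Function Virtualization Infrastructure, NFVI).
VNF Manager (VNFM): responsible for VNF lifecycle management, such as startup, lifetime, and VNF running-state information.
Virtualized Infrastructure Manager (VIM): responsible for managing and allocating NFVI resources, and for monitoring and collecting NFVI running-state information.
Element Management System (EMS): responsible for fault management, configuration management, accounting management, performance management, and security management of network elements (Fault Management, Configuration Management, Accounting Management, Performance Management, Security Management, FCAPS).
NFVI resources: include all NFVI resource states, that is, available, reserved, and allocated NFVI resources.
The fault recovery method of this application may be executed by a network element key performance indicator (Key Performance Indicator, KPI) monitoring and recovery decision module or by a network KPI monitoring and recovery decision module. The network element KPI monitoring and recovery decision module or the network KPI monitoring and recovery decision module may be deployed on a VNF, the EMS, the management and orchestration (Management and Orchestrator, MANO) unit, or an independent network node in the NFV system. The two modules may be physically deployed together or separately.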
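FIG. 2 is a flowchart of Embodiment 1 of the fault recovery method of this application. The method of this embodiment may be executed by the network element KPI monitoring and recovery decision module or the network KPI monitoring and recovery decision module. As shown in FIG. 2, the method may include:
Step 101: Obtain key performance indicator (Key Performance Indicator, KPI) information of each service processing unit in the monitored network element.
The monitored network element may be a network element in a network function virtualization (Network Function Virtualization, NFV) system, for example a VNF.
The monitored network element may contain one or more service processing units.
The key performance indicator information may include the number of service requests received by a service processing unit, the number of failures of the corresponding services, and/or the cause of each service failure. In practice, the kinds of information contained in the key performance indicator information can be set as required. For example, the key performance indicator information may also include service delay information.
The monitored network element may report the key performance indicator information periodically.
It should be noted that, before step 101 is performed, the network elements to be monitored may be determined according to information from the EMS and/or MANO. The information recorded by the EMS and/or MANO about the service processing units deployed in network elements and about the network elements deployed on the network can be obtained; the network elements corresponding to the recorded network element information are determined to be the monitored network elements, and the service processing units corresponding to the recorded service processing unit information are determined to be the service processing units to be monitored.
Step 102: Determine a fault object according to the key performance indicator information.
For example, the success rate with which a service processing unit executes services can be calculated from the key performance indicator information. When the success rate is lower than a certain value, the fault object may be determined to be that service processing unit. When many service processing units have a low success rate (for example, more than 80% of all service processing units in the monitored network element), the fault object may be determined to be a network element external to the monitored network element. As another example, when the key performance indicator information records a high number of service failures caused by communication timeouts from the monitored network element to the next-level network element, it may be determined that the communication path from the monitored network element to the next-level network element, or the next-level network element itself, has failed.
Step 103: Determine a fault recovery policy according to the fault object.
When the fault object is a service processing unit inside the monitored network element, a network-element-level fault recovery policy may be determined; the network-element-level fault recovery policy is used to perform a fault recovery operation inside the monitored network element.
When the fault object is a network element external to the monitored network element, a network-level fault recovery policy may be determined; the network-level fault recovery policy is used to perform a fault recovery operation on one or more network elements in the network where the monitored network element is located.
Step 104: Send the fault recovery policy to a management unit in the network function virtualization system, so that the management unit performs fault recovery using the fault recovery policy.
The management unit may be the system management module in the monitored network element in the network function virtualization system, or the management and orchestration (MANO) unit in the network function virtualization system.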
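Fault recovery using a network-element-level fault recovery policy may be performed in the following ways:
determining a standby unit of the failed service processing unit, and switching the service carried by the failed service processing unit to the standby unit;
or resetting the failed service processing unit.
When the standby unit is also faulty, both the failed service processing unit and the standby unit may be isolated.
Fault recovery using a network-level fault recovery policy may be performed in the following ways:
determining a standby network element of the failed network element;
and switching the service carried by the failed network element to the standby network element;
or determining a standby path of the failed path;
and switching the service carried by the failed path to the standby path.
When the standby path is also determined to be faulty, a standby network element of the network element at one end of the standby path may additionally be determined,
and the service carried by the network element at that end of the standby path is switched to the standby network element.
In summary, this embodiment obtains key performance indicator information of each service processing unit in the monitored network element, determines a fault object according to the key performance indicator information, determines a fault recovery policy according to the fault object, and sends the fault recovery policy to a management unit in the network function virtualization system. Faults can thus be located through key performance indicator information, which solves the problem of low fault-location accuracy when faults are located from network element heartbeat messages. In addition, because the fault recovery policy is determined according to the fault object and then sent to the management unit in the network function virtualization system, an appropriate fault recovery policy can be adopted, reducing both the risk introduced by the fault recovery process and its impact on services.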
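In practice, the determining a fault object may specifically include:
determining that the fault object is a service processing unit in the monitored network element;
or determining that the fault object is a communication path between the service processing units.
The determining a fault recovery policy according to the fault object may specifically include:
when the fault object is a service processing unit in the monitored network element or a communication path between the service processing units, determining a network-element-level fault recovery policy; the network-element-level fault recovery policy is used to perform a fault recovery operation inside the monitored network element.
In practice, the determining a fault object may also specifically include:
determining that the fault object is the monitored network element;
or determining that the fault object is a communication path between the monitored network element and another network element.
The determining a fault recovery policy according to the fault object may specifically include:
when the fault object is the monitored network element or a communication path between the monitored network element and another network element, determining a network-level fault recovery policy; the network-level fault recovery policy is used to perform a fault recovery operation on one or more network elements in the network where the monitored network element is located.
It should be noted that, based on the method in the embodiments of this application, a network-element-level fault may in practice first be handled with a network-element-level fault recovery policy; if that recovery fails, a network-level fault recovery policy may then be used.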
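The mapping from fault object to policy level, and the escalation from a network-element-level policy to a network-level policy, can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the enum names, `choose_policy_level`, `recover`, and the `apply_policy` callback (which stands in for sending the policy to a management unit and learning whether recovery succeeded) are all hypothetical names introduced here.

```python
from enum import Enum
from typing import Callable, List


class FaultObject(Enum):
    SERVICE_PROCESSING_UNIT = "service processing unit"
    PATH_BETWEEN_UNITS = "path between service processing units"
    MONITORED_ELEMENT = "monitored network element"
    PATH_TO_OTHER_ELEMENT = "path to another network element"


class PolicyLevel(Enum):
    NETWORK_ELEMENT = "network-element-level"
    NETWORK = "network-level"


def choose_policy_level(fault: FaultObject) -> PolicyLevel:
    """Faults inside the monitored element get an element-level policy; all others a network-level one."""
    if fault in (FaultObject.SERVICE_PROCESSING_UNIT, FaultObject.PATH_BETWEEN_UNITS):
        return PolicyLevel.NETWORK_ELEMENT
    return PolicyLevel.NETWORK


def recover(fault: FaultObject, apply_policy: Callable[[PolicyLevel], bool]) -> List[PolicyLevel]:
    """Try the policy level chosen for the fault; escalate once if an element-level attempt fails.

    `apply_policy` is a hypothetical callback that returns True when recovery succeeded.
    The returned list records which policy levels were attempted, in order.
    """
    level = choose_policy_level(fault)
    attempted = [level]
    succeeded = apply_policy(level)
    if not succeeded and level is PolicyLevel.NETWORK_ELEMENT:
        # Element-level recovery failed, so fall back to a network-level policy.
        attempted.append(PolicyLevel.NETWORK)
        apply_policy(PolicyLevel.NETWORK)
    return attempted
```

In this sketch a network-level policy is never downgraded: escalation only goes from the narrower scope to the wider one, which matches the order described above.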
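In practice, the following steps may be used to determine that the fault object is a service processing unit in the monitored network element:
calculating the service success rate of the service executed by a service processing unit from the number of service requests received by the service processing unit and the number of failures of the corresponding services in the key performance indicator information;
comparing the service success rate with a reference value; and
determining that a service processing unit whose service success rate is lower than the reference value is the fault object.
In the above steps, the number of service failures may be the number of failures caused by the service processing unit itself. Specifically, the key performance indicator information may record the cause of each service failure, so the failures caused by the service processing unit itself can be counted according to the recorded causes.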
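It should also be noted that, in the above steps, the reference value may be a preset value, or a homogeneous reference value derived from the average service success rate of the homogeneous service processing units. Therefore, the comparing the service success rate with a reference value may specifically include:
comparing the service success rate with a preset reference value;
or determining the average service success rate of the homogeneous service processing units;
subtracting a preset value from the average service success rate to obtain a homogeneous reference value; and
comparing the service success rate with the homogeneous reference value;
where a homogeneous service processing unit is a service processing unit that carries services with the same service logic as the service processing unit and to which the services are distributed discretely.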
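It should be noted that homogeneous service processing units sometimes exhibit the following behavior: for some reason, the service success rates of many homogeneous service processing units all fall below the preset reference value, yet the units below the preset reference value have not necessarily failed. The drop across most homogeneous service processing units may be caused by a fault in some other device. To avoid wrongly concluding in this situation that the homogeneous service processing units have failed, the following steps may be performed before a service processing unit whose service success rate is lower than the reference value is determined to be the fault object:
determining a first unit set of homogeneous service processing units whose service success rate is greater than the preset reference value;
determining a second unit set of homogeneous service processing units whose service success rate is less than the preset reference value; and
determining that the proportion of units in the first unit set among all the homogeneous service processing units is greater than a preset ratio.
In the above steps, the preset ratio can be set according to actual needs, for example to 90%. That is, when 90% or more of the homogeneous service processing units have a service success rate higher than the preset reference value and 10% or fewer have a service success rate lower than the preset reference value, the homogeneous service processing units whose service success rate is lower than the reference value can be determined to be the fault object.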
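The majority check described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `confirm_unit_faults`, the unit identifiers, and the default values are hypothetical, and the comparison uses "greater than or equal to" to match the worked example ("90% or more"), whereas the step itself says "greater than".

```python
from typing import Dict, List


def confirm_unit_faults(rates_by_unit: Dict[str, float],
                        reference: float = 95.0,
                        min_healthy_ratio: float = 0.9) -> List[str]:
    """Return the units treated as fault objects, or an empty list when too few units are healthy.

    rates_by_unit maps a (hypothetical) unit identifier to its service success rate in percent.
    """
    if not rates_by_unit:
        return []
    # First unit set: homogeneous units above the reference value.
    healthy = [unit for unit, rate in rates_by_unit.items() if rate > reference]
    # Second unit set: homogeneous units below the reference value.
    suspect = [unit for unit, rate in rates_by_unit.items() if rate < reference]
    healthy_ratio = len(healthy) / len(rates_by_unit)
    # Only blame individual units when most of their peers are healthy; otherwise the
    # drop is more likely caused by some other device.
    if healthy_ratio >= min_healthy_ratio:
        return sorted(suspect)
    return []
```

When the check fails, the sketch deliberately returns nothing rather than guessing; locating the external cause is left to the network-level analysis.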
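In practice, the determining that the fault object is a communication path between the service processing units may specifically include:
calculating the service success rate of a communication path from the number of service failures caused by communication path faults in the key performance indicator information;
comparing the service success rate with a reference value; and
determining that a communication path whose service success rate is lower than the reference value is the fault object.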
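In practice, the determining that the fault object is a network element in the network to which the monitored network element belongs may specifically include:
calculating the service success rate of the monitored network element from the number of service requests received by the service processing units and the number of failures of the corresponding services in the key performance indicator information;
comparing the service success rate with a reference value; and
determining that a monitored network element whose service success rate is lower than the reference value is the fault object.
It should be noted that a network element may contain multiple service processing units. Therefore, the key performance indicator information of each service processing unit in a network element can be obtained; the number of service requests received by the network element and the number of failures of the corresponding services are then totalled from the per-unit request and failure counts contained in that information, and the service success rate of the monitored network element is calculated from the totals.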
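The aggregation described above can be sketched as follows. The function name `element_success_rate` and the `(requests, failures)` tuple layout are assumptions made for illustration; the treatment of an element with no traffic is also an assumption, since the source does not cover that case.

```python
from typing import Iterable, Tuple


def element_success_rate(unit_counts: Iterable[Tuple[int, int]]) -> float:
    """Success rate (percent) of a network element, from per-unit (requests, failures) pairs."""
    total_requests = 0
    total_failures = 0
    for requests, failures in unit_counts:
        total_requests += requests
        total_failures += failures
    if total_requests == 0:
        # Assumption: with no requests there is no evidence of failure.
        return 100.0
    return 100.0 * (total_requests - total_failures) / total_requests
```

Summing the counts before dividing weights each unit by its traffic, which differs from averaging the per-unit success rates when the units carry unequal load.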
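In practice, the comparing the service success rate with a reference value may specifically include:
comparing the service success rate with a preset reference value;
or determining the average service success rate of the homogeneous network elements;
subtracting a preset value from the average service success rate to obtain a homogeneous reference value; and
comparing the service success rate with the homogeneous reference value;
where a homogeneous network element is a monitored network element that carries services with the same service logic as the monitored network element and to which the services are distributed discretely.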
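FIG. 3 is a flowchart of Embodiment 2 of the fault recovery method of this application. The method of this embodiment may be executed by the network element KPI monitoring and recovery decision module. As shown in FIG. 3, the method may include:
Step 201: Obtain key performance indicator information of each service processing unit in the monitored network element.
In this embodiment, a service processing unit may be a thread, a process, a virtual machine (Virtual Machine, VM), or the like. The key performance indicator information may include at least the number of service requests received by the service processing unit and the number of failures of the corresponding services.
Step 202: Calculate the service success rate of the service executed by the service processing unit from the number of service requests received by the service processing unit and the number of failures of the corresponding services in the key performance indicator information.
The service success rate may be obtained by subtracting the number of failures from the number of service requests, dividing the result by the number of service requests, and multiplying by 100%.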
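The calculation in step 202 can be written out as a short sketch. The function name `service_success_rate`, the input validation, and the value returned when no requests were received are assumptions added for illustration; the source specifies only the formula (requests minus failures, divided by requests, times 100%).

```python
def service_success_rate(requests: int, failures: int) -> float:
    """Service success rate in percent: (requests - failures) / requests * 100."""
    if requests < 0 or failures < 0 or failures > requests:
        raise ValueError("counts must satisfy 0 <= failures <= requests")
    if requests == 0:
        # Assumption: with no requests there is no evidence of failure.
        return 100.0
    return 100.0 * (requests - failures) / requests
```

For instance, 200 requests with 10 failures give a success rate of 95%, which sits exactly on the example reference value used in step 203.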
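Step 203: Compare the service success rate with a reference value.
The reference value can be set according to actual needs. For example, when the service success rate of a normal service processing unit is above 95%, the reference value may be set to 95%.
Alternatively, the reference value may be calculated from the average service success rate of the homogeneous service processing units. A homogeneous service processing unit is a service processing unit that carries the same service logic and has the same external service networking as the service processing unit whose service success rate is being evaluated. The service request messages received by (distributed to) multiple homogeneous service processing units are randomly distributed, so their service success rates should be broadly similar. A homogeneous reference value can therefore be calculated from the average service success rate of the homogeneous service processing units.
Specifically, a preset value may be subtracted from the average service success rate to obtain the homogeneous reference value. The preset value can be set according to actual needs, for example 20% or 10%.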
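The homogeneous reference value can be illustrated with a small sketch. The names `homogeneous_reference` and `units_below_reference`, the unit identifiers, and the default margin of 10 percentage points are assumptions for illustration; the source states only that a preset value is subtracted from the average success rate.

```python
from statistics import mean
from typing import Dict, List


def homogeneous_reference(success_rates: List[float], margin: float = 10.0) -> float:
    """Average success rate of the homogeneous units minus a preset margin (percentage points)."""
    if not success_rates:
        raise ValueError("at least one homogeneous unit is required")
    return mean(success_rates) - margin


def units_below_reference(rates_by_unit: Dict[str, float], margin: float = 10.0) -> List[str]:
    """Identifiers of the units whose success rate is below the homogeneous reference value."""
    reference = homogeneous_reference(list(rates_by_unit.values()), margin)
    return sorted(unit for unit, rate in rates_by_unit.items() if rate < reference)
```

With rates of 98%, 97%, 99%, and 60%, the average is 88.5% and the reference value is 78.5%, so only the unit at 60% falls below it. A relative reference of this kind adapts when all peers degrade together, which a fixed threshold does not.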
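Step 204: Determine that a service processing unit whose service success rate is lower than the reference value is the fault object.
Step 205: When the fault object is a service processing unit, determine a network-element-level fault recovery policy.
Step 206: Send the fault recovery policy to the system management module in the monitored network element in the network function virtualization system.
The network-element-level fault recovery policy in step 205 may instruct the system management module to reset the failed service processing unit. After receiving the network-element-level fault recovery policy, the system management module may reset the failed service processing unit.
It should be noted that if the service processing unit still fails after being reset, it may be isolated. Further, when the number of isolated service processing units reaches a second preset threshold, a network-level fault recovery policy may be executed; the network-level fault recovery policy is used to perform a fault recovery operation on one or more network elements in the network where the monitored network element is located. For example, the faulty next-hop network element or communication path of the monitored network element may be switched over. The target network element or communication path of the switchover can be selected according to the health status of each network element or communication path in the disaster recovery group.
It should also be noted that when the failed service processing unit is of the active/standby type, the fault recovery policy may be: determining a standby unit of the failed service processing unit, and switching the service carried by the failed service processing unit to the standby unit. Further, when the standby unit is also determined to be faulty, both the failed service processing unit and the standby unit may be isolated.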
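The reset, isolate, and escalate sequence for a failed service processing unit can be sketched as follows. The function name `next_actions`, the action labels, and the way the isolation count is passed in are hypothetical; the sketch covers only the reset path, not the active/standby switchover.

```python
from typing import List


def next_actions(still_faulty_after_reset: bool,
                 isolated_units: int,
                 second_threshold: int) -> List[str]:
    """Actions taken for one failed unit, in order.

    isolated_units is the number of units already isolated before this one is handled.
    """
    actions = ["reset"]
    if still_faulty_after_reset:
        actions.append("isolate")
        isolated_units += 1
        # Too many isolated units suggests the problem is wider than one unit.
        if isolated_units >= second_threshold:
            actions.append("network_level_recovery")
    return actions
```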
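FIG. 4 is a flowchart of Embodiment 3 of the fault recovery method of this application. The method of this embodiment may be executed by the network KPI monitoring and recovery decision module. As shown in FIG. 4, the method may include:
Step 301: Obtain key performance indicator information of each service processing unit in the monitored network element.
Step 302: Calculate the service success rate of the service executed by each service processing unit from the number of service requests received by the service processing unit and the number of failures of the corresponding services in the key performance indicator information.
Step 303: Compare the service success rate with a reference value.
Step 304: Determine the number of service processing units whose service success rate is lower than the reference value.
Step 305: Determine, according to that number, the proportion of service processing units whose service success rate is lower than the reference value among all service processing units in the monitored network element.
For example, if 8 service processing units have a service success rate lower than the reference value and the monitored network element contains 10 service processing units in total, the ratio is 80%.
Step 306: When the ratio is greater than a preset ratio, determine that the fault object is the monitored network element.
The preset ratio can be set according to actual needs, for example to 50% or 80%.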
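Steps 303 to 306 can be condensed into a short sketch. The function name `monitored_element_is_faulty` and the default values are assumptions for illustration; the strict "greater than" comparison follows step 306.

```python
from typing import Sequence


def monitored_element_is_faulty(unit_rates: Sequence[float],
                                reference: float = 95.0,
                                preset_ratio: float = 0.5) -> bool:
    """True when the share of units below the reference value exceeds the preset ratio."""
    if not unit_rates:
        return False
    below = sum(1 for rate in unit_rates if rate < reference)
    return below / len(unit_rates) > preset_ratio
```

With 8 of 10 units below the reference value, the ratio is 80%: the element is treated as the fault object when the preset ratio is 50%, but not when the preset ratio is 80%, because the comparison is strict.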
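Step 307: When the fault object is a network element in the network to which the monitored network element belongs, determine a network-level fault recovery policy.
When the fault is located in a network element in the network to which the monitored network element belongs, a network-level fault recovery policy is required to repair the faulty network element.
In practice, the network-level fault recovery policy may be determined in multiple ways. For example, the following steps can be taken:
obtaining state information of the redundant network elements related to the monitored network element determined to be the fault object;
determining, according to the state information, a redundant network element that is in a normal working state; and
generating network-level fault recovery indication information, where the fault recovery indication information instructs the management unit to replace the monitored network element determined to be the fault object with the redundant network element that is in a normal working state.
These steps ensure that the redundant network element used to replace the failed monitored network element can work normally. If all redundant network elements of the monitored network element are abnormal, the preset redundant network elements need not be used; instead, another network element that can work normally may be found to replace the failed monitored network element.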
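Selecting a replacement from the redundant network elements can be sketched as follows. The function name `pick_replacement`, the state strings, and the `other_candidates` parameter (a hypothetical list of other network elements already known to work normally) are assumptions for illustration; how such candidates are found is not specified in the source.

```python
from typing import Dict, Iterable, Optional


def pick_replacement(redundant_states: Dict[str, str],
                     other_candidates: Iterable[str] = ()) -> Optional[str]:
    """Return a redundant element in the normal state, else the first other working candidate."""
    for name, state in redundant_states.items():
        if state == "normal":
            return name
    # All preset redundant elements are abnormal: fall back to another working element, if any.
    return next(iter(other_candidates), None)
```

Checking the state before switching avoids moving services onto a redundant element that is itself abnormal.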
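As another example, the following steps can be taken:
obtaining state information of the redundant network elements of the back-end network element in the communication path determined to be the fault object;
determining, according to the state information, a redundant network element that is in a normal working state; and
generating network-level fault recovery indication information, where the fault recovery indication information instructs the management unit to switch the back-end network element corresponding to the front-end network element in the communication path to the redundant network element that is in a normal working state.
These steps ensure that the redundant network element can work normally after the switchover. If all redundant network elements of the back-end network element in the communication path are abnormal, the preset redundant network elements need not be used; instead, another network element that can work normally may be found for the switchover.
Step 308: Send the fault recovery policy to the management and orchestration (MANO) unit in the network function virtualization system.
The network-level fault recovery policy may instruct the MANO unit to determine a standby network element of the failed network element and to switch the service carried by the failed network element to the standby network element.
After receiving the network-level fault recovery policy, the MANO may determine the standby network element of the failed network element. It may then send indication signaling to the VNFM, instructing the VNFM to switch the service carried by the failed network element to the standby network element. After receiving the indication signaling, the VNFM may switch the service carried by the failed network element to the standby network element.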
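It should also be noted that, in the embodiments of this application, the key performance indicator information may further include service failure cause information and the number of service failures attributed to each cause. The service failure causes may include: communication timeout to a downstream network element, insufficient resources, communication timeout between internal modules of the monitored network element, and internal software errors (for example, illegal internal data or code entering an abnormal branch). Therefore, determining the fault object according to the key performance indicator information in this application may further include:
determining the fault object according to the service failure cause information contained in the key performance indicator information.
The proportion of failed services caused by service processing timeout can be determined from the number of service failures caused by service processing timeout recorded in the key performance indicator information and the number of service requests sent by the monitored network element to the downstream network element.
When this proportion is greater than or equal to a preset threshold, the fault can be located at the monitored network element. The network elements in the network to which the monitored network element belongs may include the network elements external to the monitored network element as well as the monitored network element itself. Accordingly, a network-level fault recovery policy can also be adopted in this case.
In addition, for the homogeneous service processing units mentioned above, service failures caused by insufficient resources may be excluded when service failures are counted, so that they are not included in the failure total. Such failures are mainly caused by excessive service volume, and the service processing unit itself usually has not failed.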
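The two cause-based calculations above can be sketched as follows. The cause labels, the function names, and the example threshold are hypothetical; the source names the failure causes but does not define how they are encoded in the key performance indicator information.

```python
from typing import Dict


def failures_excluding_resources(failures_by_cause: Dict[str, int]) -> int:
    """Failure total for a unit, leaving out failures caused by insufficient resources."""
    return sum(count for cause, count in failures_by_cause.items()
               if cause != "insufficient_resources")


def timeout_failure_ratio(timeout_failures: int, downstream_requests: int) -> float:
    """Share of requests sent to the downstream element that failed due to timeout."""
    if downstream_requests == 0:
        # Assumption: no downstream traffic means no timeout evidence.
        return 0.0
    return timeout_failures / downstream_requests


def timeout_threshold_reached(timeout_failures: int, downstream_requests: int,
                              threshold: float = 0.2) -> bool:
    """True when the timeout failure ratio is greater than or equal to the preset threshold."""
    return timeout_failure_ratio(timeout_failures, downstream_requests) >= threshold
```

Excluding resource-related failures keeps an overloaded but otherwise healthy unit from being reset or isolated.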
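This application also provides a fault recovery device.
FIG. 5 is a structural diagram of an embodiment of the fault recovery device of this application. As shown in FIG. 5, the device may include:
an obtaining unit 501, configured to obtain key performance indicator information of each service processing unit in the monitored network element;
a determining unit 502, configured to determine a fault object according to the key performance indicator information,
and to determine a fault recovery policy according to the fault object; and
a sending unit 503, configured to send the fault recovery policy to a management unit in the network function virtualization system, so that the management unit performs fault recovery using the fault recovery policy.
This embodiment obtains key performance indicator information of each service processing unit in the monitored network element, determines a fault object according to the key performance indicator information, determines a fault recovery policy according to the fault object, and sends the fault recovery policy to a management unit in the network function virtualization system. Faults can thus be located through key performance indicator information, which solves the problem of low fault-location accuracy when faults are located from network element heartbeat messages. In addition, because the fault recovery policy is determined according to the fault object and then sent to the management unit in the network function virtualization system, an appropriate fault recovery policy can be adopted, reducing both the risk introduced by the fault recovery process and its impact on services.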
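In practice, the determining unit 502 may be specifically configured to:
determine that the fault object is a service processing unit in the monitored network element;
or determine that the fault object is a communication path between the service processing units; and
when the fault object is a service processing unit in the monitored network element or a communication path between the service processing units, determine a network-element-level fault recovery policy, where the network-element-level fault recovery policy is used to perform a fault recovery operation inside the monitored network element.
In practice, the determining unit 502 may be specifically configured to:
determine that the fault object is the monitored network element;
or determine that the fault object is a communication path between the monitored network element and another network element; and
when the fault object is the monitored network element or a communication path between the monitored network element and another network element, determine a network-level fault recovery policy, where the network-level fault recovery policy is used to perform a fault recovery operation on one or more network elements in the network where the monitored network element is located.
In practice, the determining unit 502 may be specifically configured to:
calculate the service success rate of the service executed by a service processing unit from the number of service requests received by the service processing unit and the number of failures of the corresponding services in the key performance indicator information;
compare the service success rate with a first reference value; and
determine that a service processing unit whose service success rate is lower than the first reference value is the fault object.
In practice, the determining unit 502 may be specifically configured to:
compare the service success rate with a preset reference value;
or determine the average service success rate of the homogeneous service processing units;
subtract a preset value from the average service success rate to obtain a homogeneous reference value; and
compare the service success rate with the homogeneous reference value;
where a homogeneous service processing unit is a service processing unit that carries services with the same service logic as the service processing unit and to which the services are distributed discretely.
In practice, the determining unit 502 may be further configured to:
before determining that a service processing unit whose service success rate is lower than the first reference value is the fault object, determine a first unit set of homogeneous service processing units whose service success rate is greater than the first reference value;
determine a second unit set of homogeneous service processing units whose service success rate is less than the first reference value; and
determine that the proportion of units in the first unit set among all the homogeneous service processing units is greater than a first preset ratio;
where a homogeneous service processing unit is a service processing unit that carries services with the same service logic as the service processing unit and to which the services are distributed discretely.
In practice, the determining unit 502 may be specifically configured to:
calculate the service success rate of a communication path from the number of service failures caused by communication path faults in the key performance indicator information;
compare the service success rate with a third reference value; and
determine that a communication path whose service success rate is lower than the third reference value is the fault object.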
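In practice, the determining unit 502 may be specifically configured to:
calculate the service success rate of each service processing unit from the number of service requests received by each service processing unit and the number of failures of the corresponding services in the key performance indicator information of each service processing unit;
compare the service success rate with a second reference value;
determine the number of service processing units whose service success rate is lower than the second reference value;
determine, according to that number, the proportion of service processing units whose service success rate is lower than the second reference value among all service processing units in the monitored network element; and
when the proportion is greater than a second preset ratio, determine that the monitored network element is the fault object.
In practice, the determining unit 502 may be specifically configured to:
compare the service success rate with a preset reference value;
or determine the average service success rate of the homogeneous network elements;
subtract a preset value from the average service success rate to obtain a homogeneous reference value; and
compare the service success rate with the homogeneous reference value;
where a homogeneous network element is a monitored network element that carries services with the same service logic as the monitored network element and to which the services are distributed discretely.
In practice, the sending unit 503 may be specifically configured to:
after the fault object is determined to be a service processing unit in the monitored network element, or after the fault object is determined to be a communication path between the service processing units, send the fault recovery policy to the system management module in the monitored network element in the network function virtualization system.
In practice, the sending unit 503 may be specifically configured to:
after the fault object is determined to be the monitored network element, or after the fault object is determined to be a communication path between the monitored network element and another network element, send the fault recovery policy to the management and orchestration (MANO) unit in the network function virtualization system.
In practice, the determining unit 502 may be further configured to:
after determining that the fault object is a service processing unit in the monitored network element, determine that the number of failed service processing units reaches a preset threshold; and
determine a network-level fault recovery policy, where the network-level fault recovery policy is used to perform a fault recovery operation on one or more network elements in the network where the monitored network element is located.
In practice, the obtaining unit 501 may be further configured to:
obtain state information of the redundant network elements related to the monitored network element determined to be the fault object;
the determining unit 502 may be further configured to determine, according to the state information, a redundant network element that is in a normal working state,
and to generate network-level fault recovery indication information, where the fault recovery indication information instructs the management unit to replace the monitored network element determined to be the fault object with the redundant network element that is in a normal working state;
or the obtaining unit 501 may be further configured to obtain state information of the redundant network elements of the back-end network element in the communication path determined to be the fault object;
the determining unit 502 may be further configured to determine, according to the state information, a redundant network element that is in a normal working state,
and to generate network-level fault recovery indication information, where the fault recovery indication information instructs the management unit to switch the back-end network element corresponding to the front-end network element in the communication path to the redundant network element that is in a normal working state.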
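In addition, an embodiment of this application further provides a computing node, which may be a host server with computing capability, a personal computer (PC), a portable computer or terminal, or the like. The specific embodiments of this application do not limit the specific implementation of the computing node.
FIG. 6 is a structural diagram of the computing node of this application. As shown in FIG. 6, the computing node 600 includes:
a processor 610, a communications interface 620, a memory 630, and a bus 640.
The processor 610, the communications interface 620, and the memory 630 communicate with each other through the bus 640.
The processor 610 is configured to execute a program 632.
Specifically, the program 632 may include program code, and the program code includes computer operation instructions.
The processor 610 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of this application.
The memory 630 is configured to store the program 632. The memory 630 may include high-speed RAM, and may also include non-volatile memory, for example at least one disk memory. The program 632 may specifically include the corresponding modules or units in the embodiment shown in FIG. 5, which are not described again here.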
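Finally, it should also be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
From the description of the above implementations, a person skilled in the art can clearly understand that this application may be implemented by software plus the necessary hardware platform, or entirely by hardware, although in many cases the former is the better implementation. Based on this understanding, all or part of the contribution that the technical solutions of this application make over the background art may be embodied in the form of a software product. The computer software product may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of this application or in certain parts of the embodiments.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments can be cross-referenced. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for related details, refer to the description of the method.
Specific examples are used in this document to explain the principles and implementations of this application. The description of the above embodiments is intended only to help in understanding the method of this application and its core idea. A person of ordinary skill in the art may, based on the idea of this application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting this application.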

Claims (26)

  1. 一种故障恢复方法,其特征在于,包括:
    获取被监测网元中的各个业务处理单元的关键绩效指标信息;
    根据所述关键绩效指标信息,确定故障对象;
    根据所述故障对象,确定故障恢复策略;
    将所述故障恢复策略发送至网络功能虚拟化系统中的管理单元,以便所述管理单元采用所述故障恢复策略进行故障恢复。
  2. 根据权利要求1所述的方法,其特征在于,所述确定故障对象,具体包括:
    确定故障对象为所述被监测网元中的业务处理单元;
    或者确定故障对象为所述业务处理单元之间的通信路径;
    所述根据所述故障对象,确定故障恢复策略,具体包括:
    当所述故障对象为所述被监测网元中的业务处理单元或者所述业务处理单元之间的通信路径时,确定网元级的故障恢复策略;所述网元级的故障恢复策略用于在所述被监测网元内部执行故障恢复操作。
  3. 根据权利要求1所述的方法,其特征在于,所述确定故障对象,具体包括:
    确定故障对象为所述被监测网元;
    或者,确定故障对象为所述被监测网元与另外的网元之间的通信路径;
    所述根据所述故障对象,确定故障恢复策略,具体包括:
    当所述故障对象为所述被监测网元或者所述被监测网元与另外的网元之间的通信路径时,确定网络级的故障恢复策略;所述网络级的故障恢复策略用于对所述被监测网元所处网络中的一个或多个网元执行故障恢复操作。
  4. 根据权利要求2所述的方法,其特征在于,所述确定故障对象为所述被监测网元中的业务处理单元,具体包括:
    根据所述关键绩效指标信息中的业务处理单元接收到的业务请求数以及所述业务请求数对应的业务的失败数,计算业务处理单元执行的业务的业务成功率;
    将所述业务成功率与第一参考值进行比较;
    确定所述业务成功率低于第一参考值的业务处理单元为所述故障对象。
  5. 根据权利要求4所述的方法,其特征在于,所述将所述业务成功率与第一参考值进行比较,具体包括:
    将所述业务成功率与预设参考值进行比较;
    或者,确定同质化业务处理单元的平均业务成功率;
    将所述平均业务成功率减去预设数值得到同质化参考值;
    将所述业务成功率与所述同质化参考值进行比较;
    其中,所述同质化业务处理单元为与所述业务处理单元所承载的业务的业务逻辑相同,且所述业务被离散分配的业务处理单元。
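  6. The method according to claim 4, characterized in that, before the determining that a service processing unit whose service success rate is lower than the first reference value is the faulty object, the method further comprises:
    determining a first unit set consisting of the homogenized service processing units whose service success rate is greater than the first reference value;
    determining a second unit set consisting of the homogenized service processing units whose service success rate is less than the first reference value;
    determining that the proportion of the units in the first unit set among all the homogenized service processing units is greater than a first preset proportion;
    wherein the homogenized service processing units are service processing units that carry services with the same service logic as the services carried by the service processing unit, and to which the services are discretely allocated.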
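  7. The method according to claim 2, characterized in that the determining that the faulty object is a communication path between the service processing units specifically comprises:
    calculating a service success rate of the communication path according to the number of service failures caused by communication path faults in the key performance indicator information;
    comparing the service success rate with a third reference value;
    determining that a communication path whose service success rate is lower than the third reference value is the faulty object.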
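  8. The method according to claim 3, characterized in that the determining that the faulty object is the monitored network element specifically comprises:
    collecting statistics on the service success rate of each service processing unit, according to the number of service requests received by each service processing unit and the number of failures of the services corresponding to those service requests, both taken from the key performance indicator information of each service processing unit;
    comparing the service success rate with a second reference value;
    determining the number of service processing units whose service success rate is lower than the second reference value;
    determining, according to the number, the proportion of the service processing units whose service success rate is lower than the second reference value among all service processing units in the monitored network element;
    when the proportion is greater than a second preset proportion, determining that the monitored network element is the faulty object.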
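  9. The method according to claim 8, characterized in that the comparing the service success rate with a second reference value specifically comprises:
    comparing the service success rate with a preset reference value;
    or, determining an average service success rate of homogenized network elements;
    subtracting a preset value from the average service success rate to obtain a homogenized reference value;
    comparing the service success rate with the homogenized reference value;
    wherein the homogenized network elements are monitored network elements that carry services with the same service logic as the monitored network element, and to which the services are discretely allocated.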
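  10. The method according to claim 2, characterized in that, after the determining that the faulty object is a service processing unit in the monitored network element, or after the determining that the faulty object is a communication path between the service processing units, the sending the fault recovery policy to a management unit in a network functions virtualization system specifically comprises:
    sending the fault recovery policy to a system management module in the monitored network element in the network functions virtualization system.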
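  11. The method according to claim 3, characterized in that, after the determining that the faulty object is the monitored network element, or after the determining that the faulty object is a communication path between the monitored network element and another network element, the sending the fault recovery policy to a management unit in a network functions virtualization system specifically comprises:
    sending the fault recovery policy to a management and orchestration (MANO) unit in the network functions virtualization system.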
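  12. The method according to claim 2, characterized in that, after the determining that the faulty object is a service processing unit in the monitored network element, the method further comprises:
    determining that the number of faulty service processing units reaches a preset threshold;
    determining a network-level fault recovery policy, wherein the network-level fault recovery policy is used to perform a fault recovery operation on one or more network elements in the network in which the monitored network element is located.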
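  13. The method according to claim 3, characterized in that the determining a network-level fault recovery policy specifically comprises:
    obtaining status information of redundant network elements related to the monitored network element that is determined as the faulty object;
    determining, according to the status information, a redundant network element that is in a normal working state;
    generating network-level fault recovery indication information, wherein the fault recovery indication information is used to instruct the management unit to replace the monitored network element that is determined as the faulty object with the redundant network element that is in the normal working state;
    or, the determining a network-level fault recovery policy specifically comprises: obtaining status information of redundant network elements of a back-end network element in the communication path that is determined as the faulty object;
    determining, according to the status information, a redundant network element that is in a normal working state;
    generating network-level fault recovery indication information, wherein the fault recovery indication information is used to instruct the management unit to switch the back-end network element corresponding to a front-end network element in the communication path to the redundant network element that is in the normal working state.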
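  14. A fault recovery apparatus, characterized in that the apparatus comprises:
    an obtaining unit, configured to obtain key performance indicator information of each service processing unit in a monitored network element;
    a determining unit, configured to determine a faulty object according to the key performance indicator information,
    and to determine a fault recovery policy according to the faulty object;
    a sending unit, configured to send the fault recovery policy to a management unit in a network functions virtualization system, so that the management unit performs fault recovery by using the fault recovery policy.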
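  15. The apparatus according to claim 14, characterized in that the determining unit is specifically configured to:
    determine that the faulty object is a service processing unit in the monitored network element;
    or determine that the faulty object is a communication path between the service processing units;
    when the faulty object is a service processing unit in the monitored network element or a communication path between the service processing units, determine a network-element-level fault recovery policy, wherein the network-element-level fault recovery policy is used to perform a fault recovery operation inside the monitored network element.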
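  16. The apparatus according to claim 14, characterized in that the determining unit is specifically configured to:
    determine that the faulty object is the monitored network element;
    or, determine that the faulty object is a communication path between the monitored network element and another network element;
    when the faulty object is the monitored network element or a communication path between the monitored network element and another network element, determine a network-level fault recovery policy, wherein the network-level fault recovery policy is used to perform a fault recovery operation on one or more network elements in the network in which the monitored network element is located.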
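  17. The apparatus according to claim 15, characterized in that the determining unit is specifically configured to:
    calculate a service success rate of the services performed by the service processing unit, according to the number of service requests received by the service processing unit and the number of failures of the services corresponding to those service requests, both taken from the key performance indicator information;
    compare the service success rate with a first reference value;
    determine that a service processing unit whose service success rate is lower than the first reference value is the faulty object.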
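  18. The apparatus according to claim 17, characterized in that the determining unit is specifically configured to:
    compare the service success rate with a preset reference value;
    or, determine an average service success rate of homogenized service processing units;
    subtract a preset value from the average service success rate to obtain a homogenized reference value;
    compare the service success rate with the homogenized reference value;
    wherein the homogenized service processing units are service processing units that carry services with the same service logic as the services carried by the service processing unit, and to which the services are discretely allocated.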
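  19. The apparatus according to claim 17, characterized in that the determining unit is further configured to:
    before determining that a service processing unit whose service success rate is lower than the first reference value is the faulty object, determine a first unit set consisting of the homogenized service processing units whose service success rate is greater than the first reference value;
    determine a second unit set consisting of the homogenized service processing units whose service success rate is less than the first reference value;
    determine that the proportion of the units in the first unit set among all the homogenized service processing units is greater than a first preset proportion;
    wherein the homogenized service processing units are service processing units that carry services with the same service logic as the services carried by the service processing unit, and to which the services are discretely allocated.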
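  20. The apparatus according to claim 15, characterized in that the determining unit is specifically configured to:
    calculate a service success rate of the communication path according to the number of service failures caused by communication path faults in the key performance indicator information;
    compare the service success rate with a third reference value;
    determine that a communication path whose service success rate is lower than the third reference value is the faulty object.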
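  21. The apparatus according to claim 16, characterized in that the determining unit is specifically configured to:
    collect statistics on the service success rate of each service processing unit, according to the number of service requests received by each service processing unit and the number of failures of the services corresponding to those service requests, both taken from the key performance indicator information of each service processing unit;
    compare the service success rate with a second reference value;
    determine the number of service processing units whose service success rate is lower than the second reference value;
    determine, according to the number, the proportion of the service processing units whose service success rate is lower than the second reference value among all service processing units in the monitored network element;
    when the proportion is greater than a second preset proportion, determine that the monitored network element is the faulty object.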
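  22. The apparatus according to claim 21, characterized in that the determining unit is specifically configured to:
    compare the service success rate with a preset reference value;
    or, determine an average service success rate of homogenized network elements;
    subtract a preset value from the average service success rate to obtain a homogenized reference value;
    compare the service success rate with the homogenized reference value;
    wherein the homogenized network elements are monitored network elements that carry services with the same service logic as the monitored network element, and to which the services are discretely allocated.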
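  23. The apparatus according to claim 15, characterized in that the sending unit is specifically configured to:
    after it is determined that the faulty object is a service processing unit in the monitored network element, or after it is determined that the faulty object is a communication path between the service processing units, send the fault recovery policy to a system management module in the monitored network element in the network functions virtualization system.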
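  24. The apparatus according to claim 16, characterized in that the sending unit is specifically configured to:
    after it is determined that the faulty object is the monitored network element, or after it is determined that the faulty object is a communication path between the monitored network element and another network element, send the fault recovery policy to a management and orchestration (MANO) unit in the network functions virtualization system.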
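  25. The apparatus according to claim 15, characterized in that the determining unit is further configured to:
    after determining that the faulty object is a service processing unit in the monitored network element, determine that the number of faulty service processing units reaches a preset threshold;
    determine a network-level fault recovery policy, wherein the network-level fault recovery policy is used to perform a fault recovery operation on one or more network elements in the network in which the monitored network element is located.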
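  26. The apparatus according to claim 16, characterized in that the obtaining unit is further configured to:
    obtain status information of redundant network elements related to the monitored network element that is determined as the faulty object;
    the determining unit is further configured to determine, according to the status information, a redundant network element that is in a normal working state;
    and generate network-level fault recovery indication information, wherein the fault recovery indication information is used to instruct the management unit to replace the monitored network element that is determined as the faulty object with the redundant network element that is in the normal working state;
    or, the obtaining unit is further configured to obtain status information of redundant network elements of a back-end network element in the communication path that is determined as the faulty object;
    the determining unit is further configured to determine, according to the status information, a redundant network element that is in a normal working state;
    and generate network-level fault recovery indication information, wherein the fault recovery indication information is used to instruct the management unit to switch the back-end network element corresponding to a front-end network element in the communication path to the redundant network element that is in the normal working state.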
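
The detection steps recited in claims 4, 5, and 8, together with the choice of policy level in claims 2 and 3, can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the `UnitKpi` record, the function names, the decision to treat a unit with no requests as healthy, the inclusion of every passed-in unit in the homogenized average, and all numeric thresholds are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class UnitKpi:
    """KPI counters of one service processing unit (hypothetical field names)."""
    name: str
    requests: int  # service requests received by the unit
    failures: int  # failed services among those requests


def success_rate(kpi: UnitKpi) -> float:
    """Service success rate; a unit without traffic counts as healthy (assumption)."""
    if kpi.requests == 0:
        return 1.0
    return (kpi.requests - kpi.failures) / kpi.requests


def homogenized_reference(peers: List[UnitKpi], margin: float) -> float:
    """Claim 5 alternative: average success rate of the peers minus a preset value."""
    return mean(success_rate(p) for p in peers) - margin


def low_success_units(units: List[UnitKpi], reference: float) -> List[str]:
    """Claim 4: names of the units whose success rate is below the reference."""
    return [u.name for u in units if success_rate(u) < reference]


def recovery_level(units: List[UnitKpi], first_ref: float,
                   second_ref: float, second_ratio: float) -> str:
    """Pick the policy level from the detected faulty object (claims 2, 3, 8)."""
    if not units:
        return "none"
    # Claim 8: the network element itself is the faulty object when the share
    # of low-success units exceeds the second preset proportion.
    share = len(low_success_units(units, second_ref)) / len(units)
    if share > second_ratio:
        return "network-level"
    # Claims 2 and 4: individual faulty units are recovered inside the element.
    if low_success_units(units, first_ref):
        return "network-element-level"
    return "none"
```

As a usage example under the same assumptions, three units with 1, 2, and 60 failures out of 100 requests each give a homogenized reference of about 0.69 when the preset value is 0.1, so only the third unit is flagged and `recovery_level(units, 0.69, 0.5, 0.5)` returns `"network-element-level"`. If two of the three units fall below the second reference value, the same call returns `"network-level"`.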
PCT/CN2016/098344 2015-09-22 2016-09-07 Fault recovery method and apparatus WO2017050130A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP16848012.7A EP3340535B1 (en) 2015-09-22 2016-09-07 Failure recovery method and device
JP2018514977A JP6556346B2 (ja) 2015-09-22 2016-09-07 Troubleshooting method and apparatus
US15/928,367 US10601643B2 (en) 2015-09-22 2018-03-22 Troubleshooting method and apparatus using key performance indicator information

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510608782.1 2015-09-22
CN201510608782.1A CN105187249B (zh) 2015-09-22 2015-09-22 Fault recovery method and apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/928,367 Continuation US10601643B2 (en) 2015-09-22 2018-03-22 Troubleshooting method and apparatus using key performance indicator information

Publications (1)

Publication Number Publication Date
WO2017050130A1 true WO2017050130A1 (zh) 2017-03-30

Family

ID=54909103

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/098344 WO2017050130A1 (zh) 2015-09-22 2016-09-07 Fault recovery method and apparatus

Country Status (5)

Country Link
US (1) US10601643B2 (zh)
EP (1) EP3340535B1 (zh)
JP (1) JP6556346B2 (zh)
CN (1) CN105187249B (zh)
WO (1) WO2017050130A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
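CN112015681A (zh) * 2020-08-19 2020-12-01 Suzhou Xinxinteng Technology Co., Ltd. IO port processing method, apparatus, device, and medium
CN115834332A (zh) * 2022-11-23 2023-03-21 China United Network Communications Group Co., Ltd. Fault handling method, server, and system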

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
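CN105187249B (zh) * 2015-09-22 2018-12-07 Huawei Technologies Co., Ltd. Fault recovery method and apparatus
CN105681077B (zh) * 2015-12-31 2019-04-05 Huawei Technologies Co., Ltd. Fault handling method, apparatus, and system
CN105760214B (zh) * 2016-04-19 2019-02-26 Huawei Technologies Co., Ltd. Device status and resource information monitoring method, related device, and system
JP6690093B2 (ja) * 2016-08-10 2020-04-28 Fujitsu Limited Determination program, communication device, and determination method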
US11277420B2 (en) * 2017-02-24 2022-03-15 Ciena Corporation Systems and methods to detect abnormal behavior in networks
CN109905261A (zh) * 2017-12-08 2019-06-18 Huawei Technologies Co., Ltd. Fault diagnosis method and apparatus
CN109995574A (zh) * 2018-01-02 2019-07-09 ZTE Corporation Method for repairing a VNFM fault, monitor, VIM, VNFM, and storage medium
US10972588B2 (en) 2018-06-27 2021-04-06 T-Mobile Usa, Inc. Micro-level network node failover system
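CN110750354B (zh) * 2018-07-24 2023-01-10 China Mobile Communication Co., Ltd. Research Institute vCPU resource allocation method and apparatus, and computer-readable storage medium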
WO2020033697A1 (en) * 2018-08-09 2020-02-13 Intel Corporation Performance measurements for 5gc network functions
US11374849B1 (en) * 2020-12-18 2022-06-28 Versa Networks, Inc. High availability router switchover decision using monitoring and policies
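CN112995051B (zh) * 2021-02-05 2022-08-09 Industrial and Commercial Bank of China Limited Network traffic recovery method and apparatus
CN116783873A (zh) * 2021-02-19 2023-09-19 Intel Corporation Performance measurements for data management and background data transfer policy control for next-generation systems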
WO2022232038A1 (en) * 2021-04-26 2022-11-03 Intel Corporation Performance measurements for unified data repository (udr)
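WO2022264289A1 (ja) * 2021-06-15 2022-12-22 Rakuten Mobile, Inc. Network management device, network management method, and program
CN113766444B (zh) * 2021-09-23 2023-07-04 China United Network Communications Group Co., Ltd. Fault locating method, apparatus, and device
CN116757679B (zh) * 2023-08-11 2024-02-06 China Southern Power Grid Peak Shaving and Frequency Modulation Power Generation Co., Ltd., Maintenance and Test Branch Method and apparatus for determining a maintenance strategy, electronic device, and storage medium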

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104468688A (zh) * 2013-09-13 2015-03-25 NTT Docomo, Inc. Method and apparatus for network virtualization
WO2015061353A1 (en) * 2013-10-21 2015-04-30 Nyansa, Inc. A system and method for observing and controlling a programmable network using a remote network manager
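WO2015109443A1 (zh) * 2014-01-21 2015-07-30 Huawei Technologies Co., Ltd. Network service fault handling method, service management system, and system management module
CN105187249A (zh) * 2015-09-22 2015-12-23 Huawei Technologies Co., Ltd. Fault recovery method and apparatus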

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8285121B2 (en) * 2007-10-07 2012-10-09 Fall Front Wireless Ny, Llc Digital network-based video tagging system
US20110122761A1 (en) * 2009-11-23 2011-05-26 Sundar Sriram KPI Driven High Availability Method and apparatus for UMTS radio access networks
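CN101984697A (zh) * 2010-10-19 2011-03-09 ZTE Corporation Wireless data service troubleshooting method and system
CN102111797A (zh) * 2011-02-15 2011-06-29 Datang Mobile Communications Equipment Co., Ltd. Fault diagnosis method and device
CN102158360B (zh) * 2011-04-01 2013-10-30 Huazhong University of Science and Technology Network fault self-diagnosis method based on time-factor causal relationship locating
CN103457792B (zh) * 2013-08-19 2017-02-08 Datang Mobile Communications Equipment Co., Ltd. Fault detection method and apparatus
CN104685830B (zh) * 2013-09-30 2018-03-06 Huawei Technologies Co., Ltd. Fault management method, entity, and system
CN104796277B (zh) * 2014-01-21 2018-12-07 China Mobile Group Hunan Co., Ltd. Network fault monitoring method and apparatus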
US10664297B2 (en) * 2014-02-24 2020-05-26 Hewlett Packard Enterprise Development Lp Activating pre-created VNFCs when a monitored performance level of a VNF exceeds a maximum value attainable by the combined VNFCs that form a VNF
US9401851B2 (en) * 2014-03-28 2016-07-26 Verizon Patent And Licensing Inc. Network management system
US10447555B2 (en) * 2014-10-09 2019-10-15 Splunk Inc. Aggregate key performance indicator spanning multiple services
US9674046B2 (en) * 2014-10-21 2017-06-06 At&T Intellectual Property I, L.P. Automatic detection and prevention of network overload conditions using SDN
US9584377B2 (en) * 2014-11-21 2017-02-28 Oracle International Corporation Transparent orchestration and management of composite network functions
WO2016103006A1 (en) * 2014-12-23 2016-06-30 Telefonaktiebolaget Lm Ericsson (Publ) Media performance monitoring and analysis
US9769065B2 (en) * 2015-05-06 2017-09-19 Telefonaktiebolaget Lm Ericsson (Publ) Packet marking for L4-7 advanced counting and monitoring
US10680896B2 (en) * 2015-06-16 2020-06-09 Hewlett Packard Enterprise Development Lp Virtualized network function monitoring
US10284434B1 (en) * 2016-06-29 2019-05-07 Sprint Communications Company L.P. Virtual network function (VNF) relocation in a software defined network (SDN)

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3340535A4 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
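CN112015681A (zh) * 2020-08-19 2020-12-01 Suzhou Xinxinteng Technology Co., Ltd. IO port processing method, apparatus, device, and medium
CN112015681B (zh) * 2020-08-19 2022-08-26 Suzhou Xinxinteng Technology Co., Ltd. IO port processing method, apparatus, device, and medium
CN115834332A (zh) * 2022-11-23 2023-03-21 China United Network Communications Group Co., Ltd. Fault handling method, server, and system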

Also Published As

Publication number Publication date
EP3340535B1 (en) 2020-07-29
JP6556346B2 (ja) 2019-08-07
EP3340535A4 (en) 2018-07-25
US20180212819A1 (en) 2018-07-26
CN105187249B (zh) 2018-12-07
US10601643B2 (en) 2020-03-24
CN105187249A (zh) 2015-12-23
JP2018533280A (ja) 2018-11-08
EP3340535A1 (en) 2018-06-27

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16848012

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2018514977

Country of ref document: JP

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 2016848012

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE