WO2017190339A1 - Fault processing method and apparatus - Google Patents

Fault processing method and apparatus

Info

Publication number
WO2017190339A1
Authority
WO
WIPO (PCT)
Prior art keywords
fault
layer
vnf
information
indication message
Prior art date
Application number
PCT/CN2016/081229
Other languages
English (en)
French (fr)
Inventor
王姗姗
支炳立
危彦
张友梅
季莉
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2016/081229 priority Critical patent/WO2017190339A1/zh
Publication of WO2017190339A1 publication Critical patent/WO2017190339A1/zh

Definitions

  • the present invention relates to the field of communications technologies, and in particular, to a fault processing method and apparatus.
  • In the network function virtualization (NFV) scenario, the element management system (EMS) uniformly schedules the management system of each layer to implement fault recovery. If a node or a virtual machine (VM) fails, the service layer detects the failure through heartbeat detection. After detecting a fault, the service layer first performs service recovery and then performs fault repair.
  • The current fault recovery solution implements fault recovery through the management systems of the service layer, the virtual layer, and the infrastructure layer (also referred to as the I layer). For example, if the infrastructure layer fails, the virtualized infrastructure manager (VIM) reports the fault information to the service layer through the virtual layer. The EMS of the service layer analyzes the cause of the fault and then delivers a fault repair instruction. Specifically, the EMS sends the fault repair instruction to the virtual network function manager (VNFM), and the VNFM calls the self-healing interface of the I layer to trigger the I layer to execute the self-healing process. Calling interfaces across layers in this way complicates the system structure.
  • The embodiments of the invention provide a fault processing method and apparatus that avoid cross-layer interface calls during fault processing, thereby reducing the complexity of the system.
  • In a first aspect, a fault processing method is provided, including:
  • the virtual network function manager (VNFM) receives fault information sent by the virtualized infrastructure manager (VIM), where the fault information indicates a fault of a first object of the infrastructure layer;
  • the VNFM determines the virtual network function (VNF) corresponding to the first object;
  • the VNFM sends a first indication message to the VIM according to the fault information and the VNF, where the first indication message includes first abnormality information of the virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information. The first abnormality information is associated with the cause of the fault in the infrastructure layer, the second object is associated with the target object in the infrastructure layer that caused the fault, and the second object provides a virtual resource for the VNF.
  • The first indication message instructs the VIM to determine the cause and the target object according to the mapping relationship between the infrastructure layer and the virtual layer, the first abnormality information, and the identifier of the second object, and to process the fault.
  • The VNFM sends to the VIM the abnormality information of the virtual layer that is associated with the cause of the fault in the infrastructure layer, so that the VIM can determine the cause in the infrastructure layer from that abnormality information and execute the corresponding fault handling process. This avoids triggering the self-healing process through a cross-layer call to the self-healing interface, thereby reducing the complexity of the system.
  • the sending, by the VNFM, of the first indication message to the VIM according to the fault information and the VNF includes: if the VNFM is able to determine the first abnormality information according to the fault information and the VNF, sending the first indication message to the VIM.
  • When the VNFM performs correlation analysis on the fault or abnormality information of the infrastructure layer and the virtual layer and can determine the abnormality information in the virtual layer that is associated with the cause of the fault in the infrastructure layer, it sends an indication message including that abnormality information to the VIM. In this case the fault information need not be reported to the EMS, so fault handling at each layer is decoupled, which greatly shortens the fault recovery time.
  • the sending, by the VNFM, the first indication message to the VIM according to the fault information and the VNF includes:
  • if the VNFM cannot determine the first abnormality information according to the fault information and the VNF, the VNFM sends the fault information and the identifier of the VNF to the network element management system (EMS);
  • the VNFM receives a second indication message sent by the EMS, where the second indication message is generated by the EMS according to the fault information, the VNF, and the service abnormality information sent by the VNF to the EMS. The second indication message includes second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information, where the second abnormality information is associated with the cause, the third object is associated with the target object, and the third object is a business object on the VNF;
  • the VNFM generates the first indication message according to the mapping relationship between the virtual layer and the service layer and the second indication message;
  • the VNFM sends the first indication message to the VIM.
  • If the VNFM performs correlation analysis on the fault or abnormality information of the infrastructure layer and the virtual layer but cannot determine the abnormality information in the virtual layer that is associated with the cause of the fault in the infrastructure layer, it reports the fault information to the EMS. The EMS then performs correlation analysis on the fault or abnormality information of the infrastructure layer, the virtual layer, and the service layer. The corresponding processing flow is executed according to the service-layer abnormality information, delivered by the EMS, that is associated with the cause of the fault in the infrastructure layer, so that the fault of the infrastructure layer is resolved and fault recovery is achieved.
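The two VNFM branches described above (local correlation succeeds, or the fault is escalated to the EMS and the answer is translated back into virtual-layer terms) can be sketched as follows. This is only an illustration under stated assumptions: the `Indication` type, the `correlate` and `ask_ems` callables, and the `service_to_virtual` lookup table are hypothetical names that do not appear in the patent, and the patent does not specify how the correlation analysis itself is performed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class Indication:
    """Abnormality information plus the identifier of the object it refers to."""
    abnormality: str
    object_id: str


def vnfm_handle_fault(
    fault: str,
    vnf_id: str,
    correlate: Callable[[str, str], Optional[Indication]],
    ask_ems: Callable[[str, str], Indication],
    service_to_virtual: Dict[Tuple[str, str], Indication],
) -> Indication:
    """Return the first indication message that the VNFM would send to the VIM."""
    # Branch 1: the VNFM's own correlation analysis (hypothetical callable)
    # yields the first abnormality information, so the EMS is not involved.
    first = correlate(fault, vnf_id)
    if first is not None:
        return first
    # Branch 2: report the fault and the VNF identifier to the EMS, then
    # translate the service-layer answer (second indication) into a
    # virtual-layer first indication using the layer mapping (assumed table).
    second = ask_ems(fault, vnf_id)
    return service_to_virtual[(second.abnormality, second.object_id)]
```

In this sketch the EMS is consulted only when local correlation returns `None`, which mirrors the decoupling argument made above.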
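  • In a second aspect, a fault processing method is provided, including:
  • the virtualized infrastructure manager (VIM) determines that a first object of the infrastructure layer has failed;
  • if the VIM cannot determine the cause of the fault, or fails to resolve the fault after performing a first self-healing process according to a determined first cause, the VIM sends fault information to the virtual network function manager (VNFM), where the fault information indicates the fault of the first object;
  • the VIM receives a first indication message sent by the VNFM, where the first indication message includes first abnormality information of the virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information;
  • the VIM determines the cause and the target object in the infrastructure layer that caused the fault according to the mapping relationship between the infrastructure layer and the virtual layer, the first abnormality information, and the identifier of the second object.
  • After determining the target object and the cause of the fault, the VIM can perform a second self-healing process to handle the fault.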
  • When the VIM cannot resolve the fault, it sends the fault information to the VNFM so that the VNFM or the EMS can perform correlation analysis, the cause of the fault can be determined, and the corresponding self-healing process can be executed to achieve fault recovery.
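As a rough illustration of how the VIM might use the first indication message, the sketch below resolves a virtual-layer abnormality to an infrastructure-layer cause and target object through a lookup table and then picks a self-healing action. The table layout, the action names, and the `report-unresolved` fallback are assumptions made for this example; the patent only states that a mapping relationship between the infrastructure layer and the virtual layer is used.

```python
from typing import Dict, Tuple

# Hypothetical mapping kept by the VIM:
# (virtual-layer abnormality, virtual object id) -> (infrastructure cause, target object id)
VirtualToInfra = Dict[Tuple[str, str], Tuple[str, str]]


def vim_process_indication(
    abnormality: str,
    second_object_id: str,
    mapping: VirtualToInfra,
    heal_actions: Dict[str, str],
) -> Tuple[str, str, str]:
    """Return (cause, target object, self-healing action) for a first indication message."""
    # Resolve the virtual-layer information to the infrastructure layer.
    cause, target = mapping[(abnormality, second_object_id)]
    # Choose the second self-healing process for that cause; the fallback
    # value is an assumption for causes with no configured action.
    action = heal_actions.get(cause, "report-unresolved")
    return cause, target, action
```

The VIM decides the action itself from the resolved cause; no upper layer calls a VIM-specific self-healing interface.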
  • the first indication message is generated by the VNFM according to the fault information and the virtual network function (VNF) corresponding to the first object.
  • the first indication message is generated by the VNFM according to the mapping relationship between the virtual layer and the service layer and a second indication message received from the network element management system (EMS). The second indication message is generated by the EMS according to the fault information received from the VNFM, the VNF corresponding to the first object, and the service abnormality information received from the VNF. The second indication message includes second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information, where the second abnormality information is associated with the cause, the third object is associated with the target object, and the third object is a business object on the VNF.
  • In a third aspect, a fault processing method is provided, including:
  • the network element management system (EMS) receives fault information and an identifier of a virtual network function (VNF) sent by the virtual network function manager (VNFM), where the fault information indicates a fault of a first object of the infrastructure layer, and the VNF is associated with the first object;
  • the EMS receives service abnormality information sent by the VNF;
  • the EMS sends a second indication message to the VNFM according to the fault information, the VNF, and the service abnormality information, where the second indication message includes second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information. The second abnormality information is associated with the cause of the fault in the infrastructure layer, the third object is associated with the target object in the infrastructure layer that caused the fault, and the third object is a business object on the VNF.
  • The second indication message instructs the VNFM to generate, according to the mapping relationship between the service layer and the virtual layer and the second indication message, a first indication message for processing the fault and to send it to the virtualized infrastructure manager (VIM), where the first indication message includes first abnormality information of the virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information.
  • The EMS performs correlation analysis on the fault or abnormality information of the infrastructure layer, the virtual layer, and the service layer, determines the abnormality information of the service layer that is associated with the cause of the fault in the infrastructure layer, and sends that abnormality information to the VNFM. The VNFM then executes the corresponding processing flow, which resolves the fault of the infrastructure layer and achieves fault recovery.
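The patent does not say how the EMS correlates the reported fault with service abnormalities, so the following is only a minimal sketch under an assumed rule: a hypothetical `explains` table lists, for each kind of infrastructure fault, the service abnormalities it could plausibly cause, and the EMS picks the first matching report from the VNF named by the VNFM. All type and function names here are illustrative.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class ServiceReport:
    vnf_id: str           # VNF that reported the abnormality
    business_object: str  # business object on that VNF (the "third object")
    abnormality: str      # service abnormality information


@dataclass(frozen=True)
class SecondIndication:
    abnormality: str
    third_object_id: str


def ems_correlate(
    fault_kind: str,
    vnf_id: str,
    reports: List[ServiceReport],
    explains: Dict[str, Set[str]],
) -> Optional[SecondIndication]:
    """Build the second indication message, or return None if nothing correlates."""
    # Service abnormalities that this kind of fault could cause (assumed rule table).
    candidates = explains.get(fault_kind, set())
    for report in reports:
        # Only reports from the VNF associated with the faulty object are relevant.
        if report.vnf_id == vnf_id and report.abnormality in candidates:
            return SecondIndication(report.abnormality, report.business_object)
    return None
```

The result carries only service-layer terms; translating it into virtual-layer terms is left to the VNFM, as described above.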
  • a system for fault handling, comprising: a virtual network function manager (VNFM), a virtualized infrastructure manager (VIM), and a network element management system (EMS), where the EMS is configured to perform the fault processing method described in the third aspect.
  • a virtual network function manager (VNFM) is provided, the VNFM being configured to perform the fault processing method according to the first aspect or any possible implementation of the first aspect.
  • the VNFM includes:
  • a receiving unit configured to receive fault information sent by the virtualization infrastructure manager VIM, where the fault information is used to indicate a fault that occurs in the first object of the infrastructure layer;
  • a determining unit configured to determine a virtual network function VNF corresponding to the first object
  • a sending unit configured to send a first indication message to the VIM according to the fault information received by the receiving unit and the VNF determined by the determining unit, where the first indication message includes first abnormality information of the virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information, the first abnormality information is associated with the cause of the fault in the infrastructure layer, the second object is associated with the target object in the infrastructure layer that caused the fault, and the second object provides a virtual resource for the VNF.
  • the first indication message is used to indicate that the VIM determines the reason according to a mapping relationship between the infrastructure layer and the virtual layer, the first abnormality information, and an identifier of the second object. And the target object and process the fault.
  • the determining unit is further configured to determine, according to the fault information and the VNF, whether the first abnormality information can be determined; the sending unit is specifically configured to: if the determining unit is able to determine the first abnormality information, send the first indication message to the VIM.
  • the determining unit is further configured to: determine, according to the fault information and the VNF, whether the first abnormality information can be determined;
  • the sending unit is specifically configured to: if the determining unit is unable to determine the first abnormality information, send the fault information and the identifier of the VNF to the network element management system EMS;
  • the receiving unit is further configured to receive a second indication message sent by the EMS, where the second indication message is generated by the EMS according to the fault information, the VNF, and the service abnormality information sent by the VNF to the EMS.
  • the second indication message includes the second abnormality information of the service layer and the identifier of the third object of the service layer corresponding to the second abnormality information, where the second abnormality information is associated with the cause, where The third object is associated with the target object, and the third object is a business object on the VNF;
  • the determining unit is further configured to generate the first indication message according to the mapping relationship between the virtual layer and the service layer and the second indication message received by the receiving unit;
  • the sending unit is specifically configured to send the first indication message to the VIM.
  • a virtualized infrastructure manager (VIM) is provided, the VIM being configured to perform the fault processing method according to the second aspect or any possible implementation of the second aspect.
  • the VIM includes:
  • a determining unit configured to determine that the first object of the infrastructure layer is faulty
  • the determining unit is further configured to analyze whether it is possible to determine a cause of the fault
  • a sending unit configured to: if the determining unit is unable to determine the cause of the fault, or the determining unit fails to resolve the fault after performing a first self-healing process according to a determined first cause of the fault, send fault information to the virtual network function manager (VNFM), where the fault information indicates the fault of the first object;
  • a receiving unit configured to receive a first indication message that is sent by the VNFM, where the first indication message includes a first abnormality information of a virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information;
  • the determining unit is further configured to determine the cause and the target object in the infrastructure layer that caused the fault according to the mapping relationship between the infrastructure layer and the virtual layer, the first abnormality information, and the identifier of the second object.
  • the VIM can perform a second self-healing process to process the fault after knowing the target object and the cause of the fault.
  • the first indication message is that the VNFM is generated according to the fault information and a virtual network function VNF corresponding to the first object.
  • the first indication message is generated by the VNFM according to the mapping relationship between the virtual layer and the service layer and a second indication message received from the network element management system (EMS). The second indication message is generated by the EMS according to the fault information received from the VNFM, the VNF corresponding to the first object, and the service abnormality information received from the VNF. The second indication message includes second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information, where the second abnormality information is associated with the cause, the third object is associated with the target object, and the third object is a business object on the VNF.
  • a network element management system (EMS) is provided, where the network element management system (EMS) is configured to perform the fault processing method described in the third aspect.
  • the network element management system EMS includes:
  • a receiving unit configured to receive fault information and an identifier of a virtual network function (VNF) sent by the virtual network function manager (VNFM), where the fault information indicates a fault of a first object of the infrastructure layer, and the VNF is associated with the first object;
  • the receiving unit is further configured to receive service abnormality information sent by the VNF;
  • a sending unit configured to send a second indication message to the VNFM according to the fault information, the VNF, and the service abnormality information received by the receiving unit, where the second indication message includes second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information, the second abnormality information is associated with the cause of the fault in the infrastructure layer, the third object is associated with the target object in the infrastructure layer that caused the fault, and the third object is a business object on the VNF.
  • the second indication message is used to instruct the VNFM to generate a first indication message for processing the fault according to a mapping relationship between the service layer and the virtual layer and the second indication message, and send To the virtualized infrastructure manager VIM, the first indication message includes first abnormality information of the virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information.
  • a VNFM comprising: a processor, a memory, and a bus system.
  • the processor and the memory are connected by the bus system, the memory is configured to store instructions, and the processor is configured to execute the instructions stored in the memory, so that the VNFM performs the fault processing method according to the first aspect or any possible implementation of the first aspect.
  • a VIM comprising: a processor, a memory, and a bus system.
  • the processor and the memory are connected by the bus system, the memory is configured to store instructions, and the processor is configured to execute the instructions stored in the memory, so that the VIM performs the fault processing method according to the second aspect or any possible implementation of the second aspect.
  • an EMS comprising: a processor, a memory, and a bus system.
  • the processor and the memory are connected by the bus system, the memory is configured to store instructions, and the processor is configured to execute the instructions stored in the memory, so that the EMS performs the fault processing method according to the third aspect.
  • a computer readable storage medium is provided, storing a program that, when executed, causes the VNFM to perform the fault processing method according to the first aspect or any possible implementation of the first aspect.
  • In a twelfth aspect, a computer readable storage medium is provided, storing a program that, when executed, causes the VIM to perform the fault processing method according to the second aspect or any possible implementation of the second aspect.
  • In a thirteenth aspect, a computer readable storage medium is provided, storing a program that, when executed, causes the EMS to perform the fault processing method according to the third aspect.
  • FIG. 1 is a schematic structural diagram of a network function virtualization system to which a fault processing method according to an embodiment of the present invention is applied;
  • FIG. 2 is a schematic flow chart of a conventional fault processing method
  • FIG. 3 is a schematic flowchart of a fault processing method according to an embodiment of the present invention.
  • FIG. 4 is a schematic block diagram of a VNFM according to an embodiment of the present invention.
  • FIG. 5 is a schematic block diagram of a VNFM according to another embodiment of the present invention.
  • FIG. 6 is a schematic block diagram of a VIM according to an embodiment of the present invention.
  • FIG. 7 is a schematic block diagram of a VIM according to another embodiment of the present invention.
  • FIG. 8 is a schematic block diagram of an EMS according to an embodiment of the present invention.
  • FIG. 9 is a schematic block diagram of an EMS in accordance with another embodiment of the present invention.
  • FIG. 1 shows a schematic architectural diagram of an NFV system 100 to which a fault handling method of an embodiment of the present invention is applied. The NFV system 100 may be implemented over various networks, such as a data center network, a service provider network, or a local area network (LAN).
  • the NFV system 100 can include:
  • NFV Management and Orchestration (MANO) 128
  • NFV Infrastructure (NFVI) 130
  • Virtual Network Functions (VNFs) 108
  • multiple Element Management Systems (EMS) 122
  • Service, VNF and Infrastructure Description system 126
  • one or more Operation Support System/Business Support System (OSS/BSS) 124.
  • the MANO 128 may include an Orchestrator 102, one or more VNF Managers (VNFMs) 104, and one or more Virtualized Infrastructure Managers (VIMs) 106.
  • the NFVI 130 may include a hardware resource layer composed of computing hardware 112, storage hardware 114, and network hardware 116; a virtualization layer; and a virtual resource layer composed of virtual computing 110 (for example, virtual machines), virtual storage 118, and virtual network 120.
  • the computing hardware 112 can be a dedicated processor or a general purpose processor for providing processing and computing functions.
  • the storage hardware 114 is configured to provide storage capabilities, which may be provided by the storage hardware 114 itself (eg, a server's local memory), or may be provided over a network (eg, the server connects to a network storage device over a network).
  • Network hardware 116 may be a switch, router, and/or other network device, and network hardware 116 is used to enable communication between multiple devices, with wireless or wired connections between multiple devices.
  • the virtualization layer in NFVI130 is used to abstract the hardware resources of the hardware resource layer, decouple the VNF 108 from the physical layer to which the hardware resources belong, and provide virtual resources to the VNF.
  • virtual resources may include virtual computing 110, virtual storage 118, and virtual network 120.
  • Virtual computing 110, virtual storage 118 may provide virtual resources to VNF 108 in the form of virtual machines or other virtual containers.
  • the virtualization layer forms the virtual network 120 by abstracting the network hardware 116.
  • the virtual network 120, for example a virtual switch (vSwitch), is used to enable communication between multiple virtual machines, or between other types of virtual containers hosting VNFs.
  • Network hardware can be virtualized by technologies such as virtual LAN (VLAN), Virtual Private LAN Service (VPLS), Virtual Extensible LAN (VXLAN), or Network Virtualization using Generic Routing Encapsulation (NVGRE).
  • The OSS/BSS 124 is mainly for telecom operators, providing comprehensive network management and service operation functions, including network management (such as fault monitoring and network information collection), billing management, and customer service management.
  • the Service, VNF and Infrastructure Description system 126 is described in detail in the ETSI GS NFV 002 V1.1.1 standard and is not described again here.
  • MANO 128 can be used to monitor and manage VNF 108 and NFVI 130.
  • the orchestrator 102 can communicate with one or more VNFMs 104 to implement resource-related requests, send configuration information to the VNFM 104, and collect status information of the VNF 108.
  • the orchestrator 102 can also communicate with the VIM 106 to implement resource allocation and/or to reserve and exchange configuration information and status information of virtualized hardware resources.
  • the VNFM 104 can be used to manage one or more VNFs 108, performing various management functions, such as initializing, updating, querying, and/or terminating the VNF 108.
  • the VIM 106 can be used to control and manage the interaction of the VNF 108 and computing hardware 112, storage hardware 114, network hardware 116, virtual computing 110, virtual storage 118, virtual network 120.
  • VIM 106 can be used to perform resource allocation operations to VNF 108.
  • VNFM 104 and VIM 106 can communicate with each other to exchange virtualized hardware resource configuration and status information.
  • NFVI 130 includes both hardware and software that together create a virtualized environment to deploy, manage, and execute VNF 108.
  • the hardware resource layer and the virtual resource layer are used to provide virtual resources, such as virtual machines and/or other forms of virtual containers, to the VNF 108.
  • VNFM 104 can communicate with VNF 108 and EMS 122 to perform VNF lifecycle management and implement exchange of configuration/status information.
  • the VNF 108 is a virtualization of at least one network function that was previously provided by a physical network device.
  • the VNF 108 can be a virtualized Mobility Management Entity (MME) node that provides all of the network functions provided by a typical non-virtualized MME device.
  • the VNF 108 can be used to implement the functionality of some of the components provided on the non-virtualized MME device.
  • One or more VNFs 108 can be deployed on a virtual machine (or other form of virtual container).
  • the EMS 122 can be used to manage one or more VNFs.
  • An infrastructure layer (also referred to as the I layer), a virtual layer, and a service layer are mainly involved. The infrastructure layer includes the NFVI and the VIM; the virtual layer includes the VNFM and the virtual resources in the VNFM's jurisdiction (virtual computing, virtual storage, and virtual network); and the service layer includes the VNF and the EMS.
  • the virtual resources in the infrastructure layer include virtual resources in respective jurisdictions of the VNFM, that is, the virtual resources in one VNFM jurisdiction are part of the virtual resources in the infrastructure layer.
  • the VNFM can be used to perform lifecycle management of one or more VNFs deployed on virtual resources within its jurisdiction, and the EMS is used to manage business objects on the VNF.
  • A fault whose cause lies in the infrastructure layer can propagate to the virtual layer and the service layer and cause faults there; a fault whose cause lies in the virtual layer can likewise propagate to the service layer and cause a fault of the service layer.
  • the existing fault handling process will be described below by taking the failure occurring at the VM of the infrastructure layer as an example in conjunction with FIG. 2 .
  • the VNF finds that the VM is abnormal, for example that a resource is abnormal, and reports service abnormality information to the EMS.
  • the VIM detects that the VM is faulty, for example, the resource on the VM cannot be called.
  • the VIM can detect that the VM has failed by using heartbeat detection.
  • the VIM feeds back the fault information of the VM to the VNFM.
  • the VNFM feeds back the fault and the identifier of the VNF corresponding to the VM to the EMS.
  • the EMS performs anomaly analysis according to the information fed back by the VNFM and the service abnormality information reported by the VNF.
  • the EMS sends a corresponding HA instruction to the VNFM according to the abnormal analysis result, for example, restarting the VM.
  • the VNFM invokes the self-healing interface of the VIM to send the HA instruction to the VM.
  • the VM executes the HA instruction.
  • In the existing process, the fault information is reported through the VNFM to the EMS of the service layer, and the EMS makes the decision, so the processing delay of fault repair becomes longer.
  • In addition, the upper-layer network element invokes the interface of the lower-layer network element to trigger the self-healing process, and such cross-layer calls make the structure complicated.
  • A fault of the infrastructure layer includes a hardware resource fault (such as a host fault) or a virtual resource fault (such as a virtual machine fault). If the VIM is able to analyze the cause of the fault of the infrastructure layer, the decision is made by the VIM; if the VIM cannot analyze the cause, the fault information is reported to the VNFM of the virtual layer. Likewise, if the VNFM can analyze the cause of the fault, the decision is made by the VNFM; if the VNFM cannot analyze the cause, the fault information and the identifier of the VNF corresponding to the fault information are reported to the EMS.
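The layer-by-layer escalation described above (VIM first, then VNFM, then EMS) can be pictured as a simple chain in which the first layer able to name a cause makes the decision. The sketch below is only an illustration: the `Analyzer` callables stand in for each layer's analysis, which the patent does not specify, and the layer names are passed in by the caller.

```python
from typing import Callable, List, Optional, Tuple

# A layer's analysis: returns the cause if the layer can determine it, else None.
Analyzer = Callable[[str], Optional[str]]


def decide(fault: str, layers: List[Tuple[str, Analyzer]]) -> Optional[Tuple[str, str]]:
    """Walk the layers bottom-up and return (deciding layer, cause), or None."""
    for name, analyze in layers:
        cause = analyze(fault)
        if cause is not None:
            # This layer can analyze the cause, so it makes the decision
            # and the fault is not reported any further up.
            return name, cause
    return None
```

With `layers` ordered as VIM, VNFM, EMS, a fault that the VIM can explain never reaches the upper layers, which is the short-path behavior the text argues for.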
  • The embodiments of the invention provide a fault processing method that decouples fault recovery across layers: each layer resolves the faults it is able to handle, which greatly shortens the fault recovery time and closes the fault-handling loop over a short, fast path.
  • The EMS is responsible for deciding on faults that the service layer can resolve, the VNFM is responsible for deciding on faults that the virtual layer can resolve, and the VIM is responsible for deciding on faults that the infrastructure layer can resolve.
  • the upper layer network element does not directly invoke the interface of the lower layer network element, but uses the universal interface to send the abnormal information associated with the cause of the fault in the infrastructure layer to the lower layer network element.
  • the lower layer network element decides how to handle the fault according to the abnormal information.
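  • the layered decision described above can be sketched as a bottom-up escalation chain. This is only an illustrative sketch: the `Layer` class, the `handle_fault` function, and the fault and cause strings are hypothetical names that do not appear in the embodiment.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Layer:
    name: str                                # "VIM", "VNFM" or "EMS"
    analyze: Callable[[str], Optional[str]]  # returns a cause, or None if unknown


def handle_fault(fault: str, layers: List[Layer]) -> Optional[str]:
    """Walk the layers bottom-up and return the name of the first layer
    that can analyze the cause of the fault, or None if no layer can."""
    for layer in layers:
        if layer.analyze(fault) is not None:
            return layer.name
    return None


# Hypothetical knowledge of each layer, for illustration only.
vim = Layer("VIM", lambda f: "host power loss" if f == "host_down" else None)
vnfm = Layer("VNFM", lambda f: "guest OS memory leak" if f == "vm_slow" else None)
ems = Layer("EMS", lambda f: "insufficient resources" if f == "kpi_drop" else None)

chain = [vim, vnfm, ems]
```

  • in this sketch a fault that the VIM can analyze never reaches the VNFM or the EMS, which corresponds to the short closed loop that the embodiment aims for.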
  • FIG. 3 is a schematic flow chart of a fault processing method according to an embodiment of the present invention. As shown in FIG. 3, the fault processing method 300 includes the following.
  • the service layer and the infrastructure layer detect the corresponding fault, and the corresponding module reports it to perform correlation analysis to confirm the root cause of the fault and trigger the layered fault self-healing operation.
  • the specific process is as follows:
  • the first object fails.
  • the first object may be a resource object in the infrastructure layer for providing virtual resources to the upper layer service, such as a VM, a virtual network card (vNIC), a host (HOST), or a single board object (BOARD).
  • the VNF detects service abnormality information of the service running on the first object, and reports the service abnormality information of the first object to the EMS.
  • the service abnormal information may include service layer occlusion, service interruption, service indicator hopping, service indicator sag, and large traffic impact.
  • the VNF triggers service recovery according to the configured service layer self-healing policy.
  • the standby object takes over the service of the first object that is faulty.
  • the service flow redistribution action is performed to allocate the traffic of the faulty first object to other normal objects.
  • the VIM detects that the first object is faulty.
  • the infrastructure layer detects the first object failure and reports the failure of the first object to the VIM.
  • the order of steps 302 and 303 is not limited; it may be determined by the detection speeds of the VIM and the VNF.
  • the VIM analyzes the cause of the failure. If it is possible to determine the cause of the failure in the infrastructure layer, execute 305; if the cause of the failure cannot be determined, execute 306.
  • the VIM performs a self-healing process corresponding to the cause, and determines whether the fault has been resolved. If the fault has been resolved, the process ends; if the fault is not resolved, then step 306 is performed.
  • a default self-healing process for different reasons can be pre-configured in the VIM.
  • the default self-healing process can be a single specified self-healing process, such as a soft restart or a hard restart, or several self-healing processes with prioritization. For example, if the VIM detects that the first object has failed, a soft restart is executed by default; if the fault cannot be resolved, a hard restart is executed next by default; if the fault still cannot be resolved, local reconstruction is executed by default.
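  • the prioritized default self-healing process can be sketched as follows. This is a minimal sketch under stated assumptions: `run_self_healing`, the action functions, and the simulated `state` dictionary are hypothetical, and the fault is assumed to clear after the hard restart only so that the example has something to show.

```python
from typing import Callable, List, Tuple

Action = Tuple[str, Callable[[], None]]


def run_self_healing(actions: List[Action],
                     fault_resolved: Callable[[], bool]) -> Tuple[bool, List[str]]:
    """Try the actions in priority order and stop as soon as the fault clears.
    Returns whether the fault was resolved and which actions were attempted."""
    attempted: List[str] = []
    for name, action in actions:
        action()
        attempted.append(name)
        if fault_resolved():
            return True, attempted
    return False, attempted


# Simulated object state; in this example only the hard restart repairs it.
state = {"healthy": False}


def soft_restart() -> None:
    pass  # has no effect in this simulation


def hard_restart() -> None:
    state["healthy"] = True


def local_rebuild() -> None:
    state["healthy"] = True


DEFAULT_ACTIONS: List[Action] = [
    ("soft_restart", soft_restart),
    ("hard_restart", hard_restart),
    ("local_rebuild", local_rebuild),
]

resolved, attempted = run_self_healing(DEFAULT_ACTIONS, lambda: state["healthy"])
```

  • if every action is attempted without clearing the fault, the function returns `False`, which corresponds to the case in which the VIM goes on to report the fault information to the VNFM.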
  • the EMS may trigger the VIM to perform a self-healing process for the fault of the first object according to the service abnormality information reported by the VNF. In this way, the VIM may repeat the self-healing process to handle the failure.
  • the VIM may further determine whether the fault has been resolved before step 305: if the fault is not resolved, perform step 305; if the fault is resolved, the process ends.
  • the VIM can maintain the history information of the self-healing operation, and each self-healing process saves a record.
  • the record may include information such as an operation object identifier, a self-healing operation, an operation time, and a self-healing result.
  • the self-healing result can be determined by detecting whether the corresponding alarm has been recovered.
  • the VIM can determine whether the fault has been resolved according to whether the self-healing operation record related to the corresponding object is recorded. For example, the VIM may determine whether the failure of the first object has been resolved based on the identity of the first object.
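  • the history of self-healing operations can be sketched as a small record store keyed by the object identifier. The class and field names below are hypothetical, and treating the most recent record of an object as decisive is an assumption of this sketch rather than something the embodiment specifies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SelfHealingRecord:
    object_id: str         # identifier of the operation object
    operation: str         # self-healing operation, e.g. "soft_restart"
    operation_time: float  # time of the operation
    alarm_cleared: bool    # self-healing result: has the alarm been recovered?


class SelfHealingHistory:
    def __init__(self) -> None:
        self._records: List[SelfHealingRecord] = []

    def add(self, record: SelfHealingRecord) -> None:
        self._records.append(record)

    def is_resolved(self, object_id: str) -> bool:
        """Return True if the latest record for the object shows a cleared alarm."""
        records = [r for r in self._records if r.object_id == object_id]
        if not records:
            return False
        latest = max(records, key=lambda r: r.operation_time)
        return latest.alarm_cleared


history = SelfHealingHistory()
history.add(SelfHealingRecord("vm-1", "soft_restart", 100.0, False))
history.add(SelfHealingRecord("vm-1", "hard_restart", 160.0, True))
```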
  • if the VIM can analyze the cause of the failure, the failure can be solved after the corresponding self-healing process is performed. If the VIM cannot analyze the cause of the fault, the VIM reports the fault information to the VNFM. It should be noted that in another embodiment the VIM may misjudge, that is, the VIM analyzes the cause of the fault, but the fault is not resolved after the self-healing process corresponding to that cause is performed; in that case the VIM also reports the fault information to the VNFM.
  • the VIM reports the fault information to the VNFM, where the fault information is used to indicate that the first object is faulty.
  • the VNFM determines, according to the mapping relationship between the VNF and the VM, the VNF associated with the first object, that is, the VNF deployed on the first object.
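  • the lookup of the VNF associated with the first object can be sketched as a reverse search in the mapping between VNFs and VMs. The function name, the shape of the mapping, and the identifiers are hypothetical.

```python
from typing import Dict, List, Set


def vnfs_on_object(vnf_to_vms: Dict[str, Set[str]], object_id: str) -> List[str]:
    """Return the identifiers of the VNFs deployed on the given infrastructure object."""
    return sorted(vnf for vnf, vms in vnf_to_vms.items() if object_id in vms)


# Hypothetical mapping maintained by the VNFM: VNF identifier -> VMs it runs on.
VNF_TO_VMS: Dict[str, Set[str]] = {
    "vnf-a": {"vm-1", "vm-2"},
    "vnf-b": {"vm-3"},
}
```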
  • the VNFM performs fault correlation analysis between the infrastructure layer and the virtual layer according to the fault information reported by the VIM and the VNF. If the VNFM can determine the abnormal information in the virtual layer that is associated with the cause of the fault, step 311 is performed; if the VNFM cannot determine it, step 309 is performed.
  • the abnormal information in the virtual layer that the VNFM determines to be associated with the cause of the failure of the first object may be, for example, a GuestOS memory leak, high CPU usage, a network port failure, or a switch failure.
  • the VNFM also determines a virtual layer object involved in the exception information, the virtual layer object being associated with the target object in the first object that caused the failure.
  • the association means that the abnormal information of the virtual layer object is related to the fault of the target object, that is, the failure of the target object in the infrastructure layer may cause the virtual layer object to malfunction or be abnormal.
  • failure of the target object in the infrastructure layer can also cause business object failures in the business layer.
  • the VNFM reports to the EMS the fault information and the identifier of the VNF determined by the VNFM in step 307.
  • the EMS performs correlation analysis on faults or abnormalities of the service layer, the virtual layer, and the infrastructure layer according to the fault information and the identifier of the VNF reported by the VNFM and the service abnormality information reported by the VNF, determines the abnormality information of the service layer, and sends a second indication message to the VNFM, where the second indication message includes the abnormality information of the service layer and the identifier of the third object of the service layer corresponding to the abnormality information.
  • the abnormality information of the business layer is associated with the cause of the failure in the infrastructure layer
  • the third object is associated with the target object in the infrastructure layer that causes the failure
  • the third object is the business object on the VNF.
  • the EMS can determine that the abnormal information of the business layer is a service or the like that does not have sufficient resources to support the request.
  • the second indication message may further include an identifier of the VNF.
  • the identification of the VNF can be used by the VNFM to locate the fault.
  • the VNFM sends a first indication message to the VIM, where the first indication message includes the abnormality information of the virtual layer and the identifier of the second object of the virtual layer corresponding to the abnormality information, and the abnormality information is associated with the cause of the fault in the infrastructure layer.
  • the second object is associated with the target object in the infrastructure layer that caused the failure, and the second object provides a virtual resource for the VNF determined by the VNFM.
  • the abnormality information may include an identifier of the abnormality information (ID) and a time when the abnormality information is generated.
  • the first indication message may be generated by the VNFM according to the failure information sent by the VIM in step 306 and the VNF determined by the VNFM in step 307.
  • the first indication message may also be generated by the VNFM according to the mapping relationship between the virtual layer and the service layer and the second indication message sent by the EMS in step 310. Specifically, the VNFM determines the abnormal information of the virtual layer according to the abnormal information of the service layer, and determines the second object of the virtual layer according to the third object of the service layer.
  • the lower layer can only perceive the direct upper layer.
  • the VIM can only perceive the related objects of the virtual layer where the VNFM is located, but cannot perceive the related objects of the service layer where the EMS is located. Therefore, the indication message between the EMS and the VNFM is different from the indication message between the VNFM and the VIM.
  • the first indication message may further include an identifier of the first object and a tenant_id associated with the first object.
  • the identifier of the first object and the tenant_id associated with the first object are used by the VIM to locate the fault.
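  • the translation of a second indication message (service layer) into a first indication message (virtual layer) can be sketched as two table lookups. All field names, map contents, and identifiers below are hypothetical; the actual message formats are given in the tables of the embodiment and are not reproduced in this sketch.

```python
from typing import Dict, Optional


def build_first_indication(second: Dict[str, str],
                           anomaly_map: Dict[str, str],
                           object_map: Dict[str, str],
                           first_object_id: Optional[str] = None,
                           tenant_id: Optional[str] = None) -> Dict[str, str]:
    """Map service-layer abnormality information and the third object onto
    virtual-layer abnormality information and the second object."""
    first = {
        "anomaly": anomaly_map[second["anomaly"]],
        "object_id": object_map[second["object_id"]],
    }
    # Optional fields that help the VIM locate the fault.
    if first_object_id is not None:
        first["first_object_id"] = first_object_id
    if tenant_id is not None:
        first["tenant_id"] = tenant_id
    return first


# Hypothetical mapping between the service layer and the virtual layer.
ANOMALY_MAP = {"service_interruption": "network_port_failure"}
OBJECT_MAP = {"service-42": "vnic-7"}

second = {"anomaly": "service_interruption", "object_id": "service-42", "vnf_id": "vnf-a"}
first = build_first_indication(second, ANOMALY_MAP, OBJECT_MAP,
                               first_object_id="vm-1", tenant_id="tenant-1")
```

  • the service-layer identifiers do not appear in the result, which reflects the statement that the VIM can perceive only the objects of the virtual layer.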
  • the VNFM may further determine whether the fault has been resolved before step 311; if the fault is not resolved, perform step 311. If the fault has been resolved, the process ends.
  • the VNFM can maintain history information of the self-healing operations of multiple objects (such as VMs) within its jurisdiction, and each self-healing process saves a record. For example, the record can include information such as the identifier of the operation object, the self-healing operation, the operation time, and the self-healing result.
  • the self-healing result can be determined by detecting whether the corresponding alarm has been recovered.
  • the VNFM can determine whether the fault has been resolved according to whether the self-healing operation record related to the corresponding object is recorded. For example, the VNFM can determine whether the fault has been resolved based on the identity of the first object.
  • the VIM determines the cause of the fault in the infrastructure layer according to the abnormal information of the virtual layer in the received first indication message, determines the target object in the infrastructure layer that causes the fault according to the identifier of the second object, and performs a self-healing process on the target object according to the cause of the fault, so as to handle the fault.
  • the target object may be a unit or a module in the first object.
  • the target object can be a virtual port, a virtual network, or the like in the VM.
  • the embodiment of the present invention is not limited thereto, and the target object may also be the first object itself.
  • the VIM may determine whether the fault has been resolved; if the fault has not been resolved, proceed to step 312; if the fault has been resolved, the process ends. The VIM may determine whether the fault has been resolved by using the method described above, and details are not described herein again.
  • the protocol for carrying the indication message is not limited in the embodiment of the present invention.
  • the indication message may be carried according to a Hyper Text Transfer Protocol (HTTP) protocol, and may also be carried based on other protocols.
  • the indication message sent by the EMS to the VNFM and the indication message sent by the VNFM to the VIM in the embodiment of the present invention may be referred to as an SLA Complain message.
  • the indication message according to an embodiment of the present invention is described below by taking SLA Complain as an example.
  • the SLA Complain message is based on the Hyper Text Transfer Protocol (HTTP) protocol and uses a Representational State Transfer (REST) mechanism.
  • the indication message between the EMS and the VNFM is different from the request format of the indication message between the VNFM and the VIM.
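  • carrying an indication message over HTTP in a REST style can be sketched with the Python standard library as follows. This is only a sketch: the URL, the use of the POST method, and the payload field names are assumptions, and the request is built but not sent.

```python
import json
import urllib.request


def build_complain_request(url: str, payload: dict) -> urllib.request.Request:
    """Build (without sending) an HTTP request that carries an indication
    message as a JSON body. The POST method is an assumption of this sketch."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical endpoint and payload, for illustration only.
payload = {"anomaly_id": "a-001", "anomaly_time": "2016-05-06T10:00:00Z", "object_id": "vnic-7"}
request = build_complain_request("http://vim.example/sla-complain", payload)
```

  • passing `request` to `urllib.request.urlopen` would send it; that call is omitted here because the endpoint is fictitious.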
  • the request format of the SLA Complain message between the EMS and the VNFM is shown in Table 1 below.
  • the request parameter format of the SLA Complain message between the EMS and the VNFM is shown in Table 2.
  • the vapp instance in Table 2 refers to the VNF associated with the failed object in the infrastructure layer.
  • the service layer object corresponding to the abnormal information refers to the service on the VNF that the EMS can perceive.
  • the VNFM After receiving the SLA Complain message sent by the EMS, the VNFM can determine the abnormal information of the virtual layer and the corresponding virtual layer object according to the mapping relationship between the service layer and the virtual layer and the SLA Complain message.
  • the request format of the SLA Complain message between the VNFM and the VIM is shown in Table 3.
  • the tenant's Universally Unique Identifier is used for resource partitioning.
  • the data center (DC) of the infrastructure layer may divide resources into multiple domains according to geographical regions, each domain is divided into multiple tenants, and each tenant includes a set of logical resources (computing, storage, and network).
  • the virtual layer object involved in causing the failure is associated with the target object causing the failure in the first object of the infrastructure layer.
  • the VIM may locate the failed object (such as the first object) in the infrastructure layer according to the UUID of the tenant and the UUID of the virtual machine, and then determine the cause of the failure and the target object in the infrastructure layer according to the mapping relationship between the virtual layer and the infrastructure layer.
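  • the two lookups performed by the VIM can be sketched as follows: locating the failed object from the tenant UUID and the virtual machine UUID, and mapping the virtual-layer abnormality information to a cause in the infrastructure layer. The inventory layout, the cause table, and all identifiers are hypothetical.

```python
from typing import Dict, Optional, Tuple


def locate_failed_object(inventory: Dict[Tuple[str, str], str],
                         tenant_uuid: str, vm_uuid: str) -> Optional[str]:
    """Return the infrastructure object that hosts the given tenant's VM, if known."""
    return inventory.get((tenant_uuid, vm_uuid))


def resolve_cause(cause_map: Dict[str, str], virtual_anomaly: str) -> Optional[str]:
    """Map virtual-layer abnormality information to an infrastructure-layer cause."""
    return cause_map.get(virtual_anomaly)


# Hypothetical VIM data: (tenant UUID, VM UUID) -> infrastructure object.
INVENTORY = {("tenant-1", "vm-1"): "host-01"}
# Hypothetical mapping between the virtual layer and the infrastructure layer.
CAUSE_MAP = {"network_port_failure": "physical NIC fault"}
```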
  • SLA Complain messages communicate in JSON mode.
  • the request implementation of the SLA Complain message between the EMS and the VNFM is as follows:
  • the SLA Complain message request implementation between VNFM and VIM is as follows:
  • VNFM 400 includes:
  • the receiving unit 410 is configured to receive fault information sent by the VIM, where the fault information is used to indicate that the first object of the infrastructure layer is faulty;
  • a determining unit 420 configured to determine a VNF associated with the first object
  • the sending unit 430 is configured to send a first indication message to the VIM according to the fault information received by the receiving unit 410 and the VNF determined by the determining unit 420, where the first indication message includes the first abnormality information of the virtual layer and the identifier of the second object of the virtual layer corresponding to the first abnormality information; the first abnormality information is associated with the cause of the failure in the infrastructure layer, the second object is associated with the target object causing the failure in the infrastructure layer, and the second object provides the virtual resource for the VNF.
  • the first indication message is used to instruct the VIM to determine the cause of the failure in the infrastructure layer and the target object according to the mapping relationship between the infrastructure layer and the virtual layer, the first abnormality information, and the identifier of the second object, and process the fault.
  • the sending unit 430 sends a first indication message to the VIM through a universal interface.
  • the VNFM sends the abnormal information of the virtual layer associated with the cause of the fault in the infrastructure layer to the VIM, so that the VIM can determine the cause of the infrastructure layer according to the abnormal information, and execute the corresponding fault handling process.
  • This can avoid the cross-layer call self-healing interface triggering the self-healing process, which can reduce the complexity of the system.
  • the determining unit 420 is further configured to analyze whether the first abnormality information can be determined according to the fault information and the VNF.
  • the sending unit 430 is specifically configured to: if the determining unit 420 can determine the first abnormality information, send the first indication message to the VIM.
  • the sending unit 430 is specifically configured to: if the determining unit 420 cannot determine the first abnormality information according to the fault information and the VNF, send the fault information and the identifier of the VNF to the network element management system EMS.
  • the receiving unit 410 is further configured to receive a second indication message sent by the EMS, where the second indication message includes the second abnormality information of the service layer and the identifier of the third object of the service layer corresponding to the second abnormality information; the second abnormality information is associated with the cause of the failure in the infrastructure layer, the third object is associated with the target object causing the failure in the infrastructure layer, and the third object is a service object on the VNF.
  • the determining unit 420 is further configured to generate the first indication message according to the mapping relationship between the virtual layer and the service layer and the second indication message received by the receiving unit 410.
  • the sending unit 430 is specifically configured to send the first indication message to the VIM.
  • the receiving unit 410 may be implemented by a receiver
  • the sending unit 430 may be implemented by a transmitter
  • the determining unit 420 may be implemented by a processor.
  • a VNFM in accordance with an embodiment of the present invention may include a processor 510, a memory 520, a receiver 530, a transmitter 540, and a bus system 550.
  • the memory 520 can be used to store instructions or code or the like executed by the processor 510.
  • the various components in the VNFM 500 are coupled together by the bus system 550, which may include, in addition to the data bus, a power bus, a control bus, and a status signal bus.
  • the VNFM 400 shown in FIG. 4 or the VNFM 500 shown in FIG. 5 can implement the various processes implemented by the VNFM in the foregoing method embodiments. To avoid repetition, details are not described herein again.
  • the processor may be an integrated circuit chip with signal processing capabilities.
  • each step of the foregoing method embodiment may be completed by an integrated logic circuit of hardware in a processor or an instruction in a form of software.
  • the processor may be a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention may be implemented or carried out.
  • the general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present invention may be directly implemented by the hardware decoding processor, or may be performed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the storage medium is located in the memory, and the processor reads the information in the memory and combines the hardware to complete the steps of the above method.
  • the memory in the embodiments of the present invention may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory can be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory.
  • the volatile memory can be a Random Access Memory (RAM) that acts as an external cache.
  • many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct memory bus RAM (DR RAM).
  • FIG. 6 is a schematic block diagram of a VIM 600 in accordance with another embodiment of the present invention.
  • VIM 600 includes:
  • a determining unit 610 configured to determine that the first object of the infrastructure layer fails
  • the determining unit 610 is further configured to analyze whether it is possible to determine the cause of the fault
  • the sending unit 620 is configured to send fault information to the VNFM if the determining unit 610 is unable to determine the cause of the fault, or if the fault is not resolved after a first self-healing process is performed according to a first cause of the fault determined by the determining unit 610, where the fault information is used to indicate the failure of the first object;
  • the receiving unit 630 is configured to receive the first indication message that is sent by the VNFM, where the first indication message includes the first abnormality information of the virtual layer and the identifier of the second object of the virtual layer corresponding to the first abnormality information;
  • the determining unit 610 is further configured to determine a cause of the fault and a target object in the infrastructure layer according to the mapping relationship between the infrastructure layer and the virtual layer, the first abnormality information, and the identifier of the second object.
  • VIM 600 can perform the second self-healing process to deal with the fault after knowing the target object and the cause of the fault.
  • when the VIM cannot solve the fault, the VIM sends the fault information to the VNFM, so that the correlation analysis can be performed by the VNFM or the EMS; the cause of the fault can then be determined and the corresponding self-healing process executed to achieve fault recovery.
  • the first indication message is generated by the VNFM according to the fault information and the virtual network function VNF associated with the first object.
  • the first indication message is generated by the VNFM according to the mapping relationship between the virtual layer and the service layer and the second indication message received from the EMS, and the second indication message is generated by the EMS according to the fault information and the identifier of the VNF received from the VNFM and the service abnormality information received from the VNF.
  • the second indication message includes the second abnormality information of the service layer and the identifier of the third object of the service layer corresponding to the second abnormality information; the second abnormality information is associated with the cause of the failure in the infrastructure layer, the third object is associated with the target object in the infrastructure layer that caused the failure, and the third object is the business object on the VNF.
  • the determining unit 610 may be implemented by a processor
  • the sending unit 620 may be implemented by a transmitter
  • the receiving unit 630 may be implemented by a receiver.
  • a VIM in accordance with an embodiment of the present invention may include a processor 710, a memory 720, a receiver 730, a transmitter 740, and a bus system 750.
  • the memory 720 can be used to store instructions or code or the like executed by the processor 710.
  • the various components in the VIM 700 are coupled together by the bus system 750, which may include, in addition to the data bus, a power bus, a control bus, and a status signal bus.
  • the VIM 600 shown in FIG. 6 or the VIM 700 shown in FIG. 7 can implement the various processes implemented by the VIM in the foregoing method embodiments. To avoid repetition, details are not described herein again.
  • the above described method embodiments of the present invention may be applied to a processor or implemented by a processor.
  • the processor may be an integrated circuit chip with signal processing capabilities.
  • each step of the foregoing method embodiment may be completed by an integrated logic circuit of hardware in a processor or an instruction in a form of software.
  • the above described processor may be a general purpose processor, DSP, ASIC, FPGA or other programmable logic device, discrete gate or transistor logic device, discrete hardware component.
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention may be implemented or carried out.
  • the general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present invention may be directly implemented by the hardware decoding processor, or may be performed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the storage medium is located in the memory, and the processor reads the information in the memory and combines the hardware to complete the steps of the above method.
  • the memory in the embodiments of the present invention may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory may be a ROM, a PROM, an EPROM, an EEPROM, or a flash memory.
  • the volatile memory can be RAM, which acts as an external cache.
  • many forms of RAM are available, such as SRAM, DRAM, SDRAM, DDR SDRAM, ESDRAM, SLDRAM, and DR RAM. It should be noted that the memories of the systems and methods described herein are intended to comprise, without being limited to, these and any other suitable types of memory.
  • FIG. 8 is a schematic block diagram of an EMS 800 in accordance with another embodiment of the present invention.
  • EMS 800 includes:
  • the receiving unit 810 is configured to receive the fault information sent by the VNFM and the identifier of the VNF, where the fault information is used to indicate that the first object of the infrastructure layer is faulty, and the VNF is associated with the first object;
  • the receiving unit 810 is further configured to receive service abnormality information sent by the VNF.
  • the sending unit 820 is configured to send a second indication message to the VNFM according to the fault information, the identifier of the VNF, and the service abnormality information received by the receiving unit 810, where the second indication message includes the second abnormality information of the service layer and the identifier of the third object of the service layer corresponding to the second abnormality information; the second abnormality information is associated with the cause of the failure in the infrastructure layer, the third object is associated with the target object causing the failure in the infrastructure layer, and the third object is the business object on the VNF.
  • the second indication message is used to instruct the VNFM to generate, according to the mapping relationship between the service layer and the virtual layer and the second indication message, a first indication message for processing the fault, and to send the first indication message to the virtualization infrastructure manager VIM; the first indication message includes the first abnormality information of the virtual layer and the identifier of the second object of the virtual layer corresponding to the first abnormality information.
  • the EMS performs correlation analysis according to fault or abnormal information of the infrastructure layer, the virtual layer, and the service layer, and can determine the abnormal information of the service layer associated with the cause of the fault in the infrastructure layer. The abnormal information of the service layer is then sent to the VNFM, and the corresponding processing flow is executed by the VNFM, which can solve the fault of the infrastructure layer and achieve fault recovery.
  • the fault processing apparatus 800 may further include a determining unit 830, configured to perform fault correlation analysis according to the fault information, the identifier of the VNF, and the service abnormality information received by the receiving unit 810, determine the second abnormality information and the identifier of the third object, and generate the second indication message.
  • the receiving unit 810 may be implemented by a receiver
  • the sending unit 820 may be implemented by a transmitter
  • the determining unit 830 may be implemented by a processor.
  • a fault handling apparatus may include a processor 910, a memory 920, a receiver 930, a transmitter 940, and a bus system 950.
  • the memory 920 can be used to store instructions or code or the like executed by the processor 910.
  • the various components in the fault handling device 900 are coupled together by the bus system 950, which may include, in addition to the data bus, a power bus, a control bus, and a status signal bus.
  • the fault processing device 800 shown in FIG. 8 or the fault processing device 900 shown in FIG. 9 can implement the various processes implemented by the EMS in the foregoing method embodiments. To avoid repetition, details are not described herein again.
  • the above described method embodiments of the present invention may be applied to a processor or implemented by a processor.
  • the processor may be an integrated circuit chip with signal processing capabilities.
  • each step of the foregoing method embodiment may be completed by an integrated logic circuit of hardware in a processor or an instruction in a form of software.
  • the above described processor may be a general purpose processor, DSP, ASIC, FPGA or other programmable logic device, discrete gate or transistor logic device, discrete hardware component.
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention may be implemented or carried out.
  • the general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present invention may be directly implemented by the hardware decoding processor, or may be performed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the storage medium is located in the memory, and the processor reads the information in the memory and combines the hardware to complete the steps of the above method.
  • the memory in the embodiments of the present invention may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory may be a ROM, a PROM, an EPROM, an EEPROM, or a flash memory.
  • the volatile memory can be RAM, which acts as an external cache.
  • many forms of RAM are available, such as SRAM, DRAM, SDRAM, DDR SDRAM, ESDRAM, SLDRAM, and DR RAM. It should be noted that the memories of the systems and methods described herein are intended to comprise, without being limited to, these and any other suitable types of memory.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of units is only a logical function division.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, or an electrical, mechanical or other form of connection.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the embodiments of the present invention.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • Computer readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one location to another.
  • a storage medium may be any available media that can be accessed by a computer.
  • computer readable media may comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
  • Any connection may suitably be a computer readable medium.
  • if the software is transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of the medium.
  • disks and discs, as used in the invention, include compact discs (CDs), laser discs, optical discs, digital versatile discs (DVDs), floppy disks, and Blu-ray discs, where disks typically reproduce data magnetically while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer readable media.

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

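Embodiments of the present invention provide a fault processing method and apparatus. The method includes: a VNFM receives fault information sent by a VIM, where the fault information indicates a fault occurring on a first object in the infrastructure layer; the VNFM determines the VNF associated with the first object; and the VNFM sends a first indication message to the VIM according to the fault information and the VNF. The first indication message includes first abnormality information of the virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information. The first abnormality information is associated with the cause of the fault in the infrastructure layer, the second object is associated with the target object that causes the fault in the infrastructure layer, and the second object provides virtual resources for the VNF. The fault processing method of the embodiments avoids cross-layer interface calls and can therefore reduce system complexity.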

Description

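Fault processing method and apparatus
Technical Field
The present invention relates to the field of communications technologies, and in particular, to a fault processing method and apparatus.
Background
As wireless networks evolve toward the cloud, more and more operators pay attention to fault operation and maintenance capabilities and to basic system reliability after equipment is moved to the cloud, and fast fault recovery has become a key capability of concern to operators.
In a network function virtualization (NFV) scenario, the element management system (EMS) uniformly schedules the management systems of each layer to implement fault recovery. If a node or a virtual machine (VM) fails, the service layer detects the failure through heartbeat. After detecting a fault, the service layer first performs service recovery and then performs fault repair.
In the current fault recovery solution, the service layer uniformly schedules the management systems of the service layer, the virtual layer, and the infrastructure layer (also referred to as the I layer) to implement fault recovery. For example, if the infrastructure layer fails, the virtualized infrastructure manager (VIM) reports the fault information to the service layer through the virtual layer, and the EMS of the service layer analyzes the cause of the fault and then delivers a fault repair instruction. Specifically, the EMS delivers the fault repair instruction to the VNFM, and the VNFM calls the self-healing interface of the I layer to trigger the I layer to perform a self-healing procedure. Such cross-layer interface calls may make the system structure complex.
Summary
Embodiments of the present invention provide a fault processing method and apparatus that avoid cross-layer interface calls during fault processing and can thereby reduce system complexity.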
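According to a first aspect, a fault processing method is provided, including:
a virtualized network function manager (VNFM) receives fault information sent by a virtualized infrastructure manager (VIM), where the fault information indicates a fault occurring on a first object in the infrastructure layer;
the VNFM determines the virtualized network function (VNF) associated with the first object; and
the VNFM sends a first indication message to the VIM according to the fault information and the VNF, where the first indication message includes first abnormality information of the virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information, the first abnormality information is associated with the cause of the fault in the infrastructure layer, the second object is associated with the target object that causes the fault in the infrastructure layer, and the second object provides virtual resources for the VNF.
It can be understood that the first indication message may be used to instruct the VIM to determine the cause and the target object according to the mapping relationship between the infrastructure layer and the virtual layer, the first abnormality information, and the identifier of the second object, and to process the fault.
Based on the foregoing technical solution, the VNFM sends to the VIM the virtual-layer abnormality information that is associated with the cause of the fault in the infrastructure layer, so that the VIM can determine the infrastructure-layer cause from this abnormality information and perform the corresponding fault processing procedure. This avoids calling a self-healing interface across layers to trigger the self-healing procedure and can therefore reduce system complexity.
With reference to the first aspect, in a first possible implementation of the first aspect, that the VNFM sends the first indication message to the VIM according to the fault information and the VNF includes: if the VNFM can determine the first abnormality information according to the fault information and the VNF, the VNFM sends the first indication message to the VIM.
If the VNFM, by performing correlation analysis on the fault or abnormality information of the infrastructure layer and the virtual layer, can determine the virtual-layer abnormality information associated with the cause of the fault in the infrastructure layer, it sends an indication message including this abnormality information to the VIM. In this case the fault information does not need to be reported to the EMS, so the faults of the layers are decoupled and the fault recovery time is greatly shortened.
With reference to the first aspect, in a second possible implementation of the first aspect, that the VNFM sends the first indication message to the VIM according to the fault information and the VNF includes:
if the VNFM cannot determine the first abnormality information according to the fault information and the VNF, the VNFM sends the fault information and the identifier of the VNF to an element management system (EMS);
the VNFM receives a second indication message sent by the EMS, where the second indication message is generated by the EMS according to the fault information, the VNF, and service abnormality information sent by the VNF to the EMS, the second indication message includes second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information, the second abnormality information is associated with the cause, the third object is associated with the target object, and the third object is a service object on the VNF;
the VNFM generates the first indication message according to the mapping relationship between the virtual layer and the service layer and the second indication message; and
the VNFM sends the first indication message to the VIM.
If the VNFM, by performing correlation analysis on the fault or abnormality information of the infrastructure layer and the virtual layer, cannot determine the virtual-layer abnormality information associated with the cause of the fault in the infrastructure layer, it reports the fault information to the EMS. The EMS performs correlation analysis on the fault or abnormality information of the infrastructure layer, the virtual layer, and the service layer, and the VNFM then performs the corresponding processing procedure according to the service-layer abnormality information, delivered by the EMS, that is associated with the cause of the fault in the infrastructure layer. In this way the fault of the infrastructure layer can be resolved and fault recovery is achieved.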
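The two implementations of the first aspect can be read as a single decision in the VNFM. The following is a minimal sketch under stated assumptions: the `Indication` type and the three callables (`correlate`, `ask_ems`, `to_virtual_layer`) are hypothetical names introduced here for illustration and do not appear in the patent.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Indication:
    """Abnormality information plus the identifier of the object it concerns."""
    abnormality: str
    object_id: str


def vnfm_first_indication(
    fault: str,
    vnf_id: str,
    correlate: Callable[[str, str], Optional[Indication]],
    ask_ems: Callable[[str, str], Indication],
    to_virtual_layer: Callable[[Indication], Indication],
) -> Indication:
    """Return the first indication message that the VNFM sends to the VIM."""
    # First implementation: the VNFM's own correlation analysis succeeds.
    first = correlate(fault, vnf_id)
    if first is None:
        # Second implementation: report to the EMS and receive the
        # service-layer (second) indication message.
        second = ask_ems(fault, vnf_id)
        # Map service-layer information to virtual-layer information.
        first = to_virtual_layer(second)
    return first
```

In this reading the EMS is contacted only when the local analysis returns nothing, which matches the stated goal of decoupling the layers.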
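According to a second aspect, a fault processing method is provided, including:
a virtualized infrastructure manager (VIM) determines that a first object in the infrastructure layer has failed;
if the VIM cannot determine the cause of the fault in the infrastructure layer, or the VIM cannot resolve the fault after performing a first self-healing procedure according to a determined first cause of the fault, the VIM sends fault information to a virtualized network function manager (VNFM), where the fault information indicates the fault occurring on the first object;
the VIM receives a first indication message sent by the VNFM, where the first indication message includes first abnormality information of the virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information; and
the VIM determines the cause and the target object that causes the fault in the infrastructure layer according to the mapping relationship between the infrastructure layer and the virtual layer, the first abnormality information, and the identifier of the second object.
It can be understood that, after learning the target object and the cause of the fault, the VIM can perform a targeted second self-healing procedure to process the fault.
Based on the foregoing technical solution, when the VIM cannot resolve a fault, it sends fault information to the VNFM so that the VNFM or the EMS can perform correlation analysis. The cause of the fault can thus be determined and the corresponding self-healing procedure performed, achieving fault recovery.
With reference to the second aspect, in a first possible implementation of the second aspect, the first indication message is generated by the VNFM according to the fault information and the virtualized network function (VNF) corresponding to the first object.
With reference to the second aspect, in a second possible implementation of the second aspect, the first indication message is generated by the VNFM according to the mapping relationship between the virtual layer and the service layer and a second indication message received from an element management system (EMS). The second indication message is generated by the EMS according to the fault information and the VNF corresponding to the first object, both received from the VNFM, and service abnormality information received from the VNF. The second indication message includes second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information, the second abnormality information is associated with the cause, the third object is associated with the target object, and the third object is a service object on the VNF.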
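According to a third aspect, a fault processing method is provided, including:
an element management system (EMS) receives fault information and an identifier of a virtualized network function (VNF) that are sent by a virtualized network function manager (VNFM), where the fault information indicates a fault occurring on a first object in the infrastructure layer, and the VNF is associated with the first object;
the EMS receives service abnormality information sent by the VNF; and
the EMS sends a second indication message to the VNFM according to the fault information, the VNF, and the service abnormality information, where the second indication message includes second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information, the second abnormality information is associated with the cause of the fault in the infrastructure layer, the third object is associated with the target object that causes the fault in the infrastructure layer, and the third object is a service object on the VNF.
It can be understood that the second indication message may be used to instruct the VNFM to generate, according to the mapping relationship between the service layer and the virtual layer and the second indication message, a first indication message for processing the fault, and to send it to the virtualized infrastructure manager (VIM), where the first indication message includes first abnormality information of the virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information.
By performing correlation analysis on the fault or abnormality information of the infrastructure layer, the virtual layer, and the service layer, the EMS can determine the service-layer abnormality information associated with the cause of the fault in the infrastructure layer. The EMS then sends this service-layer abnormality information to the VNFM, and the VNFM performs the corresponding processing procedure, so that the fault of the infrastructure layer can be resolved and fault recovery is achieved.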
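According to a fourth aspect, a system for fault processing is provided, including:
a virtualized network function manager (VNFM), configured to perform the fault processing method according to the first aspect or any one of the foregoing possible implementations of the first aspect;
a virtualized infrastructure manager (VIM), configured to perform the fault processing method according to the second aspect or any one of the foregoing possible implementations of the second aspect; and
an element management system (EMS), configured to perform the fault processing method according to the third aspect.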
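According to a fifth aspect, a virtualized network function manager (VNFM) is provided, where the VNFM is configured to perform the fault processing method according to the first aspect or any one of the foregoing possible implementations of the first aspect. Specifically, the VNFM includes:
a receiving unit, configured to receive fault information sent by a virtualized infrastructure manager (VIM), where the fault information indicates a fault occurring on a first object in the infrastructure layer;
a determining unit, configured to determine the virtualized network function (VNF) corresponding to the first object; and
a sending unit, configured to send a first indication message to the VIM according to the fault information received by the receiving unit and the VNF determined by the determining unit, where the first indication message includes first abnormality information of the virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information, the first abnormality information is associated with the cause of the fault in the infrastructure layer, the second object is associated with the target object that causes the fault in the infrastructure layer, and the second object provides virtual resources for the VNF.
It can be understood that the first indication message may be used to instruct the VIM to determine the cause and the target object according to the mapping relationship between the infrastructure layer and the virtual layer, the first abnormality information, and the identifier of the second object, and to process the fault.
With reference to the fifth aspect, in a first possible implementation of the fifth aspect, the determining unit is further configured to analyze, according to the fault information and the VNF, whether the first abnormality information can be determined; and the sending unit is specifically configured to send the first indication message to the VIM if the determining unit can determine the first abnormality information.
With reference to the fifth aspect, in a second possible implementation of the fifth aspect, the determining unit is further configured to analyze, according to the fault information and the VNF, whether the first abnormality information can be determined;
the sending unit is specifically configured to send the fault information and the identifier of the VNF to an element management system (EMS) if the determining unit cannot determine the first abnormality information;
the receiving unit is further configured to receive a second indication message sent by the EMS, where the second indication message is generated by the EMS according to the fault information, the VNF, and service abnormality information sent by the VNF to the EMS, the second indication message includes second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information, the second abnormality information is associated with the cause, the third object is associated with the target object, and the third object is a service object on the VNF;
the determining unit is further configured to generate the first indication message according to the mapping relationship between the virtual layer and the service layer and the second indication message received by the receiving unit; and
the sending unit is specifically configured to send the first indication message to the VIM.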
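According to a sixth aspect, a virtualized infrastructure manager (VIM) is provided, where the VIM is configured to perform the fault processing method according to the second aspect or any one of the foregoing possible implementations of the second aspect. Specifically, the VIM includes:
a determining unit, configured to determine that a first object in the infrastructure layer has failed;
the determining unit is further configured to analyze whether the cause of the fault can be determined;
a sending unit, configured to send fault information to a virtualized network function manager (VNFM) if the determining unit cannot determine the cause of the fault, or if the determining unit cannot resolve the fault after performing a first self-healing procedure according to a determined first cause of the fault, where the fault information indicates the fault occurring on the first object; and
a receiving unit, configured to receive a first indication message sent by the VNFM, where the first indication message includes first abnormality information of the virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information;
the determining unit is further configured to determine the cause and the target object that causes the fault in the infrastructure layer according to the mapping relationship between the infrastructure layer and the virtual layer, the first abnormality information, and the identifier of the second object.
It can be understood that, after learning the target object and the cause of the fault, the VIM can perform a targeted second self-healing procedure to process the fault.
With reference to the sixth aspect, in a first possible implementation of the sixth aspect, the first indication message is generated by the VNFM according to the fault information and the virtualized network function (VNF) corresponding to the first object.
With reference to the sixth aspect, in a second possible implementation of the sixth aspect, the first indication message is generated by the VNFM according to the mapping relationship between the virtual layer and the service layer and a second indication message received from an element management system (EMS). The second indication message is generated by the EMS according to the fault information and the VNF corresponding to the first object, both received from the VNFM, and service abnormality information received from the VNF. The second indication message includes second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information, the second abnormality information is associated with the cause, the third object is associated with the target object, and the third object is a service object on the VNF.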
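According to a seventh aspect, an element management system (EMS) is provided, where the EMS is configured to perform the fault processing method according to the third aspect. Specifically, the EMS includes:
a receiving unit, configured to receive fault information and an identifier of a virtualized network function (VNF) that are sent by a virtualized network function manager (VNFM), where the fault information indicates a fault occurring on a first object in the infrastructure layer, and the VNF is associated with the first object;
the receiving unit is further configured to receive service abnormality information sent by the VNF; and
a sending unit, configured to send a second indication message to the VNFM according to the fault information, the VNF, and the service abnormality information received by the receiving unit, where the second indication message includes second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information, the second abnormality information is associated with the cause of the fault in the infrastructure layer, the third object is associated with the target object that causes the fault in the infrastructure layer, and the third object is a service object on the VNF.
It can be understood that the second indication message may be used to instruct the VNFM to generate, according to the mapping relationship between the service layer and the virtual layer and the second indication message, a first indication message for processing the fault, and to send it to the virtualized infrastructure manager (VIM), where the first indication message includes first abnormality information of the virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information.
According to an eighth aspect, a VNFM is provided, including a processor, a memory, and a bus system. The processor and the memory are connected through the bus system, the memory is configured to store instructions, and the processor is configured to execute the instructions stored in the memory, so that the VNFM performs the fault processing method according to the first aspect or any one of the foregoing possible implementations of the first aspect.
According to a ninth aspect, a VIM is provided, including a processor, a memory, and a bus system. The processor and the memory are connected through the bus system, the memory is configured to store instructions, and the processor is configured to execute the instructions stored in the memory, so that the VIM performs the fault processing method according to the second aspect or any one of the foregoing possible implementations of the second aspect.
According to a tenth aspect, an EMS is provided, including a processor, a memory, and a bus system. The processor and the memory are connected through the bus system, the memory is configured to store instructions, and the processor is configured to execute the instructions stored in the memory, so that the EMS performs the fault processing method according to the third aspect.
According to an eleventh aspect, a computer readable storage medium is provided, where the computer readable storage medium stores a program, and running the program causes a VNFM to perform the fault processing method according to the first aspect or any one of the foregoing possible implementations of the first aspect.
According to a twelfth aspect, a computer readable storage medium is provided, where the computer readable storage medium stores a program, and running the program causes a VIM to perform the fault processing method according to the second aspect or any one of the foregoing possible implementations of the second aspect.
According to a thirteenth aspect, a computer readable storage medium is provided, where the computer readable storage medium stores a program, and running the program causes an EMS to perform the fault processing method according to the third aspect.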
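Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings required in the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic architecture diagram of a network function virtualization system to which the fault processing method of the embodiments of the present invention is applicable;
FIG. 2 is a schematic flowchart of an existing fault processing method;
FIG. 3 is a schematic flowchart of a fault processing method according to an embodiment of the present invention;
FIG. 4 is a schematic block diagram of a VNFM according to an embodiment of the present invention;
FIG. 5 is a schematic block diagram of a VNFM according to another embodiment of the present invention;
FIG. 6 is a schematic block diagram of a VIM according to an embodiment of the present invention;
FIG. 7 is a schematic block diagram of a VIM according to another embodiment of the present invention;
FIG. 8 is a schematic block diagram of an EMS according to an embodiment of the present invention;
FIG. 9 is a schematic block diagram of an EMS according to another embodiment of the present invention.
Detailed Description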
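The technical solutions in the embodiments of the present invention are described below with reference to the accompanying drawings in the embodiments.
The NFV system to which the fault processing method of the embodiments is applicable is described first.
FIG. 1 shows a schematic architecture diagram of an NFV system 100 to which the fault processing method of the embodiments is applicable. The NFV system may be implemented over various networks, such as a data center network, a service provider network, or a local area network (LAN). As shown in FIG. 1, the NFV system 100 may include:
a management and orchestration system (MANO) 128,
an NFV infrastructure (NFVI) 130,
multiple virtualized network functions (VNF) 108,
multiple element management systems (EMS) 122,
a Service VNF and Infrastructure Description system 126, and
one or more operation support systems/business support systems (OSS/BSS) 124.
The MANO 128 may include an orchestrator 102, one or more VNF managers (VNFM) 104, and one or more virtualized infrastructure managers (VIM) 106.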
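The NFVI 130 may include a hardware resource layer composed of computing hardware 112, storage hardware 114, and network hardware 116; a virtualization layer; and a virtual resource layer composed of virtual computing 110 (for example, virtual machines), virtual storage 118, and virtual network 120. The computing hardware 112 may be a dedicated processor or a general-purpose processor that provides processing and computing functions. The storage hardware 114 provides storage capability, which may be provided by the storage hardware 114 itself (for example, the local memory of a server) or over a network (for example, a server connected to a network storage device through a network). The network hardware 116 may be switches, routers, and/or other network devices, and is used to implement communication between multiple devices that are connected wirelessly or by wire. The virtualization layer in the NFVI 130 abstracts the hardware resources of the hardware resource layer, decouples the VNF 108 from the physical layer to which the hardware resources belong, and provides virtual resources to the VNF.
As shown in FIG. 1, the virtual resources may include virtual computing 110, virtual storage 118, and virtual network 120. The virtual computing 110 and the virtual storage 118 may provide virtual resources to the VNF 108 in the form of virtual machines or other virtual containers. The virtualization layer forms the virtual network 120 by abstracting the network hardware 116. The virtual network 120, for example virtual switches (vSwitches), is used to implement communication between multiple virtual machines or between multiple other types of virtual containers that carry VNFs. Virtualization of the network hardware may be implemented by technologies such as virtual LAN (VLAN), Virtual Private LAN Service (VPLS), Virtual eXtensible Local Area Network (VxLAN), or Network Virtualization using Generic Routing Encapsulation (NVGRE).
The OSS/BSS 124 is mainly oriented to telecom operators and provides integrated network management and service operation functions, including network management (for example, fault monitoring and network information collection), billing management, and customer service management. The Service VNF and Infrastructure Description system 126 is described in detail in the ETSI GS NFV 002 v1.1.1 standard and is not described further here.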
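The MANO 128 may be used to monitor and manage the VNF 108 and the NFVI 130. The orchestrator 102 may communicate with one or more VNFMs 104 to implement resource-related requests, send configuration information to the VNFM 104, and collect status information of the VNF 108. In addition, the orchestrator 102 may communicate with the virtualized infrastructure manager 106 to implement resource allocation and/or to reserve and exchange configuration information and status information of virtualized hardware resources. The VNFM 104 may be used to manage one or more VNFs 108 and perform various management functions, such as initializing, updating, querying, and/or terminating the VNF 108. The VIM 106 may be used to control and manage the interaction of the VNF 108 with the computing hardware 112, storage hardware 114, network hardware 116, virtual computing 110, virtual storage 118, and virtual network 120. For example, the VIM 106 may be used to allocate resources to the VNF 108. The VNFM 104 and the VIM 106 may communicate with each other to exchange configuration and status information of virtualized hardware resources.
The NFVI 130 contains hardware and software that together establish a virtualized environment in which to deploy, manage, and execute the VNF 108. In other words, the hardware resource layer and the virtual resource layer provide virtual resources, such as virtual machines and/or other forms of virtual containers, to the VNF 108.
As shown in FIG. 1, the VNFM 104 may communicate with the VNF 108 and the EMS 122 to perform VNF lifecycle management and exchange configuration/status information. The VNF 108 is a virtualization of at least one network function that was previously provided by a physical network device. For example, in one implementation, the VNF 108 may be a virtualized mobility management entity (MME) node that provides all network functions provided by a typical non-virtualized MME device. In another implementation, the VNF 108 may implement the functions of only some of the components provided on a non-virtualized MME device. One or more VNFs 108 may be deployed on one virtual machine (or another form of virtual container). The EMS 122 may be used to manage one or more VNFs.
The embodiments of the present invention mainly involve the infrastructure layer (also referred to as the I layer), the virtual layer, and the service layer. The infrastructure layer includes the NFVI and the VIM; the virtual layer includes the VNFM and the virtual resources (virtual computing, virtual storage, and virtual network) within the domain of that VNFM; and the service layer includes the VNF and the EMS.
The virtual resources in the infrastructure layer include the virtual resources within the respective domains of multiple VNFMs; that is, the virtual resources within the domain of one VNFM are a part of the virtual resources in the infrastructure layer. The VNFM may be used for lifecycle management of one or more VNFs deployed on the virtual resources within its domain, and the EMS is used to manage the service objects on the VNF.
It should be understood that a fault caused in the infrastructure layer may propagate to the virtual layer and the service layer, causing faults in both; a fault caused in the virtual layer may likewise propagate to the service layer and cause a service-layer fault.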
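The existing fault processing procedure is described below with reference to FIG. 2, using a fault occurring at a VM in the infrastructure layer as an example.
201. The VM fails.
202. The VNF finds that the VM is abnormal, for example, that resources are insufficient, and reports service abnormality information to the EMS.
203. The VIM detects that the VM has failed, for example, that resources on the VM cannot be invoked.
The VIM may detect the VM failure by means such as heartbeat detection.
204. The VIM feeds back the fault information of the VM to the VNFM.
205. The VNFM feeds back the fault and the identifier of the VNF corresponding to the VM to the EMS.
206. The EMS performs abnormality analysis according to the information fed back by the VNFM and the service abnormality information reported by the VNF.
207. The EMS sends a corresponding HA instruction, for example, to restart the VM, to the VNFM according to the analysis result.
208. The VNFM calls the self-healing interface of the VIM to send the HA instruction to the VM.
209. The VM executes the HA instruction.
As the foregoing procedure shows, in the existing solution a fault in the infrastructure layer is reported through the VNFM to the EMS of the service layer, and the EMS makes the decision. Because the faults of all layers are ultimately aggregated at the service layer, the processing delay of fault repair becomes longer.
In addition, in the existing solution an upper-layer network element of the service layer calls the interface of a lower-layer network element to trigger the self-healing procedure, and such mutual calls may make the structure complex.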
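In the embodiments of the present invention, a fault occurring in the infrastructure layer includes a hardware resource fault (for example, a host) or a virtual resource fault (for example, a virtual machine). If the VIM can determine the cause of the infrastructure-layer fault, the VIM makes the decision. If the VIM cannot determine the cause, it reports the fault information to the VNFM of the virtual layer. Likewise, if the VNFM can determine the cause of the fault, the VNFM makes the decision; if the VNFM cannot determine the cause either, it reports the fault information and the identifier of the VNF corresponding to the fault information to the EMS.
The embodiments of the present invention provide a fault processing method that decouples fault recovery across the layers, so that each layer resolves the faults it is able to handle. This greatly shortens the fault recovery time and closes the fault loop over the shortest path. Specifically, the EMS is responsible for faults that the service layer can resolve, the VNFM is responsible for deciding on faults that the virtual layer can resolve, and the VIM is responsible for faults that the infrastructure layer can resolve.
Moreover, in the embodiments of the present invention, an upper-layer network element does not directly call the interface of a lower-layer network element. Instead, it uses a generic interface to send to the lower-layer network element the abnormality information of its own layer that is associated with the cause of the fault in the infrastructure layer, and the lower-layer network element decides how to process the fault according to this abnormality information.
FIG. 3 is a schematic flowchart of a fault processing method according to an embodiment of the present invention. As shown in FIG. 3, the fault processing method 300 includes the following content.
When an object in the infrastructure layer fails, both the service layer and the infrastructure layer detect the corresponding fault, and the corresponding modules report it so that correlation analysis can confirm the root cause of the fault and layered self-healing operations are triggered at the same time. The specific procedure is as follows: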
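301. The first object fails.
The first object may be a resource object in the infrastructure layer that provides virtual resources to upper-layer services, such as a VM, a virtual network card (vNIC), a host (HOST), or a board object (BOARD).
302. The VNF detects service abnormality information for the service running on the first object and reports the service abnormality information of the first object to the EMS.
For example, the service abnormality information may include service-layer blocking, service interruption, a jump in a service indicator, a sudden drop in a service indicator, a burst of heavy traffic, and so on.
At the same time, the VNF triggers service recovery according to the configured service-layer self-healing policy. In active/standby networking mode, an active/standby switchover is performed and the standby object takes over the service of the failed first object. In load-sharing networking mode, service flows are redistributed and the traffic of the failed first object is allocated to other normal objects.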
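The two service recovery modes in step 302 can be illustrated as follows. This is a sketch only: the function names are hypothetical, and the even split of the failed object's traffic is an assumption, since the patent does not say how traffic is divided among the remaining objects.

```python
def switch_over(active: str, standby: str, failed: str) -> str:
    """Active/standby mode: the standby object takes over if the active one failed."""
    return standby if failed == active else active


def redistribute(load: dict[str, float], failed: str) -> dict[str, float]:
    """Load-sharing mode: spread the failed object's traffic evenly over the others."""
    healthy = {k: v for k, v in load.items() if k != failed}
    if failed not in load or not healthy:
        return dict(load)  # nothing to move, or nowhere to move it
    share = load[failed] / len(healthy)  # assumption: even split
    return {k: v + share for k, v in healthy.items()}
```

For example, with loads of 30, 10, and 20 on objects `a`, `b`, and `c`, a failure of `a` would leave `b` with 25 and `c` with 35 under this even-split assumption.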
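303. The VIM detects that the first object has failed.
For example, the infrastructure layer detects the fault of the first object and reports it to the VIM.
It should be noted that the order of steps 302 and 303 is not limited; it may be determined by the detection speed of the VIM and the VNF.
304. The VIM analyzes the cause of the fault. If it can determine the cause of the fault in the infrastructure layer, step 305 is performed; if it cannot determine the cause, step 306 is performed.
305. The VIM performs the self-healing procedure corresponding to the cause and determines whether the fault has been resolved. If the fault has been resolved, the procedure ends; if not, step 306 is performed.
For example, default self-healing procedures corresponding to different causes may be preconfigured in the VIM. A default self-healing procedure may be a single specified procedure, such as a soft reboot or a hard reboot, or several procedures in priority order. For example, when the VIM detects that the first object has failed, it performs a soft reboot by default; if that does not resolve the fault, it proceeds to a hard reboot by default; and if the fault is still not resolved, it proceeds to a local rebuild by default, and so on.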
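The prioritized default self-healing procedure in step 305 amounts to trying actions in order until one succeeds. A minimal sketch, assuming each action is a callable that returns `True` when the fault is resolved (a convention introduced here, not taken from the patent):

```python
from typing import Callable, Optional


def run_self_healing(actions: list[tuple[str, Callable[[], bool]]]) -> Optional[str]:
    """Try each action in priority order; return the name of the first that resolves the fault."""
    for name, action in actions:
        if action():
            return name
    # Unresolved: in the described flow the VIM would then report the
    # fault information to the VNFM (step 306).
    return None
```

A caller might pass `[("soft_reboot", ...), ("hard_reboot", ...), ("local_rebuild", ...)]`, mirroring the order given in the text.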
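In another embodiment, after the VNF reports the service abnormality information to the EMS, the EMS may trigger the VIM, according to the reported service abnormality information, to perform a self-healing procedure for the fault of the first object. As a result, the VIM might perform the self-healing procedure for the same fault more than once.
Optionally, to avoid repeating the self-healing procedure when the fault has already been resolved, before step 305 the VIM may also determine whether the fault has been resolved: if the fault has not been resolved, step 305 is performed; if it has, the procedure ends.
Specifically, the VIM may maintain history information about self-healing operations and save one record for each self-healing procedure. For example, a record may include the identifier of the operated object, the self-healing operation, the operation time, and the self-healing result. The self-healing result may be determined by checking whether the corresponding alarm has been cleared. In this way, the VIM can determine whether a fault has been resolved according to whether a self-healing record exists for the corresponding object. For example, the VIM may determine, according to the identifier of the first object, whether the fault of the first object has been resolved.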
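The self-healing history described above could be modeled as a list of records keyed by object identifier. The sketch below is illustrative: the field names are hypothetical, and it assumes records are appended in chronological order so that the last record for an object reflects its current state.

```python
from dataclasses import dataclass


@dataclass
class HealingRecord:
    object_id: str   # identifier of the operated object
    action: str      # self-healing operation, e.g. "soft_reboot"
    time: str        # operation time
    resolved: bool   # self-healing result (e.g. the corresponding alarm has cleared)


def fault_already_resolved(history: list[HealingRecord], object_id: str) -> bool:
    """True if the most recent record for the object reports a successful self-healing."""
    records = [r for r in history if r.object_id == object_id]
    return bool(records) and records[-1].resolved
```

Checking this before step 305 lets the VIM skip a self-healing procedure that another trigger has already completed.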
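That is, if the VIM can determine the cause of the fault, it can resolve the fault by performing the corresponding self-healing procedure. If the VIM cannot determine the cause of the fault, it reports the fault information to the VNFM. It should be noted that, in another embodiment, the VIM may also misjudge: the VIM determines a cause of the fault, but after performing the self-healing procedure corresponding to that cause it finds that the fault has not been resolved. In this case the VIM also reports the fault information to the VNFM.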
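306. The VIM reports fault information to the VNFM, where the fault information indicates the fault occurring on the first object.
307. The VNFM determines, according to the mapping relationship between VNFs and VMs, the VNF associated with the first object, that is, the VNF deployed on the first object.
308. The VNFM performs fault correlation analysis for the infrastructure layer and the virtual layer according to the fault information reported by the VIM and the VNF. If the VNFM can determine the virtual-layer abnormality information associated with the cause of the fault, step 311 is performed; if it cannot, step 309 is performed.
For example, the virtual-layer abnormality information that the VNFM determines to be associated with the cause of the fault of the first object may be: a GuestOS memory leak, high CPU usage, a network port fault, a switch fault, and so on.
At the same time, the VNFM also determines the virtual-layer object involved in the abnormality information, where this virtual-layer object is associated with the target object in the first object that causes the fault. Association here means that the abnormality information of the virtual-layer object is correlated with the fault of the target object; that is, a fault of the target object in the infrastructure layer causes a fault or abnormality of the virtual-layer object. A fault of the target object in the infrastructure layer also causes a fault of the service object in the service layer.
309. The VNFM reports to the EMS the fault information and the identifier of the VNF determined by the VNFM in step 307.
310. The EMS performs correlation analysis on the fault or abnormality information of the service layer, the virtual layer, and the infrastructure layer according to the fault reported by the VNFM, the VNF indicated by the VNF identifier, and the service abnormality information reported by the VNF. It determines the service-layer abnormality information and sends a second indication message to the VNFM, where the second indication message includes the service-layer abnormality information and the identifier of a third object of the service layer corresponding to the abnormality information. The service-layer abnormality information is associated with the cause of the fault in the infrastructure layer, the third object is associated with the target object that causes the fault in the infrastructure layer, and the third object is a service object on the VNF.
For example, the EMS can determine that the service-layer abnormality information is that there are not enough resources to support the requested service.
Optionally, the second indication message may further include the identifier of the VNF, which the VNFM can use to locate the fault.
311. The VNFM sends a first indication message to the VIM, where the first indication message includes virtual-layer abnormality information and the identifier of a second object of the virtual layer corresponding to the abnormality information. The abnormality information is associated with the cause of the fault in the infrastructure layer, the second object is associated with the target object that causes the fault in the infrastructure layer, and the second object provides virtual resources for the VNF determined by the VNFM.
The abnormality information may include an identifier (ID) of the abnormality information and the time at which the abnormality information was generated.
The first indication message may be generated by the VNFM according to the fault information sent by the VIM in step 306 and the VNF determined by the VNFM in step 307.
Alternatively, the first indication message may be generated by the VNFM according to the mapping relationship between the virtual layer and the service layer and the second indication message sent by the EMS in step 310. Specifically, the VNFM determines the virtual-layer abnormality information from the service-layer abnormality information and determines the second object of the virtual layer from the third object of the service layer.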
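The translation from the second indication message to the first can be pictured as two table lookups. The sketch below assumes the mapping relationship between the service layer and the virtual layer is available as two dictionaries; the dictionary names, message keys, and sample values are all hypothetical.

```python
def map_second_to_first(
    second: dict[str, str],
    abnormality_map: dict[str, str],
    object_map: dict[str, str],
) -> dict[str, str]:
    """Derive the virtual-layer (first) indication from the service-layer (second) one.

    Raises KeyError if the mapping relationship has no entry for the input.
    """
    return {
        # service-layer abnormality -> virtual-layer abnormality
        "abnormality": abnormality_map[second["abnormality"]],
        # third object (service layer) -> second object (virtual layer)
        "object_id": object_map[second["object_id"]],
    }
```

How the VNFM actually stores or derives this mapping is not specified in the text; a lookup table is simply the most direct way to illustrate it.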
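In an NFV system, a lower layer can perceive only the layer directly above it. For example, the VIM can perceive only the objects of the virtual layer where the VNFM is located and cannot perceive the objects of the service layer where the EMS is located. Therefore, the indication message between the EMS and the VNFM differs from the indication message between the VNFM and the VIM.
Optionally, the first indication message may further include the identifier of the first object and the tenant identifier (tenant_id) associated with the first object. The VIM uses the identifier of the first object and the associated tenant_id to locate the fault.
Optionally, in another embodiment, to avoid repeating the self-healing procedure when the fault has already been resolved, before step 311 the VNFM may also determine whether the fault has been resolved: if the fault has not been resolved, step 311 is performed; if it has, the procedure ends.
Specifically, the VNFM may maintain history information about the self-healing operations of the multiple objects (such as VMs) within its scope and save one record for each self-healing procedure. For example, a record may include the identifier of the operated object, the self-healing operation, the operation time, and the self-healing result. The self-healing result may be determined by checking whether the corresponding alarm has been cleared. In this way, the VNFM can determine whether a fault has been resolved according to whether a self-healing record exists for the corresponding object. For example, the VNFM may determine, according to the identifier of the first object, whether the fault has been resolved.
312. The VIM determines the cause of the fault in the infrastructure layer according to the virtual-layer abnormality information in the received first indication message, determines the target object that causes the fault in the infrastructure layer according to the identifier of the second object, and performs a self-healing procedure for the target object according to the cause to process the fault.
The target object may be a unit or module in the first object. For example, if the first object is a VM, the target object may be a virtual port, a virtual network, or the like in the VM. The embodiments of the present invention are not limited thereto, however, and the target object may also be the first object itself.
Optionally, in another embodiment, to avoid repeating the self-healing procedure when the fault has already been resolved, before step 312 the VIM may determine whether the fault has been resolved: if the fault has not been resolved, step 312 is performed; if it has, the procedure ends. The VIM may use the method described above to make this determination, and details are not repeated here.
It should be noted that the embodiments of the present invention do not limit the protocol that carries the indication messages. For example, an indication message may be carried over the Hypertext Transfer Protocol (HTTP) or over another protocol. In the embodiments, both the indication message sent by the EMS to the VNFM and the indication message sent by the VNFM to the VIM may be called SLA Complain messages.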
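The indication message according to the embodiments of the present invention is described below using SLA Complain as an example.
For example, the SLA Complain message is carried over the Hypertext Transfer Protocol (HTTP) and uses the Representational State Transfer (REST) mechanism.
The request format of the indication message between the EMS and the VNFM differs from that of the indication message between the VNFM and the VIM.
The request format of the SLA Complain message between the EMS and the VNFM is shown in Table 1.
Table 1. Request format of the indication message between the EMS and the VNFM
Figure PCTCN2016081229-appb-000001
The request parameter format of the SLA Complain message between the EMS and the VNFM is shown in Table 2.
Table 2. Request parameter format of the SLA Complain message between the EMS and the VNFM
Figure PCTCN2016081229-appb-000002
The vapp instance in Table 2 refers to the VNF associated with the failed object in the infrastructure layer. The service-layer object corresponding to the abnormality information refers to a service on the VNF that the EMS can perceive.
After receiving the SLA Complain message sent by the EMS, the VNFM can determine the virtual-layer abnormality information and the corresponding virtual-layer object according to the mapping relationship between the service layer and the virtual layer and the SLA Complain message.
The request format of the SLA Complain message between the VNFM and the VIM is shown in Table 3.
Table 3. Request format of the SLA Complain message between the VNFM and the VIM
Figure PCTCN2016081229-appb-000003
Table 4. Request parameter format of the SLA Complain message between the VNFM and the VIM
Figure PCTCN2016081229-appb-000004
The universally unique identifier (UUID) of the tenant is used for resource partitioning. Specifically, a data center (DC) of the infrastructure layer may divide resources into multiple domains by geographic region, each domain is further divided into multiple tenants, and each tenant includes a set of logical resources (computing/storage/network). The virtual-layer object involved in the cause of the fault is associated with the target object, in the first object of the infrastructure layer, that causes the fault.
After receiving the SLA Complain message sent by the VNFM, the VIM can locate the failed object in the infrastructure layer (for example, the first object) according to the UUID of the tenant and the UUID of the virtual machine, and then determine the cause of the fault and the target object in the infrastructure layer according to the mapping relationship between the virtual layer and the infrastructure layer.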
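The VIM-side lookup described above can be sketched as two steps: locate the failed object by tenant UUID and virtual machine UUID, then consult the mapping between the virtual layer and the infrastructure layer. The data structures below (a nested `inventory` dictionary and a `layer_map` keyed by virtual-layer object and alarm identifier) are assumptions made for illustration only.

```python
def locate_cause_and_target(
    inventory: dict[str, dict[str, str]],
    layer_map: dict[tuple[str, str], tuple[str, str]],
    tenant_id: str,
    server_id: str,
    related_object_id: str,
    alarm_id: str,
) -> tuple[str, str, str]:
    """Return (first_object, cause, target_object) for an SLA Complain message."""
    # Step 1: the tenant UUID and VM UUID identify the failed infrastructure object.
    first_object = inventory[tenant_id][server_id]
    # Step 2: the virtual-layer object and its abnormality map to an
    # infrastructure-layer cause and target object.
    cause, target = layer_map[(related_object_id, alarm_id)]
    return first_object, cause, target
```

Keying the second lookup on both the object and the alarm reflects that the same virtual-layer object could exhibit different abnormalities with different underlying causes.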
For example, the SLA Complain message is exchanged in JSON. An example SLA Complain request between the EMS and the VNFM is as follows:
PUT /v2/vapps/instances/1234567890/sla_complain HTTP/1.1
Host: 172.28.1.2:35357
Content-Type: application/json; charset=UTF-8

{
  "vapp_id": "1234567890abcdef",
  "related_object_id": "1234567890abcdef",
  "exception_info": [
    {
      "alarmid": "ALM-70101",
      "occurtime": "2015-02-03 00:00:00"
    }
  ]
}
Example response message:
HTTP/1.1 200 OK
It should be understood that, in the foregoing example, "vapp_id": "1234567890abcdef" and "related_object_id": "1234567890abcdef" are merely illustrative and do not mean that the two IDs are the same.
An example SLA Complain request between the VNFM and the VIM is as follows:
POST /v2/1234567890/servers/1234567890/sla_complain HTTP/1.1
Host: 172.28.1.2:35357
Content-Type: application/json; charset=UTF-8
X-Auth-Token: 2012

{
  "tenant_id": "1234567890abcdef",
  "server_id": "1234567890abcdef",
  "related_object_id": "1234567890abcdef",
  "exception_info": [
    {
      "alarmid": "ALM-70101",
      "occurtime": "2015-02-03 00:00:00"
    }
  ]
}
Example response message:
HTTP/1.1 200 OK
Likewise, in the foregoing example, "tenant_id": "1234567890abcdef", "server_id": "1234567890abcdef", and "related_object_id": "1234567890abcdef" are merely illustrative and do not mean that the three IDs are the same.
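A request like the VNFM-to-VIM example above could be assembled programmatically as sketched below. The JSON field names come from the example; reading the two path segments as the tenant ID and the server ID is an assumption, because Table 3 is available only as an image, and the function name is hypothetical.

```python
import json


def build_vim_sla_complain(
    tenant_id: str,
    server_id: str,
    related_object_id: str,
    alarm_id: str,
    occur_time: str,
) -> tuple[str, str, str]:
    """Return (method, path, JSON body) for an SLA Complain request to the VIM."""
    # Assumption: the path is /v2/{tenant_id}/servers/{server_id}/sla_complain.
    path = f"/v2/{tenant_id}/servers/{server_id}/sla_complain"
    body = {
        "tenant_id": tenant_id,
        "server_id": server_id,
        "related_object_id": related_object_id,
        "exception_info": [{"alarmid": alarm_id, "occurtime": occur_time}],
    }
    return "POST", path, json.dumps(body)
```

The sketch stops at building the request; sending it, adding the `X-Auth-Token` header, and handling the response are left out because the source does not describe them beyond the example.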
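The fault processing apparatus according to the embodiments of the present invention is described below with reference to the accompanying drawings.
FIG. 4 is a schematic block diagram of a VNFM 400 according to an embodiment of the present invention. The VNFM 400 includes:
a receiving unit 410, configured to receive fault information sent by a VIM, where the fault information indicates a fault occurring on a first object in the infrastructure layer;
a determining unit 420, configured to determine the VNF associated with the first object; and
a sending unit 430, configured to send a first indication message to the VIM according to the fault information received by the receiving unit 410 and the VNF determined by the determining unit 420, where the first indication message includes first abnormality information of the virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information, the first abnormality information is associated with the cause of the fault in the infrastructure layer, the second object is associated with the target object that causes the fault in the infrastructure layer, and the second object provides virtual resources for the VNF.
The first indication message may be used to instruct the VIM to determine the cause of the fault and the target object in the infrastructure layer according to the mapping relationship between the infrastructure layer and the virtual layer, the first abnormality information, and the identifier of the second object, and to process the fault.
Specifically, the sending unit 430 sends the first indication message to the VIM through a generic interface.
In this embodiment of the present invention, the VNFM sends to the VIM the virtual-layer abnormality information associated with the cause of the fault in the infrastructure layer, so that the VIM can determine the infrastructure-layer cause from this abnormality information and perform the corresponding fault processing procedure. This avoids calling a self-healing interface across layers to trigger the self-healing procedure and can therefore reduce system complexity.
Optionally, the determining unit 420 is further configured to analyze, according to the fault information and the VNF, whether the first abnormality information can be determined.
Correspondingly, the sending unit 430 is specifically configured to send the first indication message to the VIM if the determining unit 420 can determine the first abnormality information.
In this case the fault information does not need to be reported to the EMS, so the faults of the layers are decoupled and the fault recovery time is greatly shortened.
Alternatively, the sending unit 430 is specifically configured to send the fault information and the identifier of the VNF to an element management system (EMS) if the determining unit 420 cannot determine the first abnormality information according to the fault information and the VNF. Correspondingly, the receiving unit 410 is further configured to receive a second indication message sent by the EMS, where the second indication message includes second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information, the second abnormality information is associated with the cause of the fault in the infrastructure layer, the third object is associated with the target object that causes the fault in the infrastructure layer, and the third object is a service object on the VNF. The determining unit 420 is further configured to generate the first indication message according to the mapping relationship between the virtual layer and the service layer and the second indication message received by the receiving unit, and the sending unit 430 is specifically configured to send the first indication message to the VIM.
It should be noted that, in this embodiment of the present invention, the receiving unit 410 may be implemented by a receiver, the sending unit 430 may be implemented by a transmitter, and the determining unit 420 may be implemented by a processor. As shown in FIG. 5, a VNFM according to an embodiment of the present invention may include a processor 510, a memory 520, a receiver 530, a transmitter 540, and a bus system 550. Optionally, the memory 520 may be used to store instructions or code executed by the processor 510.
The components in the VNFM 500 are coupled together by the bus system 550, which, in addition to a data bus, may also include a power bus, a control bus, and a status signal bus.
The VNFM 400 shown in FIG. 4 or the VNFM 500 shown in FIG. 5 can implement the processes implemented by the VNFM in the foregoing method embodiments. To avoid repetition, details are not described here again.
It should be noted that the foregoing method embodiments of the present invention may be applied to a processor or implemented by a processor. The processor may be an integrated circuit chip with signal processing capability. In an implementation process, each step of the foregoing method embodiments may be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software. The processor may be a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention. The general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed with reference to the embodiments of the present invention may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the foregoing method in combination with its hardware.
It can be understood that the memory in the embodiments of the present invention may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically EPROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DR RAM). It should be noted that the memory of the systems and methods described herein is intended to include, without being limited to, these and any other suitable types of memory.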
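FIG. 6 is a schematic block diagram of a VIM 600 according to another embodiment of the present invention. The VIM 600 includes:
a determining unit 610, configured to determine that a first object in the infrastructure layer has failed;
the determining unit 610 is further configured to analyze whether the cause of the fault can be determined;
a sending unit 620, configured to send fault information to the VNFM if the determining unit 610 cannot determine the cause of the fault, or if the determining unit 610 cannot resolve the fault after performing a first self-healing procedure according to a determined first cause of the fault, where the fault information indicates the fault occurring on the first object; and
a receiving unit 630, configured to receive a first indication message sent by the VNFM, where the first indication message includes first abnormality information of the virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information;
the determining unit 610 is further configured to determine the cause of the fault and the target object in the infrastructure layer according to the mapping relationship between the infrastructure layer and the virtual layer, the first abnormality information, and the identifier of the second object.
It can be understood that, after learning the target object and the cause of the fault, the VIM 600 can perform a targeted second self-healing procedure to process the fault.
When the VIM of this embodiment cannot resolve a fault, it sends fault information to the VNFM so that the VNFM or the EMS can perform correlation analysis. The cause of the fault can thus be determined and the corresponding self-healing procedure performed, achieving fault recovery.
Optionally, the first indication message is generated by the VNFM according to the fault information and the virtualized network function (VNF) associated with the first object.
Optionally, the first indication message is generated by the VNFM according to the mapping relationship between the virtual layer and the service layer and a second indication message received from the EMS. The second indication message is generated by the EMS according to the fault information and the VNF associated with the first object, both received from the VNFM, and service abnormality information received from the VNF. The second indication message includes second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information, the second abnormality information is associated with the cause of the fault in the infrastructure layer, the third object is associated with the target object that causes the fault in the infrastructure layer, and the third object is a service object on the VNF.
It should be noted that, in this embodiment of the present invention, the determining unit 610 may be implemented by a processor, the sending unit 620 may be implemented by a transmitter, and the receiving unit 630 may be implemented by a receiver. As shown in FIG. 7, a VIM according to an embodiment of the present invention may include a processor 710, a memory 720, a receiver 730, a transmitter 740, and a bus system 750. Optionally, the memory 720 may be used to store instructions or code executed by the processor 710.
The components in the VIM 700 are coupled together by the bus system 750, which, in addition to a data bus, may also include a power bus, a control bus, and a status signal bus.
The VIM 600 shown in FIG. 6 or the VIM 700 shown in FIG. 7 can implement the processes implemented by the VIM in the foregoing method embodiments. To avoid repetition, details are not described here again.
It should be noted that the foregoing method embodiments of the present invention may be applied to a processor or implemented by a processor. The processor may be an integrated circuit chip with signal processing capability. In an implementation process, each step of the foregoing method embodiments may be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software. The processor may be a general purpose processor, a DSP, an ASIC, an FPGA or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention. The general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed with reference to the embodiments of the present invention may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the foregoing method in combination with its hardware.
It can be understood that the memory in the embodiments of the present invention may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a ROM, a PROM, an EPROM, an EEPROM, or a flash memory. The volatile memory may be a RAM, which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as SRAM, DRAM, SDRAM, DDR SDRAM, ESDRAM, SLDRAM, and DR RAM. It should be noted that the memory of the systems and methods described herein is intended to include, without being limited to, these and any other suitable types of memory.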
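FIG. 8 is a schematic block diagram of an EMS 800 according to another embodiment of the present invention. The EMS 800 includes:
a receiving unit 810, configured to receive fault information and an identifier of a VNF that are sent by the VNFM, where the fault information indicates a fault occurring on a first object in the infrastructure layer, and the VNF is associated with the first object;
the receiving unit 810 is further configured to receive service abnormality information sent by the VNF; and
a sending unit 820, configured to send a second indication message to the VNFM according to the fault information, the VNF, and the service abnormality information received by the receiving unit 810, where the second indication message includes second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information, the second abnormality information is associated with the cause of the fault in the infrastructure layer, the third object is associated with the target object that causes the fault in the infrastructure layer, and the third object is a service object on the VNF.
The second indication message may be used to instruct the VNFM to generate, according to the mapping relationship between the service layer and the virtual layer and the second indication message, a first indication message for processing the fault, and to send it to the virtualized infrastructure manager (VIM), where the first indication message includes first abnormality information of the virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information.
In this embodiment of the present invention, the EMS performs correlation analysis on the fault or abnormality information of the infrastructure layer, the virtual layer, and the service layer and can determine the service-layer abnormality information associated with the cause of the fault in the infrastructure layer. It then sends this service-layer abnormality information to the VNFM, and the VNFM performs the corresponding processing procedure, so that the fault of the infrastructure layer can be resolved and fault recovery is achieved.
Optionally, the fault processing apparatus 800 may further include a determining unit 830, configured to perform fault correlation analysis according to the fault information, the identifier of the VNF, and the service abnormality information received by the receiving unit 810, determine the second abnormality information and the identifier of the third object, and generate the second indication message.
It should be noted that, in this embodiment of the present invention, the receiving unit 810 may be implemented by a receiver, the sending unit 820 may be implemented by a transmitter, and the determining unit 830 may be implemented by a processor. As shown in FIG. 9, a fault processing apparatus according to an embodiment of the present invention may include a processor 910, a memory 920, a receiver 930, a transmitter 940, and a bus system 950. Optionally, the memory 920 may be used to store instructions or code executed by the processor 910.
The components in the fault processing apparatus 900 are coupled together by the bus system 950, which, in addition to a data bus, may also include a power bus, a control bus, and a status signal bus.
The fault processing apparatus 800 shown in FIG. 8 or the fault processing apparatus 900 shown in FIG. 9 can implement the processes implemented by the EMS in the foregoing method embodiments. To avoid repetition, details are not described here again.
It should be noted that the foregoing method embodiments of the present invention may be applied to a processor or implemented by a processor. The processor may be an integrated circuit chip with signal processing capability. In an implementation process, each step of the foregoing method embodiments may be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software. The processor may be a general purpose processor, a DSP, an ASIC, an FPGA or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention. The general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed with reference to the embodiments of the present invention may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the foregoing method in combination with its hardware.
It can be understood that the memory in the embodiments of the present invention may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a ROM, a PROM, an EPROM, an EEPROM, or a flash memory. The volatile memory may be a RAM, which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as SRAM, DRAM, SDRAM, DDR SDRAM, ESDRAM, SLDRAM, and DR RAM. It should be noted that the memory of the systems and methods described herein is intended to include, without being limited to, these and any other suitable types of memory.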
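A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed by hardware or software depends on the particular application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of the present invention.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not described here again.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative. For example, the division of units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, or may be an electrical, mechanical, or other form of connection.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present invention.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
From the description of the foregoing implementations, a person skilled in the art can clearly understand that the present invention may be implemented by hardware, by firmware, or by a combination thereof. When implemented by software, the foregoing functions may be stored in a computer readable medium or transmitted as one or more instructions or code on a computer readable medium. Computer readable media include computer storage media and communication media, where communication media include any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that a computer can access. By way of example and not limitation, computer readable media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. In addition, any connection may suitably be regarded as a computer readable medium. For example, if the software is transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of the medium. As used in the present invention, disks and discs include compact discs (CDs), laser discs, optical discs, digital versatile discs (DVDs), floppy disks, and Blu-ray discs, where disks typically reproduce data magnetically while discs reproduce data optically with lasers. Combinations of the above should also be included within the protection scope of computer readable media.
In summary, the foregoing descriptions are merely preferred embodiments of the technical solutions of the present invention and are not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.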

Claims (14)

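  1. A fault processing method, characterized by comprising:
    receiving, by a virtualized network function manager (VNFM), fault information sent by a virtualized infrastructure manager (VIM), wherein the fault information indicates a fault occurring on a first object in the infrastructure layer;
    determining, by the VNFM, the virtualized network function (VNF) associated with the first object; and
    sending, by the VNFM, a first indication message to the VIM according to the fault information and the VNF, wherein the first indication message comprises first abnormality information of the virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information, the first abnormality information is associated with the cause of the fault in the infrastructure layer, the second object is associated with the target object that causes the fault in the infrastructure layer, and the second object provides virtual resources for the VNF.
  2. The fault processing method according to claim 1, characterized in that the sending, by the VNFM, of the first indication message to the VIM according to the fault information and the VNF comprises:
    if the VNFM can determine the first abnormality information according to the fault information and the VNF, sending the first indication message to the VIM.
  3. The fault processing method according to claim 1, characterized in that the sending, by the VNFM, of the first indication message to the VIM according to the fault information and the VNF comprises:
    if the VNFM cannot determine the first abnormality information according to the fault information and the VNF, sending the fault information and the identifier of the VNF to an element management system (EMS);
    receiving, by the VNFM, a second indication message sent by the EMS, wherein the second indication message is generated by the EMS according to the fault information, the VNF, and service abnormality information sent by the VNF to the EMS, the second indication message comprises second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information, the second abnormality information is associated with the cause, the third object is associated with the target object, and the third object is a service object on the VNF;
    generating, by the VNFM, the first indication message according to the mapping relationship between the virtual layer and the service layer and the second indication message; and
    sending, by the VNFM, the first indication message to the VIM.
  4. A fault processing method, characterized by comprising:
    determining, by a virtualized infrastructure manager (VIM), that a first object in the infrastructure layer has failed;
    if the VIM cannot determine the cause of the fault in the infrastructure layer, or the VIM cannot resolve the fault after performing a first self-healing procedure according to a determined first cause of the fault, sending fault information to a virtualized network function manager (VNFM), wherein the fault information indicates the fault occurring on the first object;
    receiving, by the VIM, a first indication message sent by the VNFM, wherein the first indication message comprises first abnormality information of the virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information; and
    determining, by the VIM, the cause and the target object that causes the fault in the infrastructure layer according to the mapping relationship between the infrastructure layer and the virtual layer, the first abnormality information, and the identifier of the second object.
  5. The fault processing method according to claim 4, characterized in that the first indication message is generated by the VNFM according to the fault information and the virtualized network function (VNF) corresponding to the first object.
  6. The fault processing method according to claim 4, characterized in that the first indication message is generated by the VNFM according to the mapping relationship between the virtual layer and the service layer and a second indication message received from an element management system (EMS),
    wherein the second indication message is generated by the EMS according to the fault information and the VNF corresponding to the first object, both received from the VNFM, and service abnormality information received from the VNF, the second indication message comprises second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information, the second abnormality information is associated with the cause, the third object is associated with the target object, and the third object is a service object on the VNF.
  7. A fault processing method, characterized by comprising:
    receiving, by an element management system (EMS), fault information and an identifier of a virtualized network function (VNF) that are sent by a virtualized network function manager (VNFM), wherein the fault information indicates a fault occurring on a first object in the infrastructure layer, and the VNF is associated with the first object;
    receiving, by the EMS, service abnormality information sent by the VNF; and
    sending, by the EMS, a second indication message to the VNFM according to the fault information, the VNF, and the service abnormality information, wherein the second indication message comprises second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information, the second abnormality information is associated with the cause of the fault in the infrastructure layer, the third object is associated with the target object that causes the fault in the infrastructure layer, and the third object is a service object on the VNF.
  8. A virtualized network function manager (VNFM), characterized by comprising:
    a receiving unit, configured to receive fault information sent by a virtualized infrastructure manager (VIM), wherein the fault information indicates a fault occurring on a first object in the infrastructure layer;
    a determining unit, configured to determine the virtualized network function (VNF) corresponding to the first object; and
    a sending unit, configured to send a first indication message to the VIM according to the fault information received by the receiving unit and the VNF determined by the determining unit, wherein the first indication message comprises first abnormality information of the virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information, the first abnormality information is associated with the cause of the fault in the infrastructure layer, the second object is associated with the target object that causes the fault in the infrastructure layer, and the second object provides virtual resources for the VNF.
  9. The VNFM according to claim 8, characterized in that
    the determining unit is further configured to analyze, according to the fault information and the VNF, whether the first abnormality information can be determined; and
    the sending unit is specifically configured to send the first indication message to the VIM if the determining unit can determine the first abnormality information.
  10. The VNFM according to claim 8, characterized in that
    the determining unit is further configured to analyze, according to the fault information and the VNF, whether the first abnormality information can be determined;
    the sending unit is specifically configured to send the fault information and the identifier of the VNF to an element management system (EMS) if the determining unit cannot determine the first abnormality information;
    the receiving unit is further configured to receive a second indication message sent by the EMS, wherein the second indication message is generated by the EMS according to the fault information, the VNF, and service abnormality information sent by the VNF to the EMS, the second indication message comprises second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information, the second abnormality information is associated with the cause, the third object is associated with the target object, and the third object is a service object on the VNF;
    the determining unit is further configured to generate the first indication message according to the mapping relationship between the virtual layer and the service layer and the second indication message received by the receiving unit; and
    the sending unit is specifically configured to send the first indication message to the VIM.
  11. A virtualized infrastructure manager (VIM), characterized by comprising:
    a determining unit, configured to determine that a first object in the infrastructure layer has failed;
    wherein the determining unit is further configured to analyze whether the cause of the fault can be determined;
    a sending unit, configured to send fault information to a virtualized network function manager (VNFM) if the determining unit cannot determine the cause of the fault in the infrastructure layer, or if the determining unit cannot resolve the fault after performing a first self-healing procedure according to a determined first cause of the fault, wherein the fault information indicates the fault occurring on the first object; and
    a receiving unit, configured to receive a first indication message sent by the VNFM, wherein the first indication message comprises first abnormality information of the virtual layer and an identifier of a second object of the virtual layer corresponding to the first abnormality information;
    wherein the determining unit is further configured to determine the cause and the target object that causes the fault in the infrastructure layer according to the mapping relationship between the infrastructure layer and the virtual layer, the first abnormality information, and the identifier of the second object.
  12. The VIM according to claim 11, characterized in that the first indication message is generated by the VNFM according to the fault information and the virtualized network function (VNF) corresponding to the first object.
  13. The VIM according to claim 11, characterized in that the first indication message is generated by the VNFM according to the mapping relationship between the virtual layer and the service layer and a second indication message received from an element management system (EMS),
    wherein the second indication message is generated by the EMS according to the fault information and the VNF corresponding to the first object, both received from the VNFM, and service abnormality information received from the VNF, the second indication message comprises second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information, the second abnormality information is associated with the cause, the third object is associated with the target object, and the third object is a service object on the VNF.
  14. An element management system (EMS), characterized by comprising:
    a receiving unit, configured to receive fault information and an identifier of a virtualized network function (VNF) that are sent by a virtualized network function manager (VNFM), wherein the fault information indicates a fault occurring on a first object in the infrastructure layer, and the VNF is associated with the first object;
    wherein the receiving unit is further configured to receive service abnormality information sent by the VNF; and
    a sending unit, configured to send a second indication message to the VNFM according to the fault information, the VNF, and the service abnormality information received by the receiving unit, wherein the second indication message comprises second abnormality information of the service layer and an identifier of a third object of the service layer corresponding to the second abnormality information, the second abnormality information is associated with the cause of the fault in the infrastructure layer, the third object is associated with the target object that causes the fault in the infrastructure layer, and the third object is a service object on the VNF.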
PCT/CN2016/081229 2016-05-06 2016-05-06 故障处理方法及装置 WO2017190339A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/081229 WO2017190339A1 (zh) 2016-05-06 2016-05-06 故障处理方法及装置

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/081229 WO2017190339A1 (zh) 2016-05-06 2016-05-06 故障处理方法及装置

Publications (1)

Publication Number Publication Date
WO2017190339A1 true WO2017190339A1 (zh) 2017-11-09

Family

ID=60202771

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/081229 WO2017190339A1 (zh) 2016-05-06 2016-05-06 故障处理方法及装置

Country Status (1)

Country Link
WO (1) WO2017190339A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110618884A (zh) * 2018-06-19 2019-12-27 中国电信股份有限公司 故障监控方法、虚拟化的网络功能模块管理器和存储介质
CN111416726A (zh) * 2019-01-07 2020-07-14 中国移动通信有限公司研究院 一种资源管理的方法、发送端设备和接收端设备
CN111641519A (zh) * 2020-04-30 2020-09-08 平安科技(深圳)有限公司 异常根因定位方法、装置及存储介质
CN113542034A (zh) * 2021-07-28 2021-10-22 山石网科通信技术股份有限公司 网元信息处理系统、网元管理方法及装置
WO2023025180A1 (zh) * 2021-08-27 2023-03-02 华为技术有限公司 管理节点的方法、节点和系统

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101924661A (zh) * 2009-06-17 2010-12-22 中兴通讯股份有限公司 Alarm processing method and apparatus
CN104170323A (zh) * 2014-04-09 2014-11-26 华为技术有限公司 Fault processing method, apparatus, and system based on network function virtualization
CN104268061A (zh) * 2014-09-12 2015-01-07 国云科技股份有限公司 Storage state monitoring mechanism for virtual machines
WO2015135611A1 (en) * 2014-03-10 2015-09-17 Nokia Solutions And Networks Oy Notification about virtual machine live migration to vnf manager
CN105049293A (zh) * 2015-08-21 2015-11-11 中国联合网络通信集团有限公司 Monitoring method and apparatus

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
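CN110618884A (zh) * 2018-06-19 2019-12-27 中国电信股份有限公司 Fault monitoring method, virtualized network function module manager, and storage medium
CN111416726A (zh) * 2019-01-07 2020-07-14 中国移动通信有限公司研究院 Resource management method, sending-end device, and receiving-end device
CN111641519A (zh) * 2020-04-30 2020-09-08 平安科技(深圳)有限公司 Anomaly root cause locating method, apparatus, and storage medium
CN111641519B (zh) * 2020-04-30 2022-10-11 平安科技(深圳)有限公司 Anomaly root cause locating method, apparatus, and storage medium
CN113542034A (zh) * 2021-07-28 2021-10-22 山石网科通信技术股份有限公司 Network element information processing system, network element management method, and apparatus
CN113542034B (zh) * 2021-07-28 2024-03-19 山石网科通信技术股份有限公司 Network element information processing system, network element management method, and apparatus
WO2023025180A1 (zh) * 2021-08-27 2023-03-02 华为技术有限公司 Node management method, node, and system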

Legal Events

Date Code Title Description
NENP Non-entry into the national phase. Ref country code: DE
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 16900861; Country of ref document: EP; Kind code of ref document: A1
122 EP: PCT application non-entry in European phase. Ref document number: 16900861; Country of ref document: EP; Kind code of ref document: A1