WO2015042937A1 - Method, entity and system for fault management - Google Patents

Method, entity and system for fault management

Info

Publication number
WO2015042937A1
Authority
WO
WIPO (PCT)
Prior art keywords
fault
information
entity
comprehensive
nfvi
Prior art date
Application number
PCT/CN2013/084686
Other languages
English (en)
French (fr)
Inventor
刘建宁
朱雷
余芳
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to KR1020167010730A priority Critical patent/KR101908465B1/ko
Priority to EP17191853.5A priority patent/EP3322125B1/en
Priority to CN201810143222.7A priority patent/CN108418711B/zh
Priority to PCT/CN2013/084686 priority patent/WO2015042937A1/zh
Priority to RU2016117218A priority patent/RU2644146C2/ru
Priority to BR112016006902-1A priority patent/BR112016006902B1/pt
Priority to JP2016517300A priority patent/JP6212207B2/ja
Priority to EP13894185.1A priority patent/EP3024174B1/en
Priority to CN201380002104.XA priority patent/CN104685830B/zh
Publication of WO2015042937A1 publication Critical patent/WO2015042937A1/zh
Priority to US15/084,548 priority patent/US10073729B2/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0709 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0712 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a virtual computing platform, e.g. logically partitioned systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751 Error or fault detection not based on redundancy
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0766 Error or fault reporting or storing
    • G06F11/0772 Means for error signaling, e.g. using interrupts, exception flags, dedicated error registers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0766 Error or fault reporting or storing
    • G06F11/0775 Content or structure details of the error report, e.g. specific table structure, specific error fields
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/079 Root cause analysis, i.e. error or fault diagnosis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/02 Standardisation; Integration
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0604 Management of faults, events, alarms or notifications using filtering, e.g. reduction of information by using priority, element types, position or time
    • H04L41/0627 Management of faults, events, alarms or notifications using filtering, e.g. reduction of information by using priority, element types, position or time by acting on the notification or alarm source
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0677 Localisation of faults
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0686 Additional information in the notification, e.g. enhancement of specific meta-data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/069 Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0894 Policy-based network configuration management
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0895 Configuration of virtualised networks or elements, e.g. virtualised network function or OpenFlow elements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/40 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using virtualisation of network functions or resources, e.g. SDN or NFV entities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45595 Network integration; Enabling network access in virtual machine instances
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H04L41/065 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis involving logical or physical relationship, e.g. grouping and hierarchies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0893 Assignment of logical groups to network elements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0896 Bandwidth or capacity management, i.e. automatically increasing or decreasing capacities
    • H04L41/0897 Bandwidth or capacity management, i.e. automatically increasing or decreasing capacities by horizontal or vertical scaling of resources, or by migrating entities, e.g. virtual resources or entities

Definitions

  • The present invention relates to the field of communications and, more particularly, to methods, entities and systems for fault management.
  • Background
  • NFV: Network Function Virtualization
  • NFV E2E: NFV End to End
  • VNF: Virtual Network Function
  • VIM: Virtualized Infrastructure Manager
  • VNFM: VNF Manager entity
  • The embodiments of the invention provide a fault management method that enables fault reporting and handling in an NFV environment.
  • A first aspect provides a fault management method, including: a virtualized infrastructure manager (VIM) acquires first fault information of a network function virtualization infrastructure (NFVI) entity, where the first fault information includes a fault entity identifier and a fault type and indicates that the first NFVI entity having the fault entity identifier has failed; the VIM generates first fault comprehensive information according to the first fault information, where the first fault comprehensive information includes the first fault information and associated fault information of the first fault information; and the VIM performs fault repair or reporting according to the first fault comprehensive information.
  • The VIM acquiring the first fault information, which includes the fault entity identifier and the fault type of the NFVI entity, includes: receiving the first fault information sent by the first NFVI entity; or determining that the first NFVI entity has failed, and generating the first fault information according to the fault that occurred in the first NFVI entity.
  • The first NFVI entity is any one of the hardware (HW), host operating system (Host OS), virtual machine manager, or virtual machine (VM) entities within the NFVI, and the VIM generating the first fault comprehensive information according to the first fault information includes: determining that fault information sent by an NFVI entity associated with the first NFVI entity is the associated fault information of the first fault information; and generating the first fault comprehensive information including the first fault information and the associated fault information.
  • The VIM performing fault repair or reporting according to the first fault comprehensive information includes: determining, according to the fault type of the first fault information or the fault type of the associated fault information in the first fault comprehensive information, whether the VIM includes a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information; when the VIM includes such a fault repair policy, repairing the fault of the first NFVI entity and/or of the NFVI entity associated with the first NFVI entity according to the fault repair policy; or, when the VIM does not include such a fault repair policy, sending the first fault comprehensive information to the VNFM or sending the first fault comprehensive information to the orchestrator.
  • The priority of the HW is higher than the priority of the Host OS.
  • The priority of the Host OS is higher than the priority of the virtual machine manager.
  • The priority of the virtual machine manager is higher than the priority of the VM.
  • Determining whether the VIM includes a corresponding fault repair policy includes: determining whether the VIM includes a fault repair policy corresponding to the fault type of the highest-priority NFVI entity.
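
The priority rule above can be illustrated with a minimal Python sketch. It assumes that the policy lookup is keyed by fault type and that the entity with the highest priority among the reported faults decides which policy is consulted. The names `PRIORITY`, `FaultInfo`, `select_policy`, the entity labels and the fault type strings are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Ordering taken from the text: HW > Host OS > virtual machine manager > VM.
# The numeric ranks and the string labels are illustrative assumptions.
PRIORITY: Dict[str, int] = {"HW": 3, "HostOS": 2, "VMM": 1, "VM": 0}


@dataclass(frozen=True)
class FaultInfo:
    entity_id: str    # fault entity identifier
    entity_kind: str  # key into PRIORITY (hypothetical label)
    fault_type: str


def select_policy(faults: List[FaultInfo], policies: Dict[str, str]) -> Optional[str]:
    """Return the repair policy for the highest-priority faulty entity, or None.

    `faults` must be non-empty. A None result means no local policy exists,
    so the comprehensive information would be reported upward instead.
    """
    top = max(faults, key=lambda f: PRIORITY[f.entity_kind])
    return policies.get(top.fault_type)


faults = [
    FaultInfo("vm-7", "VM", "vm_unreachable"),
    FaultInfo("host-2", "HW", "disk_failure"),
]
with_hw_policy = select_policy(
    faults, {"disk_failure": "migrate_vms", "vm_unreachable": "restart_vm"}
)
without_hw_policy = select_policy(faults, {"vm_unreachable": "restart_vm"})
```

In this sketch a policy for the VM fault alone is not enough: because the HW fault outranks it, the lookup returns `None` and the fault would be escalated.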
  • The method further includes: sending a success indication message to the orchestrator when the fault repair succeeds; and, when the fault repair fails, sending the first fault comprehensive information to the VNFM or sending the first fault comprehensive information to the orchestrator.
  • The method further includes: receiving an indication message sent by the VNFM indicating that the VNFM cannot process the first fault comprehensive information; and sending the first fault comprehensive information to the orchestrator.
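
The escalation path described so far (VIM first, then VNFM, then orchestrator) can be sketched as follows. This is only one possible ordering: the text also allows the VIM to report directly to the orchestrator. The function `escalate`, the tier tuples and the fault type string are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

RepairFn = Callable[[], bool]            # returns True when the repair succeeded
Tier = Tuple[str, Dict[str, RepairFn]]   # (entity name, fault type -> repair action)


def escalate(fault_type: str, tiers: List[Tier]) -> Tuple[Optional[str], List[str]]:
    """Try each tier in order.

    Returns (name of the tier that repaired the fault or None, tiers tried).
    A tier passes the fault on when it has no policy for the fault type
    or when its repair attempt fails.
    """
    tried: List[str] = []
    for name, policies in tiers:
        tried.append(name)
        repair = policies.get(fault_type)
        if repair is not None and repair():
            return name, tried
    return None, tried


tiers: List[Tier] = [
    ("VIM", {}),                                   # no policy: report upward
    ("VNFM", {"link_down": lambda: False}),        # policy exists, repair fails
    ("Orchestrator", {"link_down": lambda: True}),
]
repaired_by, path = escalate("link_down", tiers)
```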
  • Before sending the first fault comprehensive information to the orchestrator, the method further includes: requesting, from the VNFM, fault information of the VNF entity associated with the first NFVI entity; and adding the fault information of the VNF entity associated with the first NFVI entity to the first fault comprehensive information.
  • The method further includes: receiving request information sent by the VNFM, where the request information is used to request, from the VIM, fault information of the NFVI entity associated with a failed VNF entity; and sending the fault information of the NFVI entity associated with the failed VNF entity to the VNFM.
  • The method further includes: detecting, according to the first fault comprehensive information, whether the VIM already includes fault comprehensive information identical to the first fault comprehensive information; and, when the VIM includes fault comprehensive information identical to the first fault comprehensive information, deleting the first fault comprehensive information.
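
The duplicate check can be sketched as a small store that drops comprehensive information it has already seen. The text does not define what makes two records "the same"; the sketch assumes, purely for illustration, that two records are identical when they contain the same set of (fault entity identifier, fault type) pairs. `FaultStore` and `submit` are hypothetical names.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Entry = Tuple[str, str]        # (fault entity identifier, fault type)
Key = FrozenSet[Entry]


class FaultStore:
    """Keeps one copy of each distinct piece of fault comprehensive information."""

    def __init__(self) -> None:
        self._seen: Set[Key] = set()

    def submit(self, entries: Iterable[Entry]) -> bool:
        """Return True if the record is new, False if it was a duplicate and dropped."""
        key = frozenset(entries)
        if key in self._seen:
            return False       # identical record already held: discard the new one
        self._seen.add(key)
        return True


store = FaultStore()
first = store.submit([("host-2", "disk_failure"), ("vm-7", "vm_unreachable")])
# Same content in a different order counts as the same record under this assumption.
repeat = store.submit([("vm-7", "vm_unreachable"), ("host-2", "disk_failure")])
other = store.submit([("host-3", "disk_failure")])
```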
  • The first fault information is also reported to the operations and business support system (OSS/BSS), so that the OSS/BSS monitors and presents the first fault information.
  • The first fault information further includes at least one of the following: an operating state and a fault time; and the first fault comprehensive information further includes a fault status.
  • The fault status includes at least one of unprocessed, processed, repaired, and unrepaired.
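
The fields named so far can be collected into a small data model. The sketch below assumes a topology map from an entity identifier to the identifiers of its associated entities, and treats any report from an associated entity as associated fault information. The class names, field names, and `build_comprehensive` are hypothetical; only the field meanings come from the text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Set


class FaultStatus(Enum):
    UNPROCESSED = "unprocessed"
    PROCESSED = "processed"
    REPAIRED = "repaired"
    UNREPAIRED = "unrepaired"


@dataclass(frozen=True)
class FaultInfo:
    entity_id: str                          # fault entity identifier
    fault_type: str
    operating_state: Optional[str] = None   # optional per the text
    fault_time: Optional[float] = None      # optional per the text


@dataclass
class ComprehensiveFaultInfo:
    primary: FaultInfo
    associated: List[FaultInfo] = field(default_factory=list)
    status: FaultStatus = FaultStatus.UNPROCESSED


def build_comprehensive(
    primary: FaultInfo,
    reports: List[FaultInfo],
    topology: Dict[str, Set[str]],
) -> ComprehensiveFaultInfo:
    """Attach reports from entities associated with the primary faulty entity."""
    related = topology.get(primary.entity_id, set())
    associated = [r for r in reports if r.entity_id in related]
    return ComprehensiveFaultInfo(primary, associated)


vm_fault = FaultInfo("vm-1", "vm_unreachable")
host1_fault = FaultInfo("host-1", "disk_failure")
host2_fault = FaultInfo("host-2", "fan_failure")
info = build_comprehensive(vm_fault, [host1_fault, host2_fault], {"vm-1": {"host-1"}})
```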
  • A second aspect provides a fault management method, including: a virtual network function manager (VNFM) acquires second fault information of a virtual network function (VNF) entity, where the second fault information includes a fault entity identifier and a fault type and indicates that the first VNF entity having the fault entity identifier has failed; the VNFM generates second fault comprehensive information according to the second fault information; and the VNFM performs fault repair or reporting according to the second fault comprehensive information.
  • The VNFM acquiring the second fault information, which includes the fault entity identifier and the fault type of the VNF entity, includes: receiving the second fault information sent by the first VNF entity; or determining that the first VNF entity has failed, and generating the second fault information according to the fault that occurred in the first VNF entity.
  • The VNFM generating the second fault comprehensive information according to the second fault information includes: determining that fault information sent by a VNF entity associated with the first VNF entity is the associated fault information of the second fault information; and generating the second fault comprehensive information including the second fault information and the associated fault information.
  • The VNFM performing fault repair or reporting according to the second fault comprehensive information includes: determining, according to the fault type of the second fault information or the fault type of the associated fault information in the second fault comprehensive information, whether the VNFM includes a fault repair policy corresponding to the fault type of the second fault information or the fault type of the associated fault information; when the VNFM includes such a fault repair policy, repairing the fault of the first VNF entity and/or of the VNF entity associated with the first VNF entity according to the fault repair policy; or, when the VNFM does not include such a fault repair policy, sending the second fault comprehensive information to the orchestrator.
  • The method further includes: sending a success indication message to the orchestrator when the fault repair succeeds; and sending the second fault comprehensive information to the orchestrator when the fault repair fails.
  • Before sending the second fault comprehensive information to the orchestrator, the method further includes: requesting, from the virtualized infrastructure manager (VIM), fault information of the NFVI entity associated with the first VNF entity, where the NFVI entity is any one of the hardware (HW), host operating system (Host OS), virtual machine manager, or virtual machine (VM) entities within the NFVI; and adding the fault information of the NFVI entity associated with the first VNF entity to the second fault comprehensive information.
  • The method further includes: receiving first fault comprehensive information sent by the VIM, where the first fault comprehensive information includes first fault information and associated fault information of the first fault information, and the first fault information indicates that the first NFVI entity has failed; determining whether the VNFM includes a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information in the first fault comprehensive information; when the VNFM includes such a fault repair policy, repairing the fault of the first NFVI entity and/or of the NFVI entity associated with the first NFVI entity according to the fault repair policy; or, when the VNFM does not include such a fault repair policy, sending the first fault comprehensive information to the orchestrator, or sending to the VIM an indication message indicating that the VNFM cannot process the first fault comprehensive information.
  • The method further includes: determining, according to the first fault comprehensive information, fault information of the first VNF entity associated with the first NFVI entity and/or with the NFVI entity associated with the first NFVI entity; and adding the fault information of the first VNF entity to the first fault comprehensive information, so that the VNFM can repair or report on the first fault comprehensive information.
  • The method further includes: detecting, according to the second fault comprehensive information, whether the VNFM already includes fault comprehensive information identical to the second fault comprehensive information; and, when the VNFM includes fault comprehensive information identical to the second fault comprehensive information, deleting the second fault comprehensive information.
  • The method further includes: receiving request information sent by the VIM, where the request information is used to request, from the VNFM, fault information of the VNF entity associated with a failed NFVI entity; and sending the fault information of the VNF entity associated with the failed NFVI entity to the VIM.
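
The cross-layer requests between the VIM and the VNFM both reduce to the same lookup: given a failed entity on one layer, return the faults recorded for the entities associated with it on the other layer. The sketch below assumes that the association is available as a map from VNF identifiers to NFVI entity identifiers and that faults are kept as lists of fault type strings. `associated_faults`, `invert` and all identifiers are hypothetical.

```python
from typing import Dict, List, Set

Links = Dict[str, Set[str]]          # entity id -> ids of associated entities
FaultTable = Dict[str, List[str]]    # entity id -> recorded fault types


def associated_faults(entity_id: str, links: Links, faults: FaultTable) -> FaultTable:
    """Return the recorded faults of every entity associated with `entity_id`."""
    peers = sorted(links.get(entity_id, set()))
    return {peer: faults[peer] for peer in peers if peer in faults}


def invert(links: Links) -> Links:
    """Build the reverse association map (for example NFVI entity -> VNFs)."""
    reverse: Links = {}
    for src, dsts in links.items():
        for dst in dsts:
            reverse.setdefault(dst, set()).add(src)
    return reverse


vnf_to_nfvi: Links = {"vnf-a": {"vm-1", "vm-2"}}
nfvi_faults: FaultTable = {"vm-2": ["vm_unreachable"], "host-9": ["disk_failure"]}
vnf_faults: FaultTable = {"vnf-a": ["service_degraded"]}

# VNFM asks the VIM: which NFVI faults relate to the failed VNF "vnf-a"?
for_vnfm = associated_faults("vnf-a", vnf_to_nfvi, nfvi_faults)
# VIM asks the VNFM: which VNF faults relate to the failed NFVI entity "vm-2"?
for_vim = associated_faults("vm-2", invert(vnf_to_nfvi), vnf_faults)
```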
  • The second fault information is also reported to the operations and business support system (OSS/BSS), so that the OSS/BSS monitors and presents the second fault information.
  • The second fault information further includes at least one of the following: an operating state and a fault time; and the second fault comprehensive information further includes fault status information, where the fault status includes at least one of unprocessed, processed, repaired, and unrepaired.
  • A third aspect provides a fault management method, including: an orchestrator receives first fault comprehensive information sent by a virtualized infrastructure manager (VIM), where the first fault comprehensive information includes first fault information, the first fault information includes a fault entity identifier and a fault type, and the first fault information indicates that the first network function virtualization infrastructure (NFVI) entity having the fault entity identifier has failed; and the orchestrator performs fault repair or reporting according to the first fault comprehensive information.
  • The first fault comprehensive information further includes: fault information of an NFVI entity associated with the first NFVI entity; and/or fault information of a virtual network function (VNF) entity associated with the first NFVI entity.
  • The orchestrator performing fault repair or reporting according to the first fault comprehensive information includes: determining, according to the fault type in the first fault comprehensive information, whether the orchestrator includes a fault repair policy corresponding to the fault type; when the orchestrator includes a fault repair policy corresponding to the fault type, repairing the fault of the first NFVI entity and/or of the NFVI entity associated with the first NFVI entity according to the fault repair policy; or, when the orchestrator does not include a fault repair policy corresponding to the fault type, sending the first fault comprehensive information to the operations and business support system (OSS/BSS).
  • The orchestrator performing fault repair or reporting according to the first fault comprehensive information includes: determining, according to the fault type in the first fault comprehensive information, whether the orchestrator includes a fault repair policy corresponding to the fault type; when the orchestrator includes a fault repair policy corresponding to the fault type, repairing, according to the fault repair policy, the fault of the first NFVI entity, of the NFVI entity associated with the first NFVI entity, and of the VNF entity associated with the first NFVI entity; or, when the orchestrator does not include a fault repair policy corresponding to the fault type, sending the first fault comprehensive information to the OSS/BSS.
  • Before performing fault repair or reporting, the method further includes: detecting, according to the first fault comprehensive information, whether the orchestrator already includes fault comprehensive information identical to the first fault comprehensive information; and, when the orchestrator includes fault comprehensive information identical to the first fault comprehensive information, deleting the first fault comprehensive information.
  • The first fault information further includes at least one of the following: an operating state and a fault time; and the first fault comprehensive information further includes fault status information.
  • The fault status includes at least one of unprocessed, processed, repaired, and unrepaired.
  • the fourth aspect provides a fault management method, including: the orchestrator receives the second fault comprehensive information sent by the virtual network function manager VNFM, where the second fault comprehensive information includes second fault information, and the second The fault information includes a fault entity identifier and a fault type, where the second fault information is used to indicate that the first virtual network function VNF entity having the fault entity identifier is faulty; the orchestrator performs fault according to the second fault comprehensive information. Repair or escalation processing.
  • the second fault comprehensive information further includes: fault information of a VNF entity associated with the first VNF entity; and/or with the first VNF entity
  • the associated virtualization infrastructure manages fault information for the NFVI entity.
  • the orchestration device performs fault repairing or reporting processing according to the second fault comprehensive information, including: according to the second fault comprehensive information a fault type, determining whether the orchestrator includes a fault repair policy corresponding to the fault type; and when the orchestrator includes a fault repair policy corresponding to the fault type, repairing the fault according to the fault repair policy Failure of the first VNF entity and/or the VNF entity associated with the first VNF entity; or to the operational and business support system OSS when the orchestrator does not include a fault repair strategy corresponding to the fault type /BSS sends the second fault comprehensive information.
  • the orchestration device performs fault repairing or reporting processing according to the second fault comprehensive information, including: according to the second fault comprehensive information a fault type, determining whether the orchestrator includes a fault repair policy corresponding to the fault type; and when the orchestrator includes a fault repair policy corresponding to the fault type, repairing the fault according to the fault repair policy Failure of the first VNF entity and the VNF entity associated with the first VNF entity and a failure of the NFVI entity associated with the first VNF entity; or the correlator does not include a type corresponding to the fault type
  • the fault repair strategy sends the second fault comprehensive information to the OSS/BSS.
  • the method further includes: detecting, according to the second fault comprehensive information, whether the orchestrator includes fault comprehensive information that is the same as the second fault comprehensive information; When the orchestrator includes the same fault comprehensive information as the second fault comprehensive information, the second fault comprehensive information is deleted.
  • The second fault information further includes at least one of the following: an operating state and a fault time; and the second fault comprehensive information further includes fault state information.
  • The fault state includes at least one of unprocessed, in process, repaired, and unrepaired.
  • The fifth aspect provides a virtualization infrastructure manager, including: an acquiring unit, configured to acquire first fault information of a network function virtualization infrastructure (NFVI) entity, where the first fault information includes a fault entity identifier and a fault type and is used to indicate that the first NFVI entity having the fault entity identifier is faulty; a generating unit, configured to generate first fault comprehensive information according to the first fault information, where the first fault comprehensive information includes the first fault information and the associated fault information of the first fault information; and a processing unit, configured to perform fault repair or reporting according to the first fault comprehensive information.
  • The manager further includes a determining unit and a receiving unit, where the acquiring unit is specifically configured to: receive, through the receiving unit, the first fault information sent by the first NFVI entity; or determine, through the determining unit, that the first NFVI entity is faulty, and generate the first fault information according to the fault of the first NFVI entity.
  • The first NFVI entity is any one of the hardware (HW), host operating system (Host OS), virtual machine manager, or virtual machine (VM) entities in the NFVI.
  • The generating unit is specifically configured to: determine, through the determining unit, that the fault information sent by the NFVI entity associated with the first NFVI entity is the associated fault information of the first fault information; and generate the first fault comprehensive information including the first fault information and the associated fault information.
  • The processing unit includes a sending unit, and the processing unit is specifically configured to: determine, through the determining unit and according to the fault type of the first fault information or the fault type of the associated fault information in the first fault comprehensive information, whether the VIM includes a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information; when the VIM includes such a fault repair policy, repair, according to the fault repair policy, the fault of the first NFVI entity and/or the NFVI entity associated with the first NFVI entity; or, when the VIM does not include such a fault repair policy, send, through the sending unit, the first fault comprehensive information to the VNFM or to the orchestrator.
  • The processing unit is specifically configured to: determine, through the determining unit, the highest-priority NFVI entity among the first NFVI entity and the NFVI entities associated with the first NFVI entity, where the priority of the HW is higher than that of the Host OS, the priority of the Host OS is higher than that of the virtual machine manager, and the priority of the virtual machine manager is higher than that of the VM; determine, through the determining unit and according to the fault type of the highest-priority NFVI entity, whether the VIM includes a corresponding fault repair policy; and when the VIM includes a fault repair policy corresponding to the fault type of the highest-priority NFVI entity, repair the fault of the highest-priority NFVI entity according to the fault repair policy.
  • The sending unit is specifically configured to: when the fault repair succeeds, send a success indication message to the orchestrator; or, when the fault repair fails, send the first fault comprehensive information to the VNFM or to the orchestrator.
  • The receiving unit is further configured to receive an indication message that is sent by the VNFM and indicates that the VNFM cannot process the first fault comprehensive information; and the sending unit is further configured to send the first fault comprehensive information to the orchestrator.
  • The processing unit is further configured to: request, from the VNFM, fault information of the VNF entity associated with the first NFVI entity; and add the fault information of the VNF entity associated with the first NFVI entity to the first fault comprehensive information.
  • The receiving unit is further configured to receive request information sent by the VNFM, where the request information is used to request, from the VIM, fault information of the NFVI entity associated with a faulty VNF entity; and the sending unit is further configured to send, to the VNFM, the fault information of the NFVI entity associated with the faulty VNF entity.
  • The manager further includes a detecting unit and a deleting unit, where the detecting unit is specifically configured to detect, according to the first fault comprehensive information, whether the VIM includes fault comprehensive information that is the same as the first fault comprehensive information; and the deleting unit is specifically configured to delete the first fault comprehensive information when the VIM includes fault comprehensive information that is the same as the first fault comprehensive information.
  • A virtual network function manager, including: an acquiring unit, configured to acquire second fault information of a virtual network function (VNF) entity, where the second fault information includes a fault entity identifier and a fault type and is used to indicate that the first VNF entity having the fault entity identifier is faulty; a generating unit, configured to generate second fault comprehensive information according to the second fault information; and a processing unit, configured to perform fault repair or reporting according to the second fault comprehensive information.
  • The manager further includes a determining unit and a receiving unit, where the acquiring unit is specifically configured to: receive, through the receiving unit, the second fault information sent by the first VNF entity; or determine, through the determining unit, that the first VNF entity is faulty, and generate, through the generating unit, the second fault information according to the fault of the first VNF entity.
  • The generating unit is specifically configured to: determine, through the determining unit, that the fault information sent by the VNF entity associated with the first VNF entity is the associated fault information of the second fault information; and generate the second fault comprehensive information including the second fault information and the associated fault information.
  • The processing unit includes a sending unit, and the processing unit is specifically configured to: determine, through the determining unit and according to the fault type of the second fault information or the fault type of the associated fault information in the second fault comprehensive information, whether the VNFM includes a fault repair policy corresponding to the fault type of the second fault information or the fault type of the associated fault information; when the VNFM includes such a fault repair policy, repair, according to the fault repair policy, the fault of the first VNF entity and/or the VNF entity associated with the first VNF entity; or, when the VNFM does not include such a fault repair policy, send, through the sending unit, the second fault comprehensive information to the orchestrator.
  • The sending unit is specifically configured to: when the fault repair succeeds, send a success indication message to the orchestrator; or, when the fault repair fails, send the second fault comprehensive information to the orchestrator.
  • The processing unit is further configured to: request, from the virtualization infrastructure manager (VIM), fault information of the NFVI entity associated with the first VNF entity, where the NFVI entity is any one of the hardware (HW), host operating system (Host OS), virtual machine manager, or virtual machine (VM) entities in the NFVI; and add the fault information of the NFVI entity associated with the first VNF entity to the second fault comprehensive information.
  • The processing unit is further configured to: receive the first fault comprehensive information sent by the VIM, where the first fault comprehensive information includes the first fault information and the associated fault information of the first fault information, and the first fault information is used to indicate that the first NFVI entity is faulty; determine whether the VNFM includes a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information in the first fault comprehensive information; when the VNFM includes such a fault repair policy, repair, according to the fault repair policy, the fault of the first NFVI entity and/or the NFVI entity associated with the first NFVI entity; or, when the VNFM does not include such a fault repair policy, send the first fault comprehensive information to the orchestrator, or send, to the VIM, an indication message indicating that the VNFM cannot process the first fault comprehensive information, so that the VIM sends the first fault comprehensive information to the orchestrator.
  • The processing unit is further configured to: determine, according to the first fault comprehensive information, fault information of the first VNF entity associated with the first NFVI entity and/or with the NFVI entity associated with the first NFVI entity; and add the fault information of the first VNF entity to the first fault comprehensive information, so that the VNFM can perform fault repair or reporting on the first fault comprehensive information.
  • The manager further includes a detecting unit and a deleting unit, where the detecting unit is specifically configured to detect, according to the second fault comprehensive information, whether the VNFM includes fault comprehensive information that is the same as the second fault comprehensive information; and the deleting unit is specifically configured to delete the second fault comprehensive information when the VNFM includes fault comprehensive information that is the same as the second fault comprehensive information.
  • The receiving unit is further configured to receive request information sent by the VIM, where the request information is used to request, from the VNFM, fault information of the VNF entity associated with a faulty NFVI entity; and the sending unit is further configured to send, to the VIM, the fault information of the VNF entity associated with the faulty NFVI entity.
  • An orchestrator, including: a receiving unit, configured to receive first fault comprehensive information sent by a virtualization infrastructure manager (VIM), where the first fault comprehensive information includes first fault information, the first fault information includes a fault entity identifier and a fault type, and the first fault information is used to indicate that the first network function virtualization infrastructure (NFVI) entity having the fault entity identifier is faulty; and a processing unit, configured to perform fault repair or reporting according to the first fault comprehensive information.
  • The first fault comprehensive information further includes: fault information of an NFVI entity associated with the first NFVI entity; and/or fault information of a virtual network function (VNF) entity associated with the first NFVI entity.
  • The processing unit is specifically configured to: determine, according to the fault type in the first fault comprehensive information, whether the orchestrator includes a fault repair policy corresponding to the fault type; when the orchestrator includes a fault repair policy corresponding to the fault type, repair, according to the fault repair policy, the fault of the first NFVI entity and/or the NFVI entity associated with the first NFVI entity; or, when the orchestrator does not include a fault repair policy corresponding to the fault type, send the first fault comprehensive information to the operations and business support system (OSS/BSS).
  • The processing unit is specifically configured to: determine, according to the fault type in the first fault comprehensive information, whether the orchestrator includes a fault repair policy corresponding to the fault type; when the orchestrator includes a fault repair policy corresponding to the fault type, repair, according to the fault repair policy, the fault of the first NFVI entity, the fault of the NFVI entity associated with the first NFVI entity, and the fault of the VNF entity associated with the first NFVI entity; or, when the orchestrator does not include a fault repair policy corresponding to the fault type, send the first fault comprehensive information to the OSS/BSS.
  • The orchestrator further includes a detecting unit and a deleting unit, where the detecting unit is configured to detect, according to the first fault comprehensive information, whether the orchestrator includes fault comprehensive information that is the same as the first fault comprehensive information; and the deleting unit is configured to delete the first fault comprehensive information when the orchestrator includes fault comprehensive information that is the same as the first fault comprehensive information.
  • An orchestrator, including: a receiving unit, configured to receive second fault comprehensive information sent by a virtual network function manager (VNFM), where the second fault comprehensive information includes second fault information, the second fault information includes a fault entity identifier and a fault type, and the second fault information is used to indicate that the first virtual network function (VNF) entity having the fault entity identifier is faulty; and a processing unit, configured to perform fault repair or reporting according to the second fault comprehensive information.
  • The second fault comprehensive information further includes: fault information of a VNF entity associated with the first VNF entity; and/or fault information of a network function virtualization infrastructure (NFVI) entity associated with the first VNF entity.
  • The processing unit is specifically configured to: determine, according to the fault type in the second fault comprehensive information, whether the orchestrator includes a fault repair policy corresponding to the fault type; when the orchestrator includes a fault repair policy corresponding to the fault type, repair, according to the fault repair policy, the fault of the first VNF entity and/or the VNF entity associated with the first VNF entity; or, when the orchestrator does not include a fault repair policy corresponding to the fault type, send the second fault comprehensive information to the operations and business support system (OSS/BSS).
  • The processing unit is specifically configured to: determine, according to the fault type in the second fault comprehensive information, whether the orchestrator includes a fault repair policy corresponding to the fault type; when the orchestrator includes a fault repair policy corresponding to the fault type, repair, according to the fault repair policy, the fault of the first VNF entity, the fault of the VNF entity associated with the first VNF entity, and the fault of the NFVI entity associated with the first VNF entity; or, when the orchestrator does not include a fault repair policy corresponding to the fault type, send the second fault comprehensive information to the OSS/BSS.
  • The orchestrator further includes a detecting unit and a deleting unit, where the detecting unit is configured to detect, according to the second fault comprehensive information, whether the orchestrator includes fault comprehensive information that is the same as the second fault comprehensive information; and the deleting unit is configured to delete the second fault comprehensive information when the orchestrator includes fault comprehensive information that is the same as the second fault comprehensive information.
  • Embodiments of the present invention provide a fault management method in which the fault information of hardware and software entities is acquired through the VIM and the VNFM, and fault information that has an association relationship is processed comprehensively, so that fault reporting and handling in the NFV environment can be realized.
  • FIG. 1 is a system architecture diagram of a network function virtualization NFV of the present invention.
  • FIG. 2 is a flow chart of a method of fault management in accordance with one embodiment of the present invention.
  • FIG. 3 is a flow chart of a method of fault management in accordance with one embodiment of the present invention.
  • FIG. 4 is a flow chart of a method of fault management in accordance with one embodiment of the present invention.
  • FIG. 5 is a flow chart of a method of fault management in accordance with one embodiment of the present invention.
  • FIG. 6a is an interaction diagram of a method of fault management in accordance with one embodiment of the present invention.
  • FIG. 6b is a schematic diagram of the association relationship between entities in an embodiment of the present invention.
  • FIG. 7 is an interaction diagram of a method of fault management according to another embodiment of the present invention.
  • FIG. 8 is an interaction diagram of a method of fault management according to another embodiment of the present invention.
  • FIG. 9 is an interaction diagram of a method of fault management according to another embodiment of the present invention.
  • FIG. 10 is an interaction diagram of a method of fault management according to another embodiment of the present invention.
  • FIG. 11 is a schematic block diagram of a virtualized infrastructure management VIM entity in accordance with one embodiment of the present invention.
  • FIG. 12 is a schematic block diagram of a virtual network function management VNFM entity in accordance with one embodiment of the present invention.
  • FIG. 13 is a schematic block diagram of an orchestrator Orchestrator entity in accordance with one embodiment of the present invention.
  • FIG. 14 is a schematic block diagram of a VIM entity in accordance with another embodiment of the present invention.
  • FIG. 15 is a schematic block diagram of a VNFM entity in accordance with another embodiment of the present invention.
  • FIG. 16 is a schematic block diagram of an Orchestrator entity in accordance with another embodiment of the present invention.
  • Detailed description
  • FIG. 1 is a system architecture diagram of a network function virtualization NFV of the present invention.
  • The Network Function Virtualization Infrastructure (NFVI) contains the underlying hardware (HW) resources, which can be divided into computing hardware, storage hardware, network hardware, and so on. Above the hardware layer is the virtualization layer, which includes the host operating system (Host OS) and the virtual machine manager (hypervisor). Multiple virtual machines (VMs) run on top of the virtualization layer. The HW and the hypervisor are connected to the operations and business support system (OSS/BSS) through an element management system (EMS). On top of the NFVI, multiple virtual network function (VNF) instances are connected to the OSS/BSS via the vEMS.
  • HW: hardware
  • Host OS: host operating system
  • EMS: element management system
  • the NFVI is connected to the Virtualization Infrastructure Manager (VIM) through the Nf-Vi interface.
  • the VNF is connected to the VNF Manager (VNFM) through the Ve-Vnfm interface, and the VIM and VNFM are connected through the Vi-Vnfm interface.
  • The NFVI is connected to the orchestrator (Orchestrator) through the Or-Vi interface.
  • The VNFM is connected to the Orchestrator through the Or-Vnfm interface.
  • The Orchestrator is connected to the OSS/BSS through the Os-Ma interface.
  • OSS/BSS is used to initiate service requests to Orchestrator.
  • Based on OSS/BSS service requests, the Orchestrator is responsible for orchestrating and managing resources, implementing NFV services, and monitoring VNF and NFVI resources and operational status information in real time.
  • The VNFM is responsible for VNF lifecycle management, such as startup, lifetime, and monitoring of the operational status information of the VNF.
  • The VIM is responsible for managing and allocating NFVI resources and for monitoring and collecting NFVI operational status information.
  • FIG. 2 is a flow chart of a method of fault management in accordance with one embodiment of the present invention. The method of Figure 2 is performed by VIM.
  • 201. The virtualization infrastructure manager (VIM) obtains first fault information of a network function virtualization infrastructure (NFVI) entity, where the first fault information includes a fault entity identifier and a fault type and is used to indicate that the first NFVI entity having the fault entity identifier is faulty.
  • 202. The VIM generates first fault comprehensive information according to the first fault information, where the first fault comprehensive information includes the first fault information and the associated fault information of the first fault information.
  • 203. The VIM performs fault repair or reporting according to the first fault comprehensive information.
  • the fault management method provided by the embodiment of the present invention obtains the fault information of the hardware and/or the software entity through the VIM, and comprehensively processes the fault information with the associated relationship, so that the fault reporting and processing in the NFV environment can be realized.
  • Step 201 includes: receiving the first fault information sent by the first NFVI entity; or determining that the first NFVI entity is faulty, and generating the first fault information according to the fault of the first NFVI entity. That is, the VIM can passively receive the fault information of the faulty entity, or can actively generate fault information after detecting the fault.
  • The first NFVI entity is any one of the hardware (HW), host operating system (Host OS), virtual machine manager, or virtual machine (VM) entities in the NFVI.
  • Step 202 includes: determining that the fault information sent by the NFVI entity associated with the first NFVI entity is the associated fault information of the first fault information; and generating the first fault comprehensive information including the first fault information and the associated fault information. Because some HW, Host OS, hypervisor, and VM entities are associated with one another, when the first NFVI entity fails, other NFVI entities associated with the first NFVI entity may also fail. The VIM can collect all related fault information and process it together.
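  • The aggregation in step 202 can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the association table `ASSOCIATIONS`, the function `build_comprehensive_info`, the dictionary keys, and the fault type strings are all hypothetical and are not defined in the original text.

```python
from typing import Dict, List, Set

# Hypothetical association table: NFVI entity id -> ids of associated NFVI entities.
ASSOCIATIONS: Dict[str, Set[str]] = {
    "HW1": {"HostOS1", "Hypervisor1", "VM1"},
    "HW2": {"HostOS2"},
}


def build_comprehensive_info(first_fault: dict, received: List[dict]) -> dict:
    """Combine the first fault with faults reported by associated entities."""
    related_ids = ASSOCIATIONS.get(first_fault["entity_id"], set())
    # Only fault information from associated entities counts as associated fault information.
    associated = [f for f in received if f["entity_id"] in related_ids]
    return {"first_fault": first_fault, "associated_faults": associated}


first = {"entity_id": "HW1", "fault_type": "hw-fault"}
reports = [
    {"entity_id": "VM1", "fault_type": "vm-fault"},
    {"entity_id": "HostOS2", "fault_type": "os-fault"},  # not associated with HW1
]
info = build_comprehensive_info(first, reports)
```

    In this sketch the report from `HostOS2` is left out because it is not associated with `HW1`; how associations are actually stored is not specified here.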
  • Step 203 includes: determining, according to the fault type of the first fault information or the fault type of the associated fault information in the first fault comprehensive information, whether the VIM includes a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information; when the VIM includes such a fault repair policy, repairing, according to the fault repair policy, the fault of the first NFVI entity and/or the NFVI entity associated with the first NFVI entity; or, when the VIM does not include such a fault repair policy, sending the first fault comprehensive information to the VNFM or to the orchestrator.
  • That is, after generating the fault comprehensive information, the VIM first determines whether it can process the fault comprehensive information locally. If it can, the VIM repairs one of the NFVI entities involved in the fault comprehensive information. If it cannot, or if the repair fails, the VIM reports the fault comprehensive information.
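  • The repair-or-report decision described above can be sketched in Python as follows. This is a minimal sketch, not the claimed implementation: the policy table `REPAIR_POLICIES`, the function `handle`, and the fault type names are hypothetical, and a repair policy is modelled simply as a callable that returns whether the repair succeeded.

```python
from typing import Callable, Dict, List

# Hypothetical local policy table: fault type -> repair action (True on success).
REPAIR_POLICIES: Dict[str, Callable[[str], bool]] = {
    "vm-crash": lambda entity_id: True,   # pretend this repair always succeeds
    "os-hang": lambda entity_id: False,   # pretend this repair attempt fails
}


def handle(comprehensive: dict, report: Callable[[dict], None]) -> str:
    """Repair locally when a policy matches a fault type; otherwise report."""
    faults: List[dict] = [comprehensive["first_fault"], *comprehensive["associated_faults"]]
    for fault in faults:
        policy = REPAIR_POLICIES.get(fault["fault_type"])
        if policy is None:
            continue
        if policy(fault["entity_id"]):
            return "repaired"
        report(comprehensive)  # a policy existed but the repair failed
        return "reported"
    report(comprehensive)  # no policy for any fault type in the information
    return "reported"


sent: List[dict] = []
result_ok = handle(
    {"first_fault": {"entity_id": "VM1", "fault_type": "vm-crash"}, "associated_faults": []},
    sent.append,
)
result_unknown = handle(
    {"first_fault": {"entity_id": "HW1", "fault_type": "disk-failure"}, "associated_faults": []},
    sent.append,
)
```

    Here `report` stands in for sending the fault comprehensive information to the VNFM or the orchestrator; which of the two is chosen is left open, as in the text.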
  • Determining whether the VIM includes a fault repair policy corresponding to the fault type includes: determining the highest-priority NFVI entity among the first NFVI entity and the NFVI entities associated with the first NFVI entity, where the priority of the HW is higher than that of the Host OS, the priority of the Host OS is higher than that of the virtual machine manager, and the priority of the virtual machine manager is higher than that of the VM; determining, according to the fault type of the highest-priority NFVI entity, whether the VIM includes a corresponding fault repair policy; and when the VIM includes a fault repair policy corresponding to the fault type of the highest-priority NFVI entity, repairing the fault of the highest-priority NFVI entity according to the fault repair policy.
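  • The priority rule (HW over Host OS over virtual machine manager over VM) lends itself to a small Python sketch. The names `PRIORITY`, `highest_priority`, `select_policy`, the `entity_kind` key, and the example fault types and policy name are assumptions made for illustration only.

```python
from typing import Dict, List, Optional

# Priority order stated in the text: HW > Host OS > virtual machine manager > VM.
PRIORITY: Dict[str, int] = {"HW": 3, "HostOS": 2, "Hypervisor": 1, "VM": 0}


def highest_priority(faults: List[dict]) -> dict:
    """Return the fault record whose entity kind has the highest priority."""
    return max(faults, key=lambda f: PRIORITY[f["entity_kind"]])


def select_policy(faults: List[dict], policies: Dict[str, str]) -> Optional[str]:
    """Look up a repair policy for the fault type of the highest-priority entity only."""
    top = highest_priority(faults)
    return policies.get(top["fault_type"])


faults = [
    {"entity_id": "VM1", "entity_kind": "VM", "fault_type": "vm-fault"},
    {"entity_id": "HW1", "entity_kind": "HW", "fault_type": "hw-fault"},
    {"entity_id": "HostOS1", "entity_kind": "HostOS", "fault_type": "os-fault"},
]
top = highest_priority(faults)
```

    With a policy table that only covers `vm-fault`, `select_policy` returns `None` for this input, because the lookup is made for the HW fault rather than the VM fault.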
  • The method may further include: sending a success indication message to the orchestrator when the fault repair succeeds; or sending the first fault comprehensive information to the VNFM or to the orchestrator when the fault repair fails.
  • the success indication message may be fault information with the running status set to "normal", or other forms of information indicating that the repair is successful. The invention is not limited thereto.
  • The method further includes: receiving an indication message that is sent by the VNFM and indicates that the VNFM cannot process the first fault comprehensive information; and sending the first fault comprehensive information to the orchestrator. That is, if the VIM cannot process the first fault comprehensive information and reports it to the VNFM, and the VNFM cannot process it either, the first fault comprehensive information continues to be reported to the Orchestrator.
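  • The reporting chain can be traced with a small Python sketch. It assumes, for illustration only, that the VIM always tries the VNFM before the orchestrator (the text also allows sending to the orchestrator directly) and that the orchestrator falls back to the OSS/BSS when it has no matching policy; the function name `escalation_path` and its boolean parameters are hypothetical.

```python
from typing import List


def escalation_path(vim_can: bool, vnfm_can: bool, orchestrator_can: bool) -> List[str]:
    """List the entities that see the first fault comprehensive information, in order."""
    path = ["VIM"]
    if vim_can:
        return path
    path.append("VNFM")          # assumption: the VIM reports to the VNFM first
    if vnfm_can:
        return path
    path.append("VIM")           # the VNFM indicates to the VIM that it cannot process it
    path.append("Orchestrator")  # the VIM then sends the information to the orchestrator
    if orchestrator_can:
        return path
    path.append("OSS/BSS")       # no policy at the orchestrator either
    return path


worst_case = escalation_path(False, False, False)
```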
  • Before sending the first fault comprehensive information to the orchestrator, the method further includes: requesting, from the VNFM, fault information of the VNF entity associated with the first NFVI entity; and adding the fault information of the VNF entity associated with the first NFVI entity to the first fault comprehensive information.
  • That is, the VIM may request, from the VNFM, the fault information of the VNF entity associated with the faulty NFVI entity and report it together, so that the upper-layer management entity can perform comprehensive processing.
  • The method further includes: receiving request information sent by the VNFM, where the request information is used to request, from the VIM, fault information of the NFVI entity associated with a faulty VNF entity; and sending, to the VNFM, the fault information of the NFVI entity associated with the faulty VNF entity.
  • That is, when the VNFM cannot process the fault comprehensive information of the VNF entity, it can request the related NFVI fault information from the VIM and report it together, so that the upper-layer management entity can perform comprehensive processing.
  • The method further includes: detecting, according to the first fault comprehensive information, whether the VIM includes fault comprehensive information that is the same as the first fault comprehensive information; and when the VIM includes fault comprehensive information that is the same as the first fault comprehensive information, deleting the first fault comprehensive information.
  • The VIM may obtain multiple identical pieces of fault comprehensive information.
  • "Identical" here means that the fault information content in the fault comprehensive information is the same.
  • In this case, the VIM can perform duplicate alarm detection.
  • Fault comprehensive information that is being processed continues to be processed, and identical fault comprehensive information that has not been processed is deleted.
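  • Duplicate alarm detection of this kind can be sketched in Python by comparing only the fault information content of two records. The helper names `fault_key` and `deduplicate`, and the choice of entity identifier plus fault type as the compared content, are assumptions for illustration; the text does not say which fields are compared.

```python
from typing import List, Tuple


def fault_key(info: dict) -> Tuple[Tuple[str, str], ...]:
    """Identity of a record: its fault information content, ignoring processing state."""
    faults = [info["first_fault"], *info["associated_faults"]]
    return tuple(sorted((f["entity_id"], f["fault_type"]) for f in faults))


def deduplicate(stored: List[dict], incoming: dict) -> bool:
    """Keep `incoming` unless a record with the same content is already stored.

    Returns True if the record was stored, False if it was dropped as a duplicate.
    """
    if any(fault_key(record) == fault_key(incoming) for record in stored):
        return False  # the record already being handled stays; the new copy is discarded
    stored.append(incoming)
    return True


store: List[dict] = []
a = {"first_fault": {"entity_id": "HW1", "fault_type": "hw-fault"}, "associated_faults": [], "state": "in process"}
b = {"first_fault": {"entity_id": "HW1", "fault_type": "hw-fault"}, "associated_faults": [], "state": "unprocessed"}
kept_first = deduplicate(store, a)
kept_second = deduplicate(store, b)
```

    The two records differ only in their processing state, so the second one is treated as a duplicate and the record already in process is kept.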
  • the first fault information is further used to report to the operation and service support system OSS/BSS, so that the OSS/BSS monitors and presents the first fault information.
  • The first fault information further includes at least one of the following: an operating state and a fault time; the first fault comprehensive information further includes fault state information, where the fault state includes at least one of unprocessed, in process, repaired, and unrepaired.
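  • The fields named above can be pictured as a pair of records. The following Python sketch is illustrative only: the class names `FaultInfo`, `ComprehensiveFaultInfo`, and `FaultState`, the field names, the use of epoch seconds for the fault time, and the example values are assumptions that do not come from the original text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class FaultState(Enum):
    # The four fault states named in the text.
    UNPROCESSED = "unprocessed"
    IN_PROCESS = "in process"
    REPAIRED = "repaired"
    UNREPAIRED = "unrepaired"


@dataclass(frozen=True)
class FaultInfo:
    entity_id: str                        # fault entity identifier
    fault_type: str                       # fault type
    running_state: Optional[str] = None   # optional operating state
    fault_time: Optional[float] = None    # optional fault time (assumed: epoch seconds)


@dataclass
class ComprehensiveFaultInfo:
    first_fault: FaultInfo
    associated_faults: List[FaultInfo] = field(default_factory=list)
    state: FaultState = FaultState.UNPROCESSED


example = ComprehensiveFaultInfo(
    first_fault=FaultInfo("HW1", "hw-fault", fault_time=0.0),
    associated_faults=[FaultInfo("VM1", "vm-fault")],
)
```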
  • the fault management method provided by the embodiment of the present invention acquires the fault information of the hardware and/or the software entity through the VIM, and comprehensively processes the fault information with the associated relationship, so that the fault reporting and processing in the NFV environment can be realized.
  • In addition, because associated fault information is processed together and identical fault comprehensive information is deleted through duplicate alarm detection, the efficiency and accuracy of fault handling are improved.
  • FIG. 3 is a flow chart of a method of fault management in accordance with one embodiment of the present invention. The method of Figure 3 is performed by VNFM.
  • 301. The virtual network function manager (VNFM) obtains second fault information of a virtual network function (VNF) entity, where the second fault information includes a fault entity identifier and a fault type and is used to indicate that the first VNF entity having the fault entity identifier is faulty.
  • 302. The VNFM generates second fault comprehensive information according to the second fault information.
  • 303. The VNFM performs fault repair or reporting according to the second fault comprehensive information.
  • the fault management method provided by the embodiment of the present invention obtains the fault information of the hardware and/or the software entity through the VNFM, and comprehensively processes the fault information with the associated relationship, so that the fault reporting and processing in the NFV environment can be realized.
  • Step 301 includes: receiving the second fault information sent by the first VNF entity; or determining that the first VNF entity is faulty, and generating the second fault information according to the fault of the first VNF entity. That is, the VNFM can passively receive the fault information of the faulty entity, or can actively generate fault information after detecting the fault.
  • Step 302 includes: determining that the fault information sent by the VNF entity associated with the first VNF entity is the associated fault information of the second fault information; and generating the second fault comprehensive information including the second fault information and the associated fault information. Because VNF entities may be associated with one another, when the first VNF entity fails, other VNF entities associated with the first VNF entity may also fail.
  • The VNFM can collect all related fault information and process it together.
  • Step 303 includes: determining, according to the fault type of the second fault information or the fault type of the associated fault information in the second fault comprehensive information, whether the VNFM includes a fault repair policy corresponding to the fault type of the second fault information or the fault type of the associated fault information; when the VNFM includes such a fault repair policy, repairing, according to the fault repair policy, the fault of the first VNF entity and/or the VNF entity associated with the first VNF entity; or, when the VNFM does not include such a fault repair policy, sending the second fault comprehensive information to the orchestrator.
  • the VNFM first needs to determine whether the VNFM can process the fault comprehensive information locally. If it can be processed, repair one of the VNF entities involved in the fault comprehensive information. If it cannot be processed or the repair fails, it will be reported.
  • The method further includes: sending a success indication message to the orchestrator when the fault repair succeeds; or sending the second fault comprehensive information to the orchestrator when the fault repair fails.
  • the success indication message may be fault information with the running status set to "normal", or may be other forms of information indicating that the repair is successful. The invention is not limited thereto.
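  • The repair-or-report decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `FaultInfo`, `ComprehensiveInfo`, `handle_at_vnfm`, the message dictionaries, and the idea of representing a fault repair strategy as a mapping from fault type to a callable are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class FaultInfo:
    entity_id: str   # faulty entity identifier
    fault_type: str  # e.g. "low performance"


@dataclass
class ComprehensiveInfo:
    faults: List[FaultInfo]
    status: str = "unprocessed"


def handle_at_vnfm(
    info: ComprehensiveInfo,
    policy: Dict[str, Callable[[str], bool]],
    notify_orchestrator: Callable[[dict], None],
) -> str:
    """Repair locally when a strategy exists, otherwise report upward.

    `policy` (hypothetical) maps a fault type to a repair callable that
    returns True when the repair succeeded.
    """
    for fault in info.faults:
        repair = policy.get(fault.fault_type)
        if repair is None:
            continue  # no strategy for this fault type, try the next one
        if repair(fault.entity_id):
            # Repair succeeded: send a success indication message.
            notify_orchestrator({"kind": "success", "entity": fault.entity_id})
            return "repaired"
        break  # a repair was attempted but failed
    # No matching strategy, or the repair failed: report the whole record.
    notify_orchestrator({"kind": "fault_comprehensive_info", "info": info})
    return "escalated"
```

  • For example, calling `handle_at_vnfm` with an empty `policy` always reports the fault comprehensive information to the orchestrator, which matches the case where the VNFM holds no strategy for the fault type.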
  • Before sending the second fault comprehensive information to the orchestrator, the method further includes: requesting, from the virtualized infrastructure manager VIM, fault information of the NFVI entity associated with the first VNF entity, where the NFVI entity is any one of the hardware HW, host operating system Host OS, virtual machine manager Hypervisor, or virtual machine VM entities in the NFVI; and adding the fault information of the NFVI entity associated with the first VNF entity to the second fault comprehensive information.
  • That is to say, the VNFM may initiate a request to the VIM to obtain the fault information of the NFVI entity associated with the failed VNF entity and report it together with the VNF fault information, so that the upper-layer management entity can process the fault comprehensively.
  • The method further includes: receiving first fault comprehensive information sent by the VIM, where the first fault comprehensive information includes first fault information and associated fault information of the first fault information, and the first fault information is used to indicate that the first NFVI entity has failed; determining whether the VNFM includes a fault repair strategy corresponding to the fault type of the first fault information or the fault type of the associated fault information in the first fault comprehensive information; when the VNFM includes such a fault repair strategy, repairing the fault of the first NFVI entity and/or of the NFVI entity associated with the first NFVI entity according to the fault repair strategy; or, when the VNFM does not include such a fault repair strategy, sending the first fault comprehensive information to the orchestrator, or sending to the VIM an indication message indicating that the VNFM cannot process the first fault comprehensive information, so that the VIM sends the first fault comprehensive information to the orchestrator.
  • That is to say, when the VIM cannot process the first fault comprehensive information of the NFVI entity, or its repair fails, it reports the first fault comprehensive information to the VNFM. If the VNFM cannot process it either, or its repair fails, the VNFM notifies the VIM, so that the VIM reports the first fault comprehensive information to the Orchestrator.
  • The method further includes: determining, according to the first fault comprehensive information, fault information of a first VNF entity associated with the first NFVI entity and/or with the NFVI entity associated with the first NFVI entity; and adding the fault information of the first VNF entity to the first fault comprehensive information, so that the VNFM repairs or reports the first fault comprehensive information.
  • The method further includes: detecting, according to the second fault comprehensive information, whether the VNFM already holds fault comprehensive information identical to the second fault comprehensive information; and, when it does, deleting the second fault comprehensive information.
  • The VNFM may obtain multiple pieces of identical fault comprehensive information, where identical means that the fault information content in the fault comprehensive information is the same. In this case, the VNFM can perform repeated-alarm detection.
  • Fault comprehensive information that is already being processed continues to be processed, and identical fault comprehensive information that has not yet been processed is deleted.
  • the method further includes: receiving request information sent by the VIM, where the request information is used to request, from the VNFM, fault information of the VNF entity associated with the failed NFVI entity; sending the failed NFVI to the VIM Fault information of the VNF entity associated with the entity.
  • the second fault information is further used to report to the operation and service support system OSS/BSS, so that the OSS/BSS monitors and presents the second fault information.
  • The second fault information further includes at least one of the following: running state, fault time.
  • The second fault comprehensive information further includes fault state information, and the fault state includes at least one of unprocessed, in process, repaired, and unrepaired.
  • the fault management method provided by the embodiment of the present invention obtains the fault information of the hardware and/or the software entity through the VNFM, and comprehensively processes the fault information with the associated relationship, so that the fault reporting and processing in the NFV environment can be realized.
  • In addition, because the associated fault information is processed comprehensively and identical fault comprehensive information is deleted by repeated-alarm detection, the efficiency and accuracy of fault handling are improved.
  • FIG. 4 is a flow chart of a method of fault management in accordance with one embodiment of the present invention. The method in Figure 4 is performed by Orchestrator.
  • 401. The orchestrator receives the first fault comprehensive information sent by the virtualized infrastructure manager VIM. The first fault comprehensive information includes the first fault information; the first fault information includes a faulty entity identifier and a fault type, and is used to indicate that the first network function virtualization infrastructure NFVI entity having the faulty entity identifier has failed.
  • 402. The orchestrator performs fault repair or reporting according to the first fault comprehensive information.
  • the fault management method provided by the embodiment of the present invention obtains the fault information of the hardware and/or the software entity through the Orchestrator, and comprehensively processes the fault information with the associated relationship, so that the fault reporting and processing in the NFV environment can be realized.
  • the first fault comprehensive information further includes: fault information of the NFVI entity associated with the first NFVI entity; and/or fault information of the virtual network function VNF entity associated with the first NFVI entity .
  • That is to say, the fault comprehensive information that the Orchestrator obtains from the VIM can contain the fault information of the NFVI entity alone, or the fault information of the NFVI entity together with that of the related VNF entity.
  • Step 402 includes: determining, according to the fault type in the first fault comprehensive information, whether the orchestrator includes a fault repair strategy corresponding to the fault type; when the orchestrator includes a fault repair strategy corresponding to the fault type, repairing the fault of the first NFVI entity and/or of the NFVI entity associated with the first NFVI entity according to the fault repair strategy; or, when the orchestrator does not include a fault repair strategy corresponding to the fault type, sending the first fault comprehensive information to the operation and business support system OSS/BSS.
  • Alternatively, step 402 includes: determining, according to the fault type in the first fault comprehensive information, whether the orchestrator includes a fault repair strategy corresponding to the fault type; when the orchestrator includes a fault repair strategy corresponding to the fault type, repairing the fault according to the fault repair strategy; and, when the repair fails, sending the first fault comprehensive information to the OSS/BSS.
  • The method further includes: detecting, according to the first fault comprehensive information, whether the orchestrator already holds fault comprehensive information identical to the first fault comprehensive information; and, when it does, deleting the first fault comprehensive information.
  • the Orchestrator obtains multiple identical fault comprehensive information, where the same refers to the same fault information content in the fault comprehensive information. Orchestrator can perform repeated alarm detection. The fault comprehensive information that is being processed continues to be processed, and the same fault comprehensive information that has not been processed is deleted.
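  • Repeated-alarm detection of this kind can be sketched as below. This is only an illustration under assumptions: the dictionary layout, the field names, and the function names `content_key` and `detect_repeated_alarm` are hypothetical, and the sketch additionally assumes that an identical record that is still "unprocessed" is also enough to discard a new arrival.

```python
def content_key(info):
    """Build a comparison key from the fault information content only.

    The record identifier and the fault status are deliberately ignored,
    so two records describing the same faults compare as identical.
    """
    return frozenset((f["entity_id"], f["fault_type"]) for f in info["faults"])


def detect_repeated_alarm(new_info, stored):
    """Keep `new_info` unless an identical, still-open record is stored.

    Returns True when the record was kept and False when it was discarded.
    """
    key = content_key(new_info)
    for old in stored:
        still_open = old["status"] in ("unprocessed", "in process")
        if still_open and content_key(old) == key:
            return False  # identical record already pending or being handled
    stored.append(new_info)
    return True
```

  • Comparing on a `frozenset` makes the check independent of the order in which the per-entity fault information was collected. A record whose status is already "repaired" does not suppress a new identical alarm, since that would hide a recurrence of the fault.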
  • The first fault information further includes at least one of the following: running state, fault time. The first fault comprehensive information further includes fault state information, where the fault state includes at least one of unprocessed, in process, repaired, and unrepaired.
  • the fault management method provided by the embodiment of the present invention receives the fault comprehensive information reported by the VIM through the Orchestrator, and comprehensively processes the fault information with the associated relationship, so that the fault reporting and processing in the NFV environment can be realized.
  • In addition, because the associated fault information is processed comprehensively and identical fault comprehensive information is deleted by repeated-alarm detection, the efficiency and accuracy of fault handling are improved.
  • Figure 5 is a flow diagram of a method of fault management in accordance with one embodiment of the present invention.
  • The method of Figure 5 is performed by the Orchestrator.
  • 501. The orchestrator receives the second fault comprehensive information sent by the virtual network function manager VNFM. The second fault comprehensive information includes the second fault information; the second fault information includes a faulty entity identifier and a fault type, and is used to indicate that the first virtual network function VNF entity having the faulty entity identifier has failed.
  • 502. The orchestrator performs fault repair or reporting according to the second fault comprehensive information.
  • the fault management method provided by the embodiment of the present invention obtains the fault information of the hardware and/or the software entity through the Orchestrator, and comprehensively processes the fault information with the associated relationship, so that the fault reporting and processing in the NFV environment can be realized.
  • The second fault comprehensive information further includes: fault information of the VNF entity associated with the first VNF entity; and/or fault information of the network function virtualization infrastructure NFVI entity associated with the first VNF entity. That is to say, the fault comprehensive information that the Orchestrator obtains from the VNFM may include fault information of the VNF entity, fault information of the NFVI entity, or both.
  • Step 502 includes: determining, according to the fault type in the second fault comprehensive information, whether the orchestrator includes a fault repair strategy corresponding to the fault type; when the orchestrator includes a fault repair strategy corresponding to the fault type, repairing the fault of the first VNF entity and/or of the VNF entity associated with the first VNF entity according to the fault repair strategy; or, when the orchestrator does not include a fault repair strategy corresponding to the fault type, sending the second fault comprehensive information to the operation and business support system OSS/BSS.
  • Alternatively, step 502 includes: determining, according to the fault type in the second fault comprehensive information, whether the orchestrator includes a fault repair strategy corresponding to the fault type; when the orchestrator includes a fault repair strategy corresponding to the fault type, repairing the fault according to the fault repair strategy; and, when the repair fails, sending the second fault comprehensive information to the OSS/BSS.
  • The method further includes: detecting, according to the second fault comprehensive information, whether the orchestrator already holds fault comprehensive information identical to the second fault comprehensive information; and, when it does, deleting the second fault comprehensive information.
  • the Orchestrator obtains multiple identical fault comprehensive information, where the same refers to the same fault information content in the fault comprehensive information. Orchestrator can perform repeated alarm detection. The fault comprehensive information that is being processed continues to be processed, and the same fault comprehensive information that has not been processed is deleted.
  • The second fault information further includes at least one of the following: running state, fault time. The second fault comprehensive information further includes fault state information, where the fault state includes at least one of unprocessed, in process, repaired, and unrepaired.
  • the fault management method provided by the embodiment of the present invention receives the fault comprehensive information reported by the VNFM through the Orchestrator, and comprehensively processes the fault information with the associated relationship, so that the fault reporting and processing in the NFV environment can be realized.
  • In addition, because the associated fault information is processed comprehensively and identical fault comprehensive information is deleted by repeated-alarm detection, the efficiency and accuracy of fault handling are improved.
  • Figure 6a is an interaction diagram of a method of fault management in accordance with one embodiment of the present invention. The method shown in Figure 6a can be performed by the NFV system shown in Figure 1.
  • VIM obtains fault information.
  • When the VIM detects that any HW, Host OS, Hypervisor, or VM entity in the NFVI has failed, the VIM obtains the fault information of the failed NFVI entity. Specifically, the fault information may be generated by the faulty NFVI entity and reported to the VIM, or may be generated locally by the VIM according to the detected fault.
  • The VIM can detect the failure of an NFVI entity in several ways:
  • Method 1: the first NFVI entity may be any HW, Host OS, Hypervisor, or VM entity in the NFVI.
  • the entity may include a hardware entity or a software entity.
  • When the first NFVI entity fails, it generates fault information. The fault information includes at least a faulty entity identifier that uniquely identifies the first NFVI entity; from this identifier, the physical location of the failed first NFVI entity, or its position in the topology, can be uniquely determined.
  • the fault information also includes a fault identifier for uniquely identifying a fault message.
  • the fault information also contains a fault type that indicates the cause of the fault, such as overload, power outage, memory leak, port error, no fault, and so on.
  • the fault information may also include an operating state and a fault time. The running state is used to mark whether the first NFVI entity is currently operating normally, and the fault time may be used to record the time when the fault occurred.
  • The format of the fault information can be as shown in Table 1 (Fault information).
  • After the first NFVI entity generates the fault information in the above format, it can send the fault information to the VIM through the Nf-Vi interface. Optionally, the fault information can also be sent through the EMS to the OSS/BSS for management, recording, and presentation.
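  • As a rough illustration of the fields listed above (fault identifier, faulty entity identifier, fault type, running state, fault time), a fault information record might be modeled as follows. Table 1 itself is not reproduced in this text, so the field names, the `new_fault` helper, and the running-state value "faulty" are assumptions made for the sketch; only "normal" and "no fault" come from the description.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class FaultInfo:
    fault_id: str                  # uniquely identifies this fault information
    entity_id: str                 # uniquely identifies the faulty entity
    fault_type: str                # e.g. "overload", "power outage", "no fault"
    running_state: str = "faulty"  # assumed value; the text only names "normal"
    fault_time: Optional[datetime] = None


def new_fault(entity_id: str, fault_type: str) -> FaultInfo:
    """Create a record; a "no fault" report carries the state "normal"."""
    no_fault = fault_type == "no fault"
    return FaultInfo(
        fault_id=uuid.uuid4().hex,
        entity_id=entity_id,
        fault_type=fault_type,
        running_state="normal" if no_fault else "faulty",
        fault_time=None if no_fault else datetime.now(timezone.utc),
    )
```

  • The record is frozen so that a reported fault cannot be altered after it has been handed to the manager; a status change would be expressed by a new report.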
  • Method 2: the VIM may send an indication message to the first NFVI entity, periodically or when needed, instructing the first NFVI entity to perform fault detection. If the first NFVI entity detects a fault, it may return to the VIM fault information in the format of Table 1. If the first NFVI entity has no fault, it may return no message, or may return fault information in the format of Table 1 with the fault type "no fault" and the running state "normal".
  • Method 3: the first NFVI entity may periodically send a heartbeat indication message to the VIM, indicating that the first NFVI entity is operating normally.
  • The VIM periodically receives the heartbeat of the first NFVI entity and thereby knows that the first NFVI entity is working properly. When the heartbeat of the first NFVI entity is interrupted, the VIM determines that the first NFVI entity has failed.
  • The VIM can then generate the fault information of the first NFVI entity.
  • The specific format is similar to the fault information in Table 1 above and is not repeated here.
  • In this way, the VIM can still detect the failure of the first NFVI entity promptly.
  • Method 4: the VIM can perform fault detection on the NFVI periodically or when necessary.
  • The VIM generates the fault information of the first NFVI entity according to the fault detection result.
  • The specific format is similar to the fault information in Table 1 above and is not repeated here.
  • The VIM can detect the fault of an NFVI entity by any one of the above methods or by a combination of several of them. For example, method 1 and method 3 can be combined: the NFVI entity periodically sends a heartbeat to the VIM and, when a fault occurs, sends fault information to the VIM. If the fault prevents the NFVI entity from reporting the fault information, the VIM can still detect the failure of the NFVI entity from the interrupted heartbeat.
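  • Combining explicit fault reports with heartbeat supervision can be sketched as below. The class `HeartbeatMonitor`, its method names, the timeout parameter, and the fault type string "heartbeat lost" are hypothetical; the description does not specify a timeout or a fault type for a missed heartbeat.

```python
class HeartbeatMonitor:
    """Tracks heartbeats and explicit fault reports per entity."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_seen = {}  # entity id -> time of the last heartbeat
        self.reported = {}   # entity id -> fault type reported by the entity

    def on_heartbeat(self, entity_id: str, now: float) -> None:
        self.last_seen[entity_id] = now

    def on_fault_report(self, entity_id: str, fault_type: str) -> None:
        self.reported[entity_id] = fault_type

    def failed_entities(self, now: float) -> dict:
        """Return entity id -> fault type for every entity considered failed.

        An explicit report wins; an entity whose heartbeat is older than
        the timeout is added with the assumed type "heartbeat lost".
        """
        failed = dict(self.reported)
        for entity_id, seen in self.last_seen.items():
            if now - seen > self.timeout_s and entity_id not in failed:
                failed[entity_id] = "heartbeat lost"
        return failed
```

  • Passing the current time in as `now` keeps the sketch deterministic; a real manager would more likely read a monotonic clock.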
  • VIM generates fault comprehensive information.
  • After the VIM receives the fault information sent by the first NFVI entity, or generates the fault information according to the fault of the first NFVI entity, the VIM needs to collect the fault information of the other NFVI entities associated with the first NFVI entity and generate fault comprehensive information, so that the faults can be processed together.
  • FIG. 6b exemplarily shows the association relationship between HW, Host OS, Hypervisor, VM entities.
  • In this example, the entities associated with HW1 include Host OS1, Hypervisor1, VM1, and VM2. That is to say, when HW1 fails, the virtualized entities established on it (Host OS1, Hypervisor1, VM1, and VM2) will also fail. In this case, the VIM can collect the fault information reported by Host OS1, Hypervisor1, VM1, and VM2 and combine it with the fault information of HW1 to generate fault comprehensive information.
  • fault comprehensive information as shown in Table 2 can be generated:
  • the fault information format of the HW, Host OS, Hypervisor, and VM entities is similar to that in Table 1.
  • The fault comprehensive information identifier is used to uniquely identify one piece of fault comprehensive information. It should be understood that the fault comprehensive information shown in Table 2 is only a specific example; which entities' fault information is included in the fault comprehensive information is determined by the association relationship. The fault status can be set to "unprocessed" when the fault comprehensive information is generated.
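  • One way to picture how fault comprehensive information is assembled from the association relationship is the sketch below. The layered parent-to-child map `CHILDREN` is an assumed encoding of the HW1 example, and `associated`, `build_comprehensive_info`, and the dictionary fields are hypothetical names.

```python
import itertools

# Assumed encoding of the example topology: each entity maps to the
# entities established directly on top of it.
CHILDREN = {
    "HW1": ["Host OS1"],
    "Host OS1": ["Hypervisor1"],
    "Hypervisor1": ["VM1", "VM2"],
}

_next_id = itertools.count(1)


def associated(entity_id, children=CHILDREN):
    """Return every entity established on `entity_id`, breadth first."""
    found, queue = [], list(children.get(entity_id, []))
    while queue:
        current = queue.pop(0)
        found.append(current)
        queue.extend(children.get(current, []))
    return found


def build_comprehensive_info(first_fault, reported):
    """Combine the first fault with reported faults of associated entities.

    `reported` maps an entity id to the fault information it reported.
    """
    faults = [first_fault]
    for entity_id in associated(first_fault["entity_id"]):
        if entity_id in reported:
            faults.append(reported[entity_id])
    return {
        "comprehensive_id": next(_next_id),
        "status": "unprocessed",  # initial fault status on generation
        "faults": faults,
    }
```

  • Fault information from entities outside the association relationship is left out of the record, so unrelated faults are handled separately.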
  • After the VIM generates fault comprehensive information, it can run repeated-alarm detection locally to determine whether identical fault comprehensive information already exists. Specifically, after one NFVI entity fails, every faulty NFVI entity associated with it also reports fault information, so the VIM is likely to generate multiple pieces of identical fault comprehensive information for the same fault. For example, when HW1 fails, Host OS1, Hypervisor1, VM1, and VM2, which are associated with HW1, also fail and report in the same way as HW1. After collecting the associated fault information, the VIM generates multiple pieces of identical fault comprehensive information. In this case, only one of them needs to be processed, and the other identical pieces can be discarded. It should be understood that identical fault comprehensive information here means that the HW, Host OS, Hypervisor, and VM fault information is the same; the fault identifier and the fault status can differ.
  • Whether fault comprehensive information is retained or discarded can be decided by its fault state.
  • For example, the fault state of fault comprehensive information that has just been generated is "unprocessed". Repeated-alarm detection is performed on it, and if identical fault comprehensive information whose fault state is "in process" is found, the unprocessed fault comprehensive information is discarded, and processing of the fault in the fault comprehensive information whose fault state is "in process" continues.
  • 604. VIM self-healing judgment
  • the VIM can first determine whether the fault type in the fault comprehensive information is the fault type that the VIM can handle.
  • the VIM has a fault repairing strategy, where the fault repairing strategy includes a mapping relationship between the fault entity identifier, the fault type, and the fault repairing method. It can be determined whether processing can be performed by judging whether the fault type in the fault comprehensive information exists in the fault repair strategy. For example, the fault type of HW1 is "low performance" and the corresponding fault repair method is "restart".
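  • The mapping held in the fault repair strategy can be pictured as a simple lookup table. The sketch below is illustrative only: the table `REPAIR_POLICY`, the function names, and the dictionary fields are hypothetical, and the single entry is the HW1 example from the text.

```python
# Assumed layout: (faulty entity identifier, fault type) -> repair method.
REPAIR_POLICY = {
    ("HW1", "low performance"): "restart",
}


def lookup_repair_method(fault, policy=REPAIR_POLICY):
    """Return the repair method for a fault, or None if none is configured."""
    return policy.get((fault["entity_id"], fault["fault_type"]))


def can_self_heal(info, policy=REPAIR_POLICY):
    """True when at least one fault in the record has a configured method."""
    return any(lookup_repair_method(f, policy) is not None for f in info["faults"])
```

  • A lookup that returns `None` corresponds to the case where the fault type is not present in the fault repair strategy, so the manager cannot process the fault locally.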
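  • When the fault comprehensive information contains fault information of several NFVI entities, the VIM can decide, based on the priority of the NFVI entities, which entity's fault type to use for the self-healing judgment.
  • The priority is: HW is higher than Host OS, Host OS is higher than Hypervisor, and Hypervisor is higher than VM.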
  • For example, as in Table 2, when the fault comprehensive information includes fault information of HW1, Host OS1, Hypervisor1, VM1, and VM2, the VIM can handle the fault of HW1 first; that is, according to the fault type "low performance" in the fault information of HW1, it determines the fault repair method "restart".
  • Fault repair methods may include, but are not limited to: restarting the hardware device, reloading software (Host OS, Hypervisor, etc.), migrating the VM, reloading the VNF installation software, re-instantiating the VNF, adding a VNF instance, migrating the VNF (that is, reallocating resources to the VNF), and re-instantiating the VNF Forwarding Graph.
  • 605a. The VIM is capable of self-healing
  • The NFVI entity is repaired according to the fault repair method. If the repair succeeds and the faults of the associated NFVI entities are also cleared, the VIM notifies the Orchestrator that the repair succeeded and terminates the fault repair process.
  • If the repair of the NFVI entity handled first succeeds but faults of other associated NFVI entities still exist, step 604 is repeated: among the remaining NFVI entities that are still faulty, the one with the highest priority is judged and repaired. This continues until the faults of all NFVI entities in the fault comprehensive information are repaired; the VIM then notifies the Orchestrator that the repair succeeded and terminates the fault repair process.
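  • The repeat-until-clear behaviour of steps 604 and 605a can be sketched as a loop over the remaining faulty entities in priority order. The `kind` field, the `PRIORITY` table (a total order HW, Host OS, Hypervisor, VM, which is an assumption of this sketch), and the `try_repair` callback that returns the set of entity identifiers whose faults were cleared are all hypothetical.

```python
# Lower number means higher priority (assumed total order).
PRIORITY = {"HW": 0, "Host OS": 1, "Hypervisor": 2, "VM": 3}


def repair_by_priority(faults, try_repair):
    """Repair the highest-priority faulty entity first, then re-evaluate.

    `faults` is a list of dicts with "entity_id" and "kind".
    `try_repair(fault)` returns the set of entity ids whose faults were
    cleared by the repair (empty when the repair failed).
    Returns (final status, order in which entities were attempted).
    """
    remaining = sorted(faults, key=lambda f: PRIORITY[f["kind"]])
    attempted = []
    while remaining:
        current = remaining[0]
        attempted.append(current["entity_id"])
        cleared = try_repair(current)
        if current["entity_id"] not in cleared:
            return "unrepaired", attempted  # hand over to the next level
        # Repairing one entity may clear faults of associated entities too.
        remaining = [f for f in remaining if f["entity_id"] not in cleared]
    return "repaired", attempted
```

  • Because a successful repair of a lower layer often clears the faults of the entities built on it, the loop removes every cleared entity before picking the next candidate rather than attempting each entity once.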
  • During the repair, the VIM can set the fault status to "in process" to prevent repeated processing of identical "unprocessed" fault comprehensive information generated afterwards.
  • An NFVI entity that has been repaired successfully can notify the VIM of the success by reporting fault information in the format of Table 1 with the running state "normal".
  • The VIM can then set the fault status of the fault comprehensive information to "repaired" and report it to the Orchestrator through the Or-Vi interface. It should be understood that the repair success may also be reported by predefined signaling, which is not limited by the present invention.
  • In addition, the NFVI entity being repaired can be isolated, to prevent the faulty entity from interacting with neighboring entities and spreading the fault further.
  • When the VIM cannot repair the fault, or the repair fails, the VIM can set the fault status of the fault comprehensive information to "unrepaired" and report it to the Orchestrator through the Or-Vi interface.
  • When the Orchestrator receives the fault comprehensive information sent by the VIM, the Orchestrator checks whether it can repair the fault itself. Similar to the VIM's self-healing judgment, the Orchestrator queries its local fault repair strategy. If the fault can be processed and the repair succeeds, the fault status of the fault comprehensive information is set to "repaired" and reported to the OSS/BSS. If the Orchestrator cannot repair the fault, or can attempt the repair but the repair fails, the fault status of the NFVI fault comprehensive information is set to "unrepaired" and reported to the OSS/BSS. It should be understood that, because the Orchestrator is responsible for orchestrating and managing resources and implementing NFV services, it has high administrative privileges and processing capability and can repair most faults.
  • The OSS/BSS changes the fault status of the received fault comprehensive information to "in process".
  • The OSS/BSS performs fault repair according to the method in its fault repair strategy. After the fault is repaired, the OSS/BSS receives the fault recovery notification sent by the NFVI entity and then changes the fault status in its fault comprehensive information to "repaired".
  • By default, the fault repair strategy in the OSS/BSS contains processing methods for all fault types.
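  • The level-by-level hand-over (VIM, then Orchestrator, then OSS/BSS) can be sketched as a walk along an ordered list of managers. This is a simplified illustration: the function `escalate`, the `levels` structure, and the reduction of a record to a single `fault_type` are assumptions, and the intermediate "in process" status is omitted for brevity.

```python
def escalate(info, levels):
    """Try each management level in order until one repairs the fault.

    `levels` is an ordered list of (name, policy) pairs, where `policy`
    maps a fault type to a callable returning True on a successful repair.
    Returns the name of the level that repaired the fault, or None.
    """
    for name, policy in levels:
        repair = policy.get(info["fault_type"])
        if repair is not None and repair(info):
            info["status"] = "repaired"
            return name
        # No strategy at this level, or the repair failed: mark the record
        # and report it to the next level up.
        info["status"] = "unrepaired"
    return None
```

  • With the OSS/BSS strategy covering every fault type, as the text states is the default, the last level in the list acts as the catch-all.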
  • The fault management method provided by the embodiment of the present invention obtains the fault information of the hardware and/or software entity through the VIM and comprehensively processes the fault information that has an association relationship, so that fault reporting and processing in the NFV environment can be realized.
  • In addition, because the associated fault information is processed comprehensively, identical fault comprehensive information is deleted by repeated-alarm detection, and the faulty entity being processed is isolated, the efficiency and accuracy of fault handling are improved and the fault is effectively prevented from spreading.
  • FIG. 7 is an interaction diagram of a method of fault management according to another embodiment of the present invention.
  • the method shown in Figure 7 can be performed by the NFV system shown in Figure 1.
  • VNFM obtains fault information.
  • When the VNFM detects that any VNF entity in the VNF has failed, the VNFM obtains the fault information of the failed VNF entity. Specifically, the fault information may be generated by the failed VNF entity and reported to the VNFM, or may be generated locally by the VNFM according to the detected fault.
  • VNFM can detect the failure of a VNF entity in the following ways:
  • Method 1: the first VNF entity may be any VNF entity in the VNF.
  • the entity may include a hardware entity or a software entity or instance.
  • When the first VNF entity fails, it generates fault information. The fault information includes at least a faulty entity identifier that uniquely identifies the first VNF entity; from this identifier, the physical location of the faulty first VNF entity, or its position in the topology, can be uniquely determined.
  • The fault information also includes a fault identifier that uniquely identifies one piece of fault information.
  • The fault information also contains a fault type that indicates the cause of the fault, such as overload, power outage, memory leak, port error, or no fault.
  • the fault information may also include an operating state and a fault time.
  • the running state is used to mark whether the first VNF entity is currently operating normally, and the fault time may be used to record the time when the fault occurred.
  • the format of the fault information can be as shown in Table 3:
  • After the first VNF entity generates the fault information in the above format, it can send the fault information to the VNFM through the Ve-Vnfm interface. Optionally, the fault information can also be sent through the vEMS to the OSS/BSS for management, recording, and presentation.
  • Method 2: the VNFM may send an indication message to the first VNF entity, periodically or when needed, instructing the first VNF entity to perform fault detection. If the first VNF entity detects a fault, it may return to the VNFM fault information in the format of Table 3. If the first VNF entity has no fault, it may return no message, or may return fault information in the format of Table 3 with the fault type "no fault" and the running state "normal".
  • Method 3: the first VNF entity may periodically send a heartbeat indication message to the VNFM, indicating that the first VNF entity is operating normally.
  • The VNFM periodically receives the heartbeat of the first VNF entity and thereby knows that the first VNF entity is working properly. When the heartbeat of the first VNF entity is interrupted, the VNFM determines that the first VNF entity has failed.
  • The VNFM can then generate the fault information of the first VNF entity.
  • The specific format is similar to the fault information in Table 3 above and is not repeated here.
  • In this way, the VNFM can still detect the failure of the first VNF entity promptly.
  • Method 4: the VNFM can perform fault detection on the VNF periodically or when needed. The VNFM then generates the fault information of the first VNF entity according to the fault detection result.
  • The specific format is similar to the fault information in Table 3 above and is not repeated here.
  • The VNFM can detect the fault of a VNF entity by any one of the above methods or by a combination of several of them. For example, method 1 and method 3 can be combined: the VNF entity periodically sends a heartbeat to the VNFM and, when a fault occurs, sends fault information to the VNFM. If the fault prevents the VNF entity from reporting the fault information, the VNFM can still detect the failure of the VNF entity from the interrupted heartbeat.
  • VNFM generates fault comprehensive information
  • the VNFM may generate the fault comprehensive information according to the fault information of the first VNF.
  • the VNFM may collect fault information of other VNF entities associated with the first VNF entity to generate fault comprehensive information for comprehensive processing.
  • Fig. 6b exemplarily shows the relationship between VNF entities.
  • VNF1 and VNF2 are both based on VM1; that is, VNF1 and VNF2 are associated with each other.
  • When VNF1 fails, VNF2 may also have failed.
  • In this case, the VNFM can collect the fault information of VNF1 and VNF2 and combine it to generate fault comprehensive information.
  • fault comprehensive information as shown in Table 4 can be generated:
  • The fault information format of the VNF1 and VNF2 entities is similar to that in Table 3 above. It should be understood that the fault comprehensive information shown in Table 4 is only a specific example; which entities' fault information is included in the fault comprehensive information is determined by the association relationship. The fault status can be set to "unprocessed" when the fault comprehensive information is generated.
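  • Deriving the association between VNF entities from the VM they run on can be sketched as follows. The hosting map `HOSTED_ON` and the function `associated_vnfs` are hypothetical, and the entry for VNF3 on VM2 is added only to show a VNF without an associated peer; it does not come from the description.

```python
# Assumed hosting map: VNF entity -> VM it is based on.
HOSTED_ON = {"VNF1": "VM1", "VNF2": "VM1", "VNF3": "VM2"}


def associated_vnfs(vnf_id, hosted_on=HOSTED_ON):
    """Return the other VNF entities based on the same VM as `vnf_id`."""
    vm = hosted_on[vnf_id]
    return sorted(v for v, host in hosted_on.items() if host == vm and v != vnf_id)
```

  • When VNF1 fails, the result of `associated_vnfs("VNF1")` tells the manager whose fault information to look for before generating the fault comprehensive information.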
  • After the VNFM generates fault comprehensive information, it can run repeated-alarm detection locally to determine whether identical fault comprehensive information already exists. Specifically, after one VNF entity fails, every faulty VNF entity associated with it also reports fault information, so the VNFM is likely to generate multiple pieces of identical fault comprehensive information for the same fault. For example, if VNF1 fails, VNF2, which is associated with VNF1, also fails and reports in the same way as VNF1. After collecting the related fault information, the VNFM generates multiple pieces of identical fault comprehensive information. In this case, only one of them needs to be processed, and the other identical pieces can be discarded. It should be understood that identical fault comprehensive information here means that the VNF fault information is the same; the fault status may differ.
  • Which fault comprehensive information is retained and which is discarded may be decided by its fault status.
  • For example, newly generated fault comprehensive information has the fault status "unprocessed". During duplicate alarm detection, if identical fault comprehensive information with the fault status "processing" is found, the unprocessed fault comprehensive information is discarded.
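  • The duplicate alarm detection above can be sketched as follows. This is a minimal illustration: the record layout and the names fault_key, admit, and ACTIVE_STATES are hypothetical, and treating an identical "unprocessed" record as a duplicate is an assumption (the text only spells out the "processing" case).

```python
# States in which an existing identical record makes a new one redundant.
# "processing" comes from the text; "unprocessed" is an assumption.
ACTIVE_STATES = {"unprocessed", "processing"}


def fault_key(info):
    """Identity of a record: its per-entity fault data, ignoring fault status."""
    return frozenset((e["entity"], e["fault_type"]) for e in info["entries"])


def admit(new_info, store):
    """Keep new_info unless an identical record is already pending or in progress.

    Returns True if the record was stored, False if it was discarded.
    """
    key = fault_key(new_info)
    for existing in store:
        if fault_key(existing) == key and existing["fault_status"] in ACTIVE_STATES:
            return False
    store.append(new_info)
    return True
```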
  • the VNFM can first determine whether the fault type in the fault comprehensive information is a fault type that the VNFM can handle. Specifically, the VNFM has a fault repair policy, where the fault repair strategy includes a mapping relationship between the fault entity identifier, the fault type, and the fault repair method. It can be determined whether processing can be performed by judging whether the fault type in the fault comprehensive information exists in the fault repair strategy. For example, the fault type of VNF1 is "low performance" and the corresponding fault repair method is "add a VNF instance".
  • The fault repair methods may include: enable hardware devices, reload software (Host OS, Hypervisor, etc.), migrate VMs, reload the VNF installation software, re-instantiate the VNF, enhance VNF instances, migrate the VNF (that is, re-allocate resources to the VNF), and re-instantiate the VNF Forwarding Graph.
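  • The self-healing judgment can be illustrated with a simple lookup. The sketch below assumes the fault repair strategy is stored as a mapping keyed by fault entity identifier and fault type; the table contents and the function name are hypothetical.

```python
# Hypothetical policy table: (fault entity identifier, fault type) -> repair method.
REPAIR_POLICY = {
    ("VNF1", "low performance"): "add a VNF instance",
}


def find_repair_method(entity_id, fault_type, policy=REPAIR_POLICY):
    """Return the repair method, or None if this manager cannot handle the fault."""
    return policy.get((entity_id, fault_type))
```

  • A result of None corresponds to the case in which the manager is not capable of self-healing and must report the fault upward.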
  • If the VNFM determines that the fault can be processed, the VNF entity is repaired according to the fault repair method. If the repair succeeds and the faults of the associated VNF entities are also fixed, the VNFM notifies the Orchestrator that the repair succeeded and terminates the fault repair process.
  • If the fault comprehensive information includes multiple VNF entities, and the preferentially processed VNF entity is repaired successfully but the faults of other associated VNF entities still exist, step 704 is repeated to judge and repair the remaining faulty VNF entities. Once the faults of all VNF entities in the fault comprehensive information are repaired, the Orchestrator is notified that the repair succeeded, and the fault repair process is terminated.
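  • The repeated judge-and-repair loop can be sketched as below. The callbacks can_handle and repair are hypothetical stand-ins for the policy lookup and the actual repair action; the source does not prescribe this structure.

```python
def repair_all(faulty_entities, can_handle, repair):
    """Judge and repair each faulty entity in turn.

    Returns "fixed" only when every entity was repaired; otherwise
    returns "unrepaired" so the caller can report the fault upward.
    """
    for entity in faulty_entities:
        if not can_handle(entity):
            return "unrepaired"
        if not repair(entity):
            return "unrepaired"
    return "fixed"
```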
  • During the repair, the VNFM can set the fault status to "processing" to prevent repeated processing of subsequently generated identical "unprocessed" fault comprehensive information.
  • A successfully repaired VNF entity can notify the VNFM of the success by reporting fault information similar to Table 3 with a running status of "normal".
  • The VNFM can then set the fault status of the fault comprehensive information to "fixed" and report it to the Orchestrator through the Or-Vnfm interface. It should be understood that the repair success can also be reported through predefined signaling, which is not limited by the present invention.
  • The VNF entity being repaired can be isolated to prevent the faulty entity from interacting with neighboring entities and spreading the fault further.
  • VNFM is not capable of self-healing
  • The VNFM can set the fault status of the fault comprehensive information to "unrepaired" and report it to the Orchestrator through the Or-Vnfm interface.
  • Orchestrator self-healing judgment: when the Orchestrator receives the fault comprehensive information sent by the VNFM, it checks whether it can perform self-healing. Similar to the VNFM's self-healing judgment, the Orchestrator queries its local fault repair strategy. If the fault can be processed and the repair succeeds, the fault status in the fault comprehensive information is set to "fixed" and reported to the OSS/BSS. If the Orchestrator cannot perform the repair, or performs it but the repair fails, the fault status of the VNF fault comprehensive information is set to "unrepaired" and reported to the OSS/BSS.
  • Because the Orchestrator is responsible for orchestrating and managing resources and implementing NFV services, it has high management rights and processing capability and can repair most faults. Only the very small number of faults that cannot be processed, or whose repair fails, are reported to the OSS/BSS.
  • The OSS/BSS changes the fault status of the received fault comprehensive information to "processing". The OSS/BSS then performs fault recovery according to the method in the fault repair strategy. After the fault is recovered, the OSS/BSS receives the fault recovery notification sent by the VNF entity and then changes the fault status in its fault comprehensive information to "fixed".
  • The fault repair strategy in the OSS/BSS contains processing methods for all fault types by default.
  • The fault repair methods may include: enable hardware devices, reload software (Host OS, Hypervisor, etc.), migrate VMs, reload the VNF installation software, re-instantiate the VNF, enhance VNF instances, migrate the VNF (that is, re-allocate resources to the VNF), and re-instantiate the VNF Forwarding Graph.
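  • The escalation from the VNFM to the Orchestrator and then to the OSS/BSS can be pictured as a chain walk. The sketch below is an assumption-laden simplification: the per-manager fault type sets are invented for the example, a policy of None stands for "handles every fault type" (as stated for the OSS/BSS), and repair failures are not modeled.

```python
# Hypothetical chain for the VNF fault flow, lowest manager first.
CHAIN = [
    ("VNFM", {"low performance"}),
    ("Orchestrator", {"low performance", "overload"}),
    ("OSS/BSS", None),  # None: handles all fault types by default
]


def escalate(fault_type, chain=CHAIN):
    """Return the first manager in the chain whose policy covers fault_type."""
    for name, handled in chain:
        if handled is None or fault_type in handled:
            return name
    return None
```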
  • the fault management method provided by the embodiment of the present invention acquires the fault information of the hardware and/or the software entity through the VIM, and comprehensively processes the fault information with the associated relationship, so that the fault reporting and processing in the NFV environment can be realized.
  • Because the associated fault information is comprehensively processed, identical fault comprehensive information is deleted through duplicate alarm detection, and the faulty entity being processed is isolated, the efficiency and accuracy of fault processing are improved and the spread of faults is effectively prevented.
  • FIG. 8 is an interaction diagram of a method of fault management according to another embodiment of the present invention.
  • the method shown in Figure 8 can be performed by the NFV system shown in Figure 1.
  • VIM obtains fault information.
  • When the VIM detects that any HW, Host OS, Hypervisor, or VM entity in the NFVI has failed, the VIM obtains the fault information of the failed NFVI entity. Specifically, the fault information may be generated by the faulty NFVI entity and reported to the VIM, or may be generated locally by the VIM based on the detected fault. The method by which the VIM detects the failure of an NFVI entity is similar to the method described in step 601 of FIG. 6, and details are not described herein again.
  • VIM generates fault comprehensive information
  • After the VIM receives the fault information sent by the first NFVI entity, or generates the fault information according to the fault of the first NFVI entity, the VIM needs to collect the fault information of other NFVI entities associated with the first NFVI entity to generate fault comprehensive information for comprehensive processing. Specifically, this is similar to the method described in step 602 of FIG. 6 above, and details are not described herein again.
  • After the VIM generates fault comprehensive information, it can check the generated fault comprehensive information locally to determine whether identical information already exists.
  • The specific detection method is similar to the method described in step 603 of FIG. 6 above, and details are not described herein again.
  • When fault comprehensive information has been generated, the VIM can first determine whether the fault type in it is a fault type that the VIM can handle.
  • the specific judgment method is similar to the method described in the above step 604 of FIG. 6, and details are not described herein again.
  • 805a: VIM is capable of self-healing
  • If the VIM determines that the fault can be processed, the NFVI entity is repaired according to the fault repair method. If the repair succeeds and the faults of the associated NFVI entities are also fixed, the VIM notifies the Orchestrator that the repair succeeded and terminates the fault repair process.
  • If the preferentially processed NFVI entity is repaired successfully but the faults of other associated NFVI entities still exist, step 804 is repeated: among the remaining faulty NFVI entities, the one with the highest priority is judged and repaired first. Once the faults of all NFVI entities in the fault comprehensive information are repaired, the Orchestrator is notified that the repair succeeded, and the fault repair process is terminated.
  • the specific method is similar to the method described in step 605a of FIG. 6 above, and details are not described herein again.
  • The NFVI entity being repaired can be isolated to prevent the faulty entity from interacting with neighboring entities and spreading the fault further.
  • The VIM can set the fault status of the fault comprehensive information to "unrepaired" and report it to the VNFM through the Vi-Vnfm interface.
  • When the VNFM receives the fault comprehensive information sent by the VIM, the VNFM checks whether it can perform self-healing. Similar to the VIM's self-healing judgment, the VNFM queries its local fault repair strategy. If the fault can be processed and the repair succeeds, the fault status in the fault comprehensive information is set to "fixed" and reported to the Orchestrator. If the VNFM cannot perform the repair, or performs it but the repair fails, the fault status of the NFVI fault comprehensive information is set to "unrepaired" and reported to the Orchestrator.
  • When the Orchestrator receives the NFVI fault comprehensive information sent by the VNFM, it checks whether it can perform self-healing. Similar to the VIM's self-healing judgment, the Orchestrator queries its local fault repair strategy. If the fault can be processed and the repair succeeds, the fault status in the fault comprehensive information is set to "fixed" and reported to the OSS/BSS. If the Orchestrator cannot perform the repair, or performs it but the repair fails, the fault status of the NFVI fault comprehensive information is set to "unrepaired" and reported to the OSS/BSS. It should be understood that because the Orchestrator is responsible for orchestrating and managing resources and implementing NFV services, it has high management rights and processing capability and can repair most faults. Only the very small number of faults that cannot be processed, or whose repair fails, are reported to the OSS/BSS.
  • The OSS/BSS changes the fault status of the received fault comprehensive information to "processing". The OSS/BSS then performs fault recovery according to the method in the fault repair strategy. After the fault is recovered, the OSS/BSS receives the fault recovery notification sent by the NFVI entity and then changes the fault status in its fault comprehensive information to "fixed".
  • The fault repair strategy in the OSS/BSS includes processing methods for all fault types by default.
  • The fault repair methods may include: enable hardware devices, reload software (Host OS, Hypervisor, etc.), migrate VMs, reload the VNF installation software, re-instantiate the VNF, enhance VNF instances, migrate the VNF (that is, re-allocate resources to the VNF), and re-instantiate the VNF Forwarding Graph.
  • the fault management method provided by the embodiment of the present invention obtains the fault information of the hardware and/or the software entity through the VIM, and comprehensively processes the fault information with the associated relationship, so that the fault reporting and processing in the NFV environment can be realized.
  • Because the associated fault information is comprehensively processed, identical fault comprehensive information is deleted through duplicate alarm detection, and the faulty entity being processed is isolated, the efficiency and accuracy of fault handling are improved and the spread of faults is effectively prevented.
  • FIG. 9 is an interaction diagram of a method of fault management according to another embodiment of the present invention.
  • the method shown in Figure 9 can be performed by the NFV system shown in Figure 1.
  • VIM obtains fault information.
  • When the VIM detects that any HW, Host OS, Hypervisor, or VM entity in the NFVI has failed, the VIM obtains the fault information of the failed NFVI entity. Specifically, the fault information may be generated by the faulty NFVI entity and reported to the VIM, or may be generated locally by the VIM according to the detected fault. The method by which the VIM detects the failure of an NFVI entity is similar to the method described in step 601 of FIG. 6, and details are not described herein again.
  • VIM generates fault comprehensive information
  • After the VIM receives the fault information sent by the first NFVI entity, or generates the fault information according to the fault of the first NFVI entity, the VIM needs to collect the fault information of other NFVI entities associated with the first NFVI entity to generate fault comprehensive information for comprehensive processing. Specifically, this is similar to the method described in step 602 of FIG. 6 above, and details are not described herein again.
  • After the VIM generates fault comprehensive information, it can check the generated fault comprehensive information locally to determine whether identical information already exists.
  • The specific detection method is similar to the method described in step 603 of FIG. 6 above, and details are not described herein again.
  • When fault comprehensive information has been generated, the VIM can first determine whether the fault type in it is a fault type that the VIM can handle.
  • the specific judgment method is similar to the method described in the above step 604 of FIG. 6, and details are not described herein again.
  • 905a: VIM is capable of self-healing
  • If the VIM determines that the fault can be processed, the NFVI entity is repaired according to the fault repair method. If the repair succeeds and the faults of the associated NFVI entities are also fixed, the VIM notifies the Orchestrator that the repair succeeded and terminates the fault repair process.
  • If the preferentially processed NFVI entity is repaired successfully but the faults of other associated NFVI entities still exist, step 904 is repeated: among the remaining faulty NFVI entities, the one with the highest priority is judged and repaired first. Once the faults of all NFVI entities in the fault comprehensive information are repaired, the Orchestrator is notified that the repair succeeded, and the fault repair process is terminated.
  • The NFVI entity being repaired can be isolated to prevent the faulty entity from interacting with neighboring entities and spreading the fault further.
  • The VIM can set the fault status of the fault comprehensive information to "unrepaired" and report it to the VNFM through the Vi-Vnfm interface.
  • When the VNFM receives the fault comprehensive information sent by the VIM, the VNFM checks whether it can perform self-healing. Similar to the VIM's self-healing judgment, the VNFM queries its local fault repair strategy. If the fault can be processed and the repair succeeds, the fault status in the fault comprehensive information is set to "fixed" and reported to the Orchestrator. If the VNFM cannot perform the repair, or performs it but the repair fails, the fault status of the NFVI fault comprehensive information is set to "unrepaired" and the fault comprehensive information is returned to the VIM.
  • The VIM then reports the NFVI fault comprehensive information to the Orchestrator through the Or-Vi interface.
  • The Orchestrator checks whether it can perform self-healing. Similar to the VIM's self-healing judgment, the Orchestrator queries its local fault repair strategy. If the fault can be processed and the repair succeeds, the fault status in the fault comprehensive information is set to "fixed" and reported to the OSS/BSS. If the Orchestrator cannot perform the repair, or performs it but the repair fails, the fault status of the NFVI fault comprehensive information is set to "unrepaired" and reported to the OSS/BSS. It should be understood that because the Orchestrator is responsible for orchestrating and managing resources and implementing NFV services, it has high management rights and processing capability and can repair most faults. Only the very small number of faults that cannot be processed, or whose repair fails, are reported to the OSS/BSS.
  • The OSS/BSS changes the fault status of the received fault comprehensive information to "processing".
  • The OSS/BSS performs fault recovery according to the method in the fault repair strategy. After the fault is recovered, the OSS/BSS receives the fault recovery notification sent by the NFVI entity and then changes the fault status in its fault comprehensive information to "fixed".
  • The fault repair strategy in the OSS/BSS includes processing methods for all fault types by default: enable hardware devices, reload software (Host OS, Hypervisor, etc.), migrate VMs, reload the VNF installation software, re-instantiate the VNF, enhance the VNF instance, migrate the VNF (that is, re-allocate resources to the VNF), and re-instantiate the VNF Forwarding Graph.
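  • The reporting path of Figure 9 differs from Figure 8 in that an unrepaired fault is returned from the VNFM to the VIM, which then reports to the Orchestrator over Or-Vi. The sketch below traces that path under simplifying assumptions: can_fix is a hypothetical callback that says whether a given manager repairs the fault, the interface used for the return hop is assumed to be Vi-Vnfm, and the Orchestrator to OSS/BSS interface is left as None because this passage does not name it.

```python
def route_fig9(can_fix):
    """Return the (sender, receiver, interface) hops until some manager repairs the fault."""
    hops = []
    if can_fix("VIM"):
        return hops
    hops.append(("VIM", "VNFM", "Vi-Vnfm"))
    if can_fix("VNFM"):
        return hops
    # The VNFM hands the record back to the VIM (interface assumed).
    hops.append(("VNFM", "VIM", "Vi-Vnfm"))
    hops.append(("VIM", "Orchestrator", "Or-Vi"))
    if can_fix("Orchestrator"):
        return hops
    # Interface not named in this passage.
    hops.append(("Orchestrator", "OSS/BSS", None))
    return hops
```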
  • Figures 6, 8, and 9 show the procedures by which the VIM repairs and manages faults of NFVI entities.
  • Figure 7 shows the procedure by which the VNFM repairs and manages faults of VNF entities.
  • The repair and management of NFVI entities by the VIM and the repair and management of VNF entities by the VNFM may be two relatively independent processes, or two processes performed simultaneously. The present invention does not limit this.
  • the fault management method provided by the embodiment of the present invention acquires the fault information of the hardware and/or the software entity through the VIM, and comprehensively processes the fault information with the associated relationship, so that the fault reporting and processing in the NFV environment can be realized.
  • Because the associated fault information is comprehensively processed, identical fault comprehensive information is deleted through duplicate alarm detection, and the faulty entity being processed is isolated, the efficiency and accuracy of fault processing are improved and the spread of faults is effectively prevented.
  • Figure 10 is an interaction diagram of a method of fault management in accordance with another embodiment of the present invention.
  • the method shown in Fig. 10 can be performed by the NFV system shown in Fig. 1.
  • VIM obtains fault information.
  • When the VIM detects that any HW, Host OS, Hypervisor, or VM entity in the NFVI has failed, the VIM obtains the fault information of the failed NFVI entity. Specifically, the fault information may be generated by the faulty NFVI entity and reported to the VIM, or may be generated locally by the VIM according to the detected fault.
  • The VIM can detect the failure of an NFVI entity in several ways:
  • the first NFVI entity may be any HW, Host OS, Hypervisor, and VM entity in the NFVI.
  • the entity may include a hardware entity or a software entity.
  • When the first NFVI entity fails, it generates fault information. The fault information includes at least a fault entity identifier that uniquely identifies the first NFVI entity, by which the actual location of the failed first NFVI entity, or its location in the topology relationship, can be uniquely determined.
  • The fault information also includes a fault identifier that uniquely identifies a piece of fault information.
  • The fault information also includes a fault type, which indicates the cause of the fault, such as power failure, overload, or no fault.
  • the fault information may also include an operating state and a fault time. The operating state is used to mark whether the first NFVI entity is currently operating normally, and the fault time may be used to record the time when the fault occurred.
  • the format of the fault information can be as shown in Table 1 above.
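  • For illustration, the fields listed above could be carried in a record such as the following. Table 1 itself is not reproduced in this passage, so the field names, the "abnormal" default, and the use of a random UUID as the fault identifier are assumptions of this sketch.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class FaultInfo:
    fault_entity_id: str   # uniquely identifies the faulty entity
    fault_type: str        # e.g. "power failure", "overload", "no fault"
    running_state: str = "abnormal"  # "normal" when the entity runs normally
    # Unique per fault report; a random UUID is an assumption of this sketch.
    fault_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    # Time at which the fault was recorded.
    fault_time: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
```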
  • After the first NFVI entity generates fault information in the above format, it can send the fault information to the VIM through the Nf-Vi interface. Optionally, the fault information can also be sent through the EMS to the OSS/BSS for management, recording, and presentation.
  • The VIM may send an indication message to the first NFVI entity, periodically or when needed, instructing the first NFVI entity to perform fault detection. If the first NFVI entity detects a fault, it returns fault information similar to Table 1 above to the VIM. If the first NFVI entity has no fault, it may return no message, or it may return fault information as shown in Table 1 with the fault type "no fault" and the running state "normal".
  • The first NFVI entity may periodically send a heartbeat indication message to the VIM to indicate that it is operating normally.
  • The VIM periodically receives the heartbeat of the first NFVI entity and thereby knows that the entity is working properly. When the heartbeat of the first NFVI entity is interrupted, the VIM determines that the first NFVI entity has failed.
  • In this case, the VIM can generate the fault information of the first NFVI entity.
  • The specific format is similar to the fault information in Table 1 above and is not described here again.
  • In this way, the VIM can still detect the failure of the first NFVI entity promptly.
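  • Heartbeat-based detection can be sketched as follows. The class name, the timeout parameter, and the use of plain numeric timestamps are assumptions for illustration; the source does not specify how a heartbeat interruption is measured.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat per entity and flags those that went silent."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence treated as a failure
        self.last_seen = {}      # entity id -> time of last heartbeat

    def beat(self, entity_id, now):
        """Record a heartbeat received from entity_id at time now."""
        self.last_seen[entity_id] = now

    def failed(self, now):
        """Return the sorted ids whose heartbeat has been silent longer than timeout."""
        return sorted(e for e, t in self.last_seen.items() if now - t > self.timeout)
```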
  • the VIM can detect the fault of the NFVI periodically or when necessary.
  • the VIM generates the fault information of the first NFVI according to the fault detection result.
  • the specific format is similar to the fault information in Table 1 above, and is not mentioned here.
  • The VIM can detect NFVI entity faults by any one of the above methods, or by a combination of several of them.
  • For example, method 1 and method 3 can be combined: the NFVI entity periodically sends a heartbeat to the VIM, and when a fault occurs it sends fault information to the VIM. If the fault prevents the NFVI entity from reporting the fault information, the VIM can still detect the failure of the NFVI entity through the interruption of the heartbeat.
  • VNFM obtains fault information.
  • When the VNFM detects that any VNF entity has failed, the VNFM acquires the fault information of the failed VNF entity. Specifically, the fault information may be generated by the faulty VNF entity and reported to the VNFM, or may be generated locally by the VNFM based on the detected fault.
  • VNFM can detect the failure of a VNF entity in the following ways:
  • the VNF entity can be any VNF entity in the VNF.
  • the entity may include a hardware entity or a software entity or instance.
  • When the first VNF entity fails, it generates fault information. The fault information includes at least a fault entity identifier that uniquely identifies the first VNF entity, by which the actual location of the faulty first VNF entity, or its location in the topology relationship, can be uniquely determined.
  • the fault information also contains a fault type that indicates the cause or result of the fault.
  • the fault information may also include an operating state and a fault time. The operating state is used to mark whether the first VNF entity is currently operating normally, and the fault time may be used to record the time when the fault occurred.
  • the format of the fault information can be as shown in Table 3 above.
  • After the first VNF entity generates fault information in the above format, it can send the fault information to the VNFM through the Ve-Vnfm interface. Optionally, the fault information can also be sent through the vEMS to the OSS/BSS for management, recording, and presentation.
  • The VNFM may send an indication message to the first VNF entity, periodically or when needed, instructing the first VNF entity to perform fault detection. If the first VNF entity detects a fault, it returns fault information similar to Table 3 above to the VNFM. If the first VNF entity has no fault, it may return no message, or it may return fault information as shown in Table 3 with the fault type "no fault" and the running status "normal".
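  • The polling method can be illustrated by normalizing the two "healthy" replies described above (no message, or a "no fault" report) into a single result. The function name and the check callback are hypothetical.

```python
def poll(entity_id, check):
    """Ask an entity to run fault detection and normalise its reply.

    check(entity_id) returns None (no message) or a fault info dict.
    Returns the fault info only when a real fault was reported.
    """
    reply = check(entity_id)
    if reply is None or reply.get("fault_type") == "no fault":
        return None
    return reply
```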
  • The first VNF entity may periodically send a heartbeat indication message to the VNFM to indicate that it is operating normally.
  • The VNFM periodically receives the heartbeat of the first VNF entity and thereby knows that the entity is working normally. When the heartbeat of the first VNF entity is interrupted, the VNFM determines that the first VNF entity has failed.
  • In this case, the VNFM can generate the fault information of the first VNF entity.
  • The specific format is similar to the fault information in Table 3 above and is not described here again.
  • In this way, the VNFM can still detect the failure of the first VNF entity promptly.
  • the VNFM can perform fault detection on the VNF periodically or when needed.
  • the VNFM generates the fault information of the first VNF according to the fault detection result.
  • the specific format is similar to the fault information in Table 3 above, and is not described here.
  • The VNFM can detect VNF entity faults by any one of the above methods, or by a combination of several of them.
  • For example, method 1 and method 3 can be combined: the VNF entity periodically sends a heartbeat to the VNFM, and when a fault occurs it sends fault information to the VNFM. If the fault prevents the VNF entity from reporting the fault information, the VNFM can still detect the failure of the VNF entity through the interruption of the heartbeat.
  • It should be understood that steps 1001a and 1001b may be two relatively independent processes or two related processes. In this embodiment of the present invention they can be understood as two processes that occur substantially simultaneously; that is, this embodiment describes in detail fault management and repair when associated faults occur in the NFVI and the VNF.
  • VIM generates fault comprehensive information
  • After the VIM receives the fault information sent by the first NFVI entity, or generates fault information according to the fault of the first NFVI entity, that is, after step 1001a, the VIM needs to collect the fault information of other NFVI entities associated with the first NFVI entity to generate fault comprehensive information for comprehensive processing.
  • Figure 6b exemplarily shows the association relationship between HW, Host OS, Hypervisor, and VM entities.
  • The entities based on HW1 include Host OS1, Hypervisor1, VM1, and VM2. That is, when HW1 fails, the virtualized entities Host OS1, Hypervisor1, VM1, and VM2 established on it will also fail.
  • The VIM can collect the fault information reported by Host OS1, Hypervisor1, VM1, and VM2, and combine it with the fault information of HW1 to generate fault comprehensive information.
  • Specifically, the fault comprehensive information shown in Table 2 above may be generated, where the fault information format of the HW, Host OS, Hypervisor, and VM entities is similar to Table 1 above. It should be understood that the fault comprehensive information shown in Table 2 is a specific example; which entities' fault information it includes is determined by the association relationship. The fault status can be set to "unprocessed" when the fault comprehensive information is generated.
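  • Finding the entities associated with a failed hardware entity amounts to walking a dependency topology. The sketch below assumes a particular layering (Host OS on HW, Hypervisor on Host OS, VMs on Hypervisor); the text only states that these entities are established on HW1, so the exact tree and the names used here are hypothetical.

```python
# Hypothetical topology: each entity maps to the entities built directly on it.
HOSTED_ON = {
    "HW1": ["Host OS1"],
    "Host OS1": ["Hypervisor1"],
    "Hypervisor1": ["VM1", "VM2"],
}


def associated_entities(root, topology=HOSTED_ON):
    """Return every entity that depends, directly or indirectly, on root."""
    found = []
    queue = list(topology.get(root, []))
    while queue:
        entity = queue.pop(0)
        found.append(entity)
        queue.extend(topology.get(entity, []))
    return found
```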
  • VNFM generates fault comprehensive information
  • After the VNFM receives the fault information sent by the first VNF entity, or generates fault information according to the fault of the first VNF entity, that is, after step 1001b, the VNFM may generate fault comprehensive information according to the fault information of the first VNF.
  • the VNFM may collect fault information of other VNF entities associated with the first VNF entity to generate fault comprehensive information for facilitating the integrated processing.
  • Fig. 7b exemplarily shows the relationship between VNF entities.
  • VNF1 and VNF2 are both based on VM1; that is, VNF1 and VNF2 are associated with each other.
  • When VNF1 fails, VNF2 may also fail.
  • the VNFM can collect the fault information reported by the VNF1 and combine the fault information of the VNF2 to generate fault comprehensive information. Specifically, fault comprehensive information as shown in Table 4 above can be generated.
  • The fault information format of the VNF1 and VNF2 entities is similar to Table 3 above. It should be understood that the fault comprehensive information shown in Table 4 is a specific example; which entities' fault information it includes is determined by the association relationship. The fault status can be set to "unprocessed" when the fault comprehensive information is generated.
  • steps 1002a and 1002b may be two relatively independent processes, or two related processes, which may be understood as two processes that occur substantially simultaneously in the embodiment of the present invention.
  • After the VIM generates fault comprehensive information, it can check the generated fault comprehensive information locally to determine whether identical information already exists. Specifically, because every faulty NFVI entity associated with a failed NFVI entity reports fault information, the VIM is likely to generate multiple identical pieces of fault comprehensive information for the same fault. For example, if HW1 fails, Host OS1, Hypervisor1, VM1, and VM2, which are associated with HW1, also fail and perform the same operations as HW1. After collecting the associated fault information, the VIM generates multiple identical pieces of fault comprehensive information. Only one of them needs to be processed, and the other identical ones can be discarded. It should be understood that identical fault comprehensive information here means that the HW, Host OS, Hypervisor, and VM fault information is the same; the fault status may differ.
  • Which fault comprehensive information is retained and which is discarded may be decided by its fault status.
  • For example, newly generated fault comprehensive information has the fault status "unprocessed". During duplicate alarm detection, if identical fault comprehensive information with the fault status "processing" is found, the unprocessed fault comprehensive information is discarded. The fault comprehensive information whose fault status is "processing" is retained, and the handling of its fault continues.
  • After the VNFM generates fault comprehensive information, it can check the generated fault comprehensive information locally to determine whether identical information already exists. Specifically, because every faulty VNF entity reports fault information after it fails, the VNFM is likely to generate multiple identical pieces of fault comprehensive information for the same fault. For example, if VNF1 fails, VNF2 associated with VNF1 also fails and performs the same operation as VNF1, so the VNFM generates multiple identical pieces of fault comprehensive information after collecting the associated fault information. In this case, only one of them needs to be processed, and the other identical ones are discarded. It should be understood that identical fault comprehensive information here means that the VNF fault information is the same; the fault status may differ.
  • Whether fault comprehensive information is retained or discarded may be decided by its fault state.
  • For example, fault comprehensive information that has just been generated has the fault state "unprocessed". Duplicate-alarm detection is performed on it; if identical fault comprehensive information with the fault state "in process" is found, the unprocessed fault comprehensive information is discarded.
  • The fault comprehensive information whose fault state is "in process" is retained, and handling of its fault continues.
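  • The duplicate-alarm detection described above can be illustrated with a minimal sketch. The class names `FaultInfo`, `ComprehensiveFault`, and `FaultStore` are hypothetical and do not appear in the embodiment. As a simplifying assumption, a new record is discarded whenever an identical record is already stored, whatever its fault state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaultInfo:
    entity_id: str   # e.g. "HW1", "VM1"
    fault_type: str  # e.g. "low performance"


@dataclass
class ComprehensiveFault:
    faults: tuple               # tuple of FaultInfo
    state: str = "unprocessed"  # e.g. "unprocessed", "in process", "processed"

    def key(self):
        # Identity deliberately ignores the fault state and the order of entries.
        return frozenset(self.faults)


class FaultStore:
    """Keeps at most one record per set of fault information (hypothetical helper)."""

    def __init__(self):
        self._records = {}

    def submit(self, record):
        """Return True if the record is kept, False if discarded as a duplicate."""
        if record.key() in self._records:
            return False
        self._records[record.key()] = record
        return True
```

  • Because the key ignores the fault state, a newly generated "unprocessed" record is discarded when an identical record is already being handled.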
  • VIM self-healing judgment: when the VIM has generated fault comprehensive information, the VIM can first determine whether the fault type in the fault comprehensive information is a fault type that the VIM can handle.
  • the VIM has a fault repairing strategy, where the fault repairing strategy includes a mapping relationship between the fault entity identifier, the fault type, and the fault repairing method. It can be determined whether processing can be performed by judging whether the fault type in the fault comprehensive information exists in the fault repair strategy. For example, the fault type of HW1 is "low performance" and the corresponding fault repair method is "restart".
  • the VIM can determine the fault type in the fault information of the NFVI entity for self-healing based on the priority of the NFVI entity.
  • The priority is: HW is higher than Host OS, Host OS is higher than Hypervisor, and Hypervisor is higher than VM.
  • As shown in Table 2, when the fault comprehensive information includes fault information of HW1, Host OS1, Hypervisor1, VM1, and VM2, the VIM can handle the fault of HW1 first. That is, according to the fault type in the fault information of HW1, for example "low performance", it determines the fault repair method "restart".
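  • The priority-based lookup in the fault repair strategy can be sketched as follows. This is an illustration only: the dictionary layout and the function name `pick_repair` are assumptions, and only the HW1 / "low performance" / "restart" entry is taken from the example above.

```python
# Lower number means higher priority: HW > Host OS > Hypervisor > VM.
PRIORITY = {"HW": 0, "Host OS": 1, "Hypervisor": 2, "VM": 3}

# (fault entity identifier, fault type) -> fault repair method.
REPAIR_POLICY = {("HW1", "low performance"): "restart"}


def pick_repair(faults, policy=REPAIR_POLICY):
    """Pick the highest-priority faulty entity and look up its repair method.

    faults: list of dicts with keys "entity_id", "entity_type", "fault_type".
    Returns (entity_id, method), or None when no policy entry matches.
    """
    if not faults:
        return None
    top = min(faults, key=lambda f: PRIORITY[f["entity_type"]])
    method = policy.get((top["entity_id"], top["fault_type"]))
    if method is None:
        return None  # the VIM cannot self-heal this fault type
    return top["entity_id"], method
```

  • A `None` result corresponds to the case in which the fault type is not present in the fault repair strategy, so the fault comprehensive information has to be reported upward.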
  • VNFM self-healing judgment
  • the VNFM can first determine whether the fault type in the fault comprehensive information is a fault type that the VNFM can handle.
  • the VNFM has a fault repair policy, which includes a mapping relationship between the fault entity identifier, the fault type, and the fault repair method. It can be determined whether processing can be performed by determining whether the fault type in the fault comprehensive information exists in the fault repair strategy. For example, the fault type of VNF1 is "low performance" and the corresponding fault repair method is "add a VNF instance”.
  • Fault repair methods may include, for example: enabling hardware devices, reloading software (Host OS, Hypervisor, etc.), migrating VMs, reloading VNF installation software, re-instantiating a VNF, enhancing VNF instances, migrating a VNF (re-allocating resources to the VNF), and re-instantiating a VNF Forwarding Graph.
  • 1005a: The VIM is capable of self-healing
  • The NFVI entity is repaired according to the fault repair method. If the repair succeeds and the faults of the associated NFVI entities are also fixed, the VIM notifies the Orchestrator that the repair is successful and terminates the fault repair process.
  • If the fault of the preferentially processed NFVI entity is repaired but faults of other associated NFVI entities still exist, step 1004a is repeated: among the remaining NFVI entities that still have faults, the one with the highest priority is judged and repaired, until the faults of all NFVI entities in the fault comprehensive information are repaired. The VIM then notifies the Orchestrator that the repair is successful and terminates the fault repair process.
  • the VIM can set the repair status to "processed” to prevent repeated processing of subsequently generated identical "unprocessed” fault comprehensive information.
  • An NFVI entity that has been repaired successfully can notify the VIM of the success by reporting fault information similar to Table 1 with the running status "normal".
  • The VIM can set the fault status of the fault comprehensive information to "fixed" and report it to the Orchestrator through the Or-Vi interface. It should be understood that the repair success can also be reported through predefined signaling, which is not limited by the present invention.
  • The NFVI entity being repaired can be isolated to prevent the faulty entity from interacting with neighboring entities and spreading the fault further.
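  • The repeated judge-and-repair loop of step 1005a can be sketched as follows. The function `self_heal`, its parameters, and the return values "fixed" and "escalate" are hypothetical; the `repair` callback stands in for the actual repair action and is assumed to return the set of entities that report a "normal" running status afterwards.

```python
def self_heal(faults, policy, repair, priority):
    """Repair faulty entities in priority order until none remain.

    faults:   dict entity_id -> (entity_type, fault_type)
    policy:   dict (entity_id, fault_type) -> repair method
    repair:   callable(entity_id, method) -> set of entity ids now healthy
    priority: dict entity_type -> rank (lower rank is handled first)
    Returns ("fixed", []) or ("escalate", remaining entity ids).
    """
    remaining = dict(faults)
    while remaining:
        # Judge the highest-priority entity that still has a fault.
        top = min(remaining, key=lambda e: priority[remaining[e][0]])
        method = policy.get((top, remaining[top][1]))
        if method is None:
            return "escalate", sorted(remaining)
        healed = repair(top, method)
        if top not in healed:
            return "escalate", sorted(remaining)  # the repair failed
        for entity_id in healed:
            remaining.pop(entity_id, None)
    return "fixed", []
```

  • In this sketch, repairing a high-priority entity such as HW1 may clear the faults of its associated entities as well, so the loop often ends after the first iteration.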
  • 1005b: The VNFM is capable of self-healing
  • The VNF entity is repaired according to the fault repair method. If the repair succeeds and the faults of the associated VNF entities are also fixed, the VNFM notifies the Orchestrator that the repair is successful and terminates the fault repair process. If the fault comprehensive information includes multiple VNF entities and the fault of the preferentially processed VNF entity is repaired but faults of other associated VNF entities still exist, step 1004b is repeated to judge and repair the remaining VNF entities that still have faults, until the faults of all VNF entities in the fault comprehensive information are repaired. The VNFM then notifies the Orchestrator that the repair is successful and terminates the fault repair process.
  • the VNFM can set the repair status to "processed” to prevent repeated processing of subsequently generated identical "unprocessed” fault comprehensive information.
  • A VNF entity that has been repaired successfully can notify the VNFM of the success by reporting fault information similar to Table 3 with the running status "normal".
  • The VNFM can set the fault status of the fault comprehensive information to "fixed" and report it to the Orchestrator through the Or-Vnfm interface. It should be understood that the repair success can also be reported through predefined signaling, which is not limited by the present invention.
  • The VNF entity being repaired can be isolated to prevent the faulty entity from interacting with neighboring entities and spreading the fault further.
  • When the fault repair strategy in the VIM does not include the fault type of the NFVI entity to be repaired, the VIM requests from the VNFM the fault information of the VNF entity associated with the first NFVI entity.
  • The VIM receives the fault information of the VNF entity associated with the first NFVI entity sent by the VNFM, adds the received fault information to the original NFVI fault comprehensive information, and then reports the combined fault comprehensive information to the Orchestrator through the Or-Vi interface.
  • For example, the NFVI entities associated with HW1 are Host OS1, Hypervisor1, VM1, and VM2, which are in turn associated with VNFs, so VNF1 and VNF2 are also associated with HW1. If VNF1 also has a fault, the VNFM sends the fault information of VNF1 to the VIM through the Vi-Vnfm interface, so that the VIM performs comprehensive processing and reporting.
  • The VNFM requests from the VIM the fault information of the NFVI entity associated with the first VNF entity.
  • The VNFM receives the fault information of the NFVI entity associated with the first VNF entity sent by the VIM, adds the received fault information to the original VNF fault comprehensive information, and then reports the combined fault comprehensive information to the Orchestrator through the Or-Vnfm interface.
  • For example, the NFVI entities associated with VNF1 are VM1, Host OS1, Hypervisor1, HW1, and HW2. The VIM sends the fault information of VM1, Host OS1, Hypervisor1, and HW1 to the VNFM through the Vi-Vnfm interface, so that the VNFM performs comprehensive processing and reporting.
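  • The cross-layer merging described above, in which the VIM adds VNF fault information obtained from the VNFM, can be sketched as follows. The function name and the dictionaries are hypothetical; in particular, `vnfm_faults` merely stands in for the request and response exchanged over the Vi-Vnfm interface.

```python
def merge_vnf_faults(nfvi_faults, failed_nfvi_id, nfvi_to_vnfs, vnfm_faults):
    """Return the NFVI fault list extended with faults of associated VNFs.

    nfvi_faults:    list of fault dicts already collected by the VIM
    failed_nfvi_id: identifier of the first (failed) NFVI entity
    nfvi_to_vnfs:   dict NFVI entity id -> list of associated VNF ids
    vnfm_faults:    dict VNF id -> fault dict, for VNFs that currently have a fault
    """
    merged = list(nfvi_faults)  # do not modify the caller's list
    for vnf_id in nfvi_to_vnfs.get(failed_nfvi_id, []):
        fault = vnfm_faults.get(vnf_id)
        if fault is not None and fault not in merged:
            merged.append(fault)
    return merged
```

  • The opposite direction, in which the VNFM asks the VIM for NFVI fault information, would follow the same pattern with the roles exchanged.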
  • The Orchestrator receives the fault comprehensive information reported by the VNFM or the VIM (1005c or 1005d) and determines whether it can repair the fault itself. Similar to the VIM self-healing judgment, the Orchestrator queries its local fault repair strategy. If the fault can be handled and the repair succeeds, the fault status in the fault comprehensive information is set to "fixed" and reported to the OSS/BSS. If the Orchestrator cannot repair the fault, or can attempt a repair but the repair fails, the fault status of the fault comprehensive information is set to "unrepaired" and reported to the OSS/BSS. It should be understood that because the Orchestrator is responsible for orchestrating and managing resources and implementing NFV services, it has high administrative privileges and processing capability and can fix most faults. Only the very small number of faults that cannot be processed or whose repair fails are reported to the OSS/BSS.
  • The OSS/BSS changes the fault status of the received fault comprehensive information to "Processing". The OSS/BSS then performs fault recovery according to the method in its fault repair strategy. After the fault is recovered, the OSS/BSS receives the fault recovery notification sent by the NFVI entity and then changes the fault status in its fault comprehensive information to "Fixed".
  • the fault repair policy in OSS/BSS includes all fault type processing methods by default.
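  • The escalation from the Orchestrator to the OSS/BSS and the accompanying fault status changes can be sketched as follows. The function `escalate`, the `try_repair` callback, and the lowercase status strings are hypothetical; the sketch assumes, as stated above, that the OSS/BSS policy covers every fault type.

```python
def escalate(fault_type, orch_policy, oss_policy, try_repair):
    """Return the (handler, fault status) transitions for one fault.

    orch_policy: dict fault type -> repair method known to the Orchestrator
    oss_policy:  dict fault type -> repair method, assumed to cover every type
    try_repair:  callable(handler, method) -> True if the repair succeeded
    """
    trace = []
    method = orch_policy.get(fault_type)
    if method is not None and try_repair("Orchestrator", method):
        trace.append(("Orchestrator", "fixed"))
        return trace
    # No matching policy, or the repair failed: report to the OSS/BSS.
    trace.append(("Orchestrator", "unrepaired"))
    trace.append(("OSS/BSS", "processing"))
    if try_repair("OSS/BSS", oss_policy[fault_type]):
        trace.append(("OSS/BSS", "fixed"))
    return trace
```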
  • The fault management method provided by the embodiment of the present invention acquires the fault information of hardware and/or software entities through the VIM and comprehensively processes the fault information that has an association relationship, so that fault reporting and processing in the NFV environment can be realized.
  • In addition, because the associated fault information is processed together, identical fault comprehensive information is deleted through duplicate-alarm detection, and the faulty entity being processed is isolated, the efficiency and accuracy of fault processing are improved and the spread of faults is effectively prevented.
  • FIG. 11 is a schematic block diagram of a virtualized infrastructure management VIM entity in accordance with one embodiment of the present invention.
  • The VIM entity 1100 shown in FIG. 11 includes an obtaining unit 1101, a generating unit 1102, and a processing unit 1103.
  • the obtaining unit 1101 acquires the first fault information of the network function virtualization infrastructure NFVI entity including the fault entity identifier and the fault type, and the first fault information is used to indicate that the first NFVI entity with the fault entity identifier fails.
  • the generating unit 1102 generates first fault comprehensive information according to the first fault information acquired by the obtaining unit 1101, where the first fault comprehensive information includes the first fault information and the associated fault information of the first fault information;
  • The processing unit 1103 performs fault repair or report processing based on the first fault comprehensive information generated by the generating unit 1102.
  • The VIM entity 1100 acquires the fault information of hardware and/or software entities and comprehensively processes the fault information that has an association relationship, so that fault reporting and processing in the NFV environment can be realized.
  • The VIM entity 1100 further includes a determining unit and a receiving unit, and the obtaining unit is configured to: receive, by using the receiving unit, first fault information sent by the first NFVI entity; or determine, by using the determining unit, that the first NFVI entity has failed, and generate the first fault information according to the fault of the first NFVI entity.
  • The first NFVI entity is any one of a hardware (HW) entity, a host operating system (Host OS) entity, a virtual machine manager entity, or a virtual machine (VM) entity in the NFVI, and the generating unit 1102 is specifically configured to: determine, by using the determining unit, that the fault information sent by the NFVI entity associated with the first NFVI entity is associated fault information of the first fault information; and generate the first fault comprehensive information including the first fault information and the associated fault information.
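  • How the generating unit might assemble the first fault comprehensive information can be sketched as follows. The function name, the dictionary keys, and the `associations` mapping are hypothetical illustrations, not structures defined in the embodiment.

```python
def build_comprehensive(first_fault, reported_faults, associations):
    """Combine the first fault information with its associated fault information.

    first_fault:     dict with at least the key "entity_id"
    reported_faults: fault dicts reported by other NFVI entities
    associations:    dict entity id -> set of associated entity ids
    """
    related = associations.get(first_fault["entity_id"], set())
    associated = [f for f in reported_faults if f["entity_id"] in related]
    return {"first": first_fault, "associated": associated, "state": "unprocessed"}
```

  • Fault information reported by entities that are not associated with the first NFVI entity is left out of the result.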
  • The processing unit 1103 includes a sending unit, and the processing unit 1103 is specifically configured to: determine, by using the determining unit and according to the fault type of the first fault information or the fault type of the associated fault information in the first fault comprehensive information, whether the VIM entity 1100 includes a fault repair policy corresponding to that fault type; when the VIM entity 1100 includes such a fault repair policy, repair the fault of the first NFVI entity and/or the NFVI entity associated with the first NFVI entity according to the fault repair policy; or, when the VIM entity 1100 does not include such a fault repair policy, send the first fault comprehensive information to the VNFM or to the orchestrator through the sending unit.
  • The processing unit 1103 is specifically configured to: determine, by using the determining unit, the NFVI entity with the highest priority among the first NFVI entity and the NFVI entities associated with the first NFVI entity, where the priority of the HW is higher than that of the Host OS, the priority of the Host OS is higher than that of the virtual machine manager, and the priority of the virtual machine manager is higher than that of the VM; determine, by using the determining unit and according to the fault type of the highest-priority NFVI entity, whether the VIM entity 1100 includes a corresponding fault repair policy; and when the VIM entity 1100 includes a fault repair policy corresponding to the fault type of the highest-priority NFVI entity, repair the fault of the highest-priority NFVI entity according to the fault repair policy.
  • The sending unit is specifically configured to: when the fault repair succeeds, send a success indication message to the orchestrator; and when the fault repair fails, send the first fault comprehensive information to the VNFM or to the orchestrator.
  • the receiving unit is further configured to: receive an indication message sent by the VNFM to indicate that the VNFM cannot process the first fault comprehensive information; and the sending unit is further configured to: send the first fault comprehensive information to the orchestrator.
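  • The reporting paths available to the VIM entity can be summarized in a small sketch. The embodiment allows the first fault comprehensive information to be sent either to the VNFM or to the orchestrator; the sketch assumes, for illustration only, that the VNFM is tried first. The function name and the callbacks are hypothetical.

```python
def route_first_fault(fault_type, vim_policy, repair, vnfm_can_process):
    """Return the messages the VIM would send, as (receiver, message) pairs.

    vim_policy:       dict fault type -> repair method held by the VIM
    repair:           callable(method) -> True if the local repair succeeded
    vnfm_can_process: callable(fault_type) -> False if the VNFM answers that
                      it cannot process the first fault comprehensive information
    """
    method = vim_policy.get(fault_type)
    if method is not None and repair(method):
        return [("Orchestrator", "success indication")]
    # No matching policy, or the repair failed: hand over to the VNFM first.
    messages = [("VNFM", "first fault comprehensive information")]
    if not vnfm_can_process(fault_type):
        # The VNFM indicated that it cannot process the information.
        messages.append(("Orchestrator", "first fault comprehensive information"))
    return messages
```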
  • The processing unit 1103 is further configured to: request, from the VNFM, fault information of the VNF entity associated with the first NFVI entity; and add the fault information of the VNF entity associated with the first NFVI entity to the first fault comprehensive information.
  • The receiving unit is further configured to receive request information sent by the VNFM, where the request information is used to request, from the VIM entity 1100, fault information of the NFVI entity associated with a failed VNF entity; and the sending unit is further configured to send the fault information of the NFVI entity associated with the failed VNF entity to the VNFM.
  • the VIM entity 1100 further includes a detecting unit and a deleting unit, where the detecting unit is configured to: detect, according to the first fault comprehensive information, whether the VIM entity 1100 includes fault comprehensive information that is the same as the first fault comprehensive information;
  • the deleting unit is specifically configured to delete the first fault comprehensive information when the VIM entity 1100 includes the same fault comprehensive information as the first fault comprehensive information.
  • The VIM entity 1100 provided by the embodiment of the present invention acquires the fault information of hardware and/or software entities and comprehensively processes the fault information that has an association relationship, so that fault reporting and processing in the NFV environment can be realized.
  • In addition, because the associated fault information is processed together and identical fault comprehensive information is deleted through duplicate-alarm detection, the efficiency and accuracy of fault handling are improved.
  • FIG. 12 is a schematic block diagram of a virtual network function management VNFM entity according to an embodiment of the present invention.
  • the VNFM entity 1200 shown in FIG. 12 includes an obtaining unit 1201, a generating unit 1202, and a processing unit 1203.
  • the obtaining unit 1201 acquires the second fault information of the virtual network function VNF entity that includes the fault entity identifier and the fault type, and the second fault information is used to indicate that the first VNF entity with the fault entity identifier fails.
  • the generating unit 1202 generates second fault comprehensive information according to the second fault information.
  • The processing unit 1203 performs fault repair or report processing according to the second fault comprehensive information. The VNFM entity 1200 comprehensively processes the fault information that has an association relationship, so that fault reporting and processing in the NFV environment can be realized.
  • The VNFM entity 1200 further includes a determining unit and a receiving unit, and the obtaining unit is configured to: receive, by using the receiving unit, second fault information sent by the first VNF entity; or determine, by using the determining unit, that the first VNF entity has failed, and generate the second fault information according to the fault of the first VNF entity.
  • The generating unit 1202 is specifically configured to: determine, by using the determining unit, that the fault information sent by the VNF entity associated with the first VNF entity is associated fault information of the second fault information; and generate the second fault comprehensive information including the second fault information and the associated fault information.
  • The processing unit 1203 includes a sending unit, and the processing unit 1203 is specifically configured to: determine, by using the determining unit and according to the fault type of the second fault information or the fault type of the associated fault information in the second fault comprehensive information, whether the VNFM entity 1200 includes a fault repair policy corresponding to that fault type; when the VNFM entity 1200 includes such a fault repair policy, repair the fault of the first VNF entity and/or the VNF entity associated with the first VNF entity according to the fault repair policy; or, when the VNFM entity 1200 does not include such a fault repair policy, send the second fault comprehensive information to the orchestrator through the sending unit.
  • the sending unit is specifically configured to: when the fault repair succeeds, send a success indication message to the orchestrator; and when the fault repair fails, send the second fault comprehensive information to the orchestrator.
  • The processing unit 1203 is further configured to: request, from the virtualized infrastructure manager VIM, fault information of an NFVI entity associated with the first VNF entity, where the NFVI entity is any one of a hardware (HW) entity, a host operating system (Host OS) entity, a virtual machine manager entity, or a virtual machine (VM) entity in the NFVI; and add the fault information of the NFVI entity associated with the first VNF entity to the second fault comprehensive information.
  • The processing unit 1203 is further configured to: receive first fault comprehensive information sent by the VIM, where the first fault comprehensive information includes first fault information and associated fault information of the first fault information, and the first fault information is used to indicate that the first NFVI entity is faulty; determine whether the VNFM entity 1200 includes a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information in the first fault comprehensive information; when the VNFM entity 1200 includes such a fault repair policy, repair the fault of the first NFVI entity and/or the NFVI entity associated with the first NFVI entity according to the fault repair policy; or, when the VNFM entity 1200 does not include such a fault repair policy, send the first fault comprehensive information to the orchestrator, or send to the VIM an indication message indicating that the VNFM entity 1200 cannot process the first fault comprehensive information, so that the VIM sends the first fault comprehensive information to the orchestrator.
  • The processing unit 1203 is further configured to: determine, according to the first fault comprehensive information, fault information of a first VNF entity associated with the first NFVI entity and/or with the NFVI entity associated with the first NFVI entity; and add the fault information of the first VNF entity to the first fault comprehensive information, so that the VNFM entity 1200 performs repair or report processing on the first fault comprehensive information.
  • the VNFM entity 1200 further includes a detecting unit and a deleting unit, where the detecting unit is configured to: detect, according to the second fault comprehensive information, whether the VNFM entity 1200 includes fault comprehensive information that is the same as the second fault comprehensive information;
  • the deleting unit is specifically configured to delete the second fault comprehensive information when the VNFM entity 1200 includes the same fault comprehensive information as the second fault comprehensive information.
  • The receiving unit is further configured to receive request information sent by the VIM, where the request information is used to request, from the VNFM entity 1200, fault information of the VNF entity associated with a failed NFVI entity; and the sending unit is further configured to send the fault information of the VNF entity associated with the failed NFVI entity to the VIM.
  • The VNFM entity 1200 provided by the embodiment of the present invention acquires the fault information of hardware and/or software entities and comprehensively processes the fault information that has an association relationship, so that fault reporting and processing in the NFV environment can be realized.
  • In addition, because the associated fault information is processed together and identical fault comprehensive information is deleted through duplicate-alarm detection, the efficiency and accuracy of fault processing are improved.
  • FIG. 13 is a schematic block diagram of an orchestrator (Orchestrator) entity in accordance with one embodiment of the present invention.
  • The Orchestrator entity 1300 shown in FIG. 13 includes a receiving unit 1301 and a processing unit 1302.
  • The receiving unit 1301 receives the first fault comprehensive information sent by the virtualized infrastructure manager VIM, where the first fault comprehensive information includes first fault information, the first fault information includes a fault entity identifier and a fault type, and the first fault information is used to indicate that the first network function virtualization infrastructure NFVI entity with the fault entity identifier has failed.
  • the processing unit 1302 performs fault repair or report processing according to the first fault comprehensive information.
  • The receiving unit 1301 receives the second fault comprehensive information sent by the virtual network function manager VNFM, where the second fault comprehensive information includes second fault information, the second fault information includes a fault entity identifier and a fault type, and the second fault information is used to indicate that the first virtual network function VNF entity with the fault entity identifier has failed.
  • the processing unit 1302 performs fault repair or report processing according to the second fault comprehensive information.
  • The Orchestrator entity 1300 provided by the embodiment of the present invention obtains the fault information of hardware and/or software entities from the VNFM or the VIM and comprehensively processes the fault information that has an association relationship, so that fault reporting and processing in the NFV environment can be realized.
  • the first fault comprehensive information further includes: fault information of the NFVI entity associated with the first NFVI entity; and/or fault information of the virtual network function VNF entity associated with the first NFVI entity .
  • The second fault comprehensive information further includes: fault information of the VNF entity associated with the first VNF entity; and/or fault information of the NFVI entity associated with the first VNF entity.
  • The processing unit 1302 is specifically configured to: determine, according to the fault type in the first fault comprehensive information, whether the Orchestrator entity 1300 includes a fault repair policy corresponding to the fault type; when the Orchestrator entity 1300 includes a fault repair policy corresponding to the fault type, repair the fault of the first NFVI entity and/or the NFVI entity associated with the first NFVI entity according to the fault repair policy; or, when the Orchestrator entity 1300 does not include a fault repair policy corresponding to the fault type, send the first fault comprehensive information to the operations and business support system OSS/BSS.
  • The processing unit 1302 is specifically configured to: determine, according to the fault type in the second fault comprehensive information, whether the Orchestrator entity 1300 includes a fault repair policy corresponding to the fault type; when the Orchestrator entity 1300 includes a fault repair policy corresponding to the fault type, repair the fault of the first VNF entity and/or the VNF entity associated with the first VNF entity according to the fault repair policy; or, when the Orchestrator entity 1300 does not include a fault repair policy corresponding to the fault type, send the second fault comprehensive information to the operations and business support system OSS/BSS.
  • The processing unit 1302 is specifically configured to: determine, according to the fault type in the first fault comprehensive information, whether the Orchestrator entity 1300 includes a fault repair policy corresponding to the fault type; when the Orchestrator entity 1300 includes a fault repair policy corresponding to the fault type, repair, according to the fault repair policy, the faults of the first NFVI entity, the NFVI entity associated with the first NFVI entity, and the VNF entity associated with the first NFVI entity; or, when the Orchestrator entity 1300 does not include a fault repair policy corresponding to the fault type, send the first fault comprehensive information to the OSS/BSS.
  • The processing unit 1302 is specifically configured to: determine, according to the fault type in the second fault comprehensive information, whether the Orchestrator entity 1300 includes a fault repair policy corresponding to the fault type; when the Orchestrator entity 1300 includes a fault repair policy corresponding to the fault type, repair, according to the fault repair policy, the faults of the first VNF entity, the VNF entity associated with the first VNF entity, and the NFVI entity associated with the first VNF entity; or, when the Orchestrator entity 1300 does not include a fault repair policy corresponding to the fault type, send the second fault comprehensive information to the OSS/BSS.
  • The Orchestrator entity 1300 further includes a detecting unit and a deleting unit, where the detecting unit is configured to detect, according to the first/second fault comprehensive information, whether the Orchestrator entity 1300 includes fault comprehensive information that is the same as the first/second fault comprehensive information; and the deleting unit is configured to delete the first/second fault comprehensive information when the Orchestrator entity 1300 includes the same fault comprehensive information as the first/second fault comprehensive information.
  • The Orchestrator entity 1300 acquires the fault information of hardware and/or software entities from the VIM or the VNFM and comprehensively processes the fault information that has an association relationship, so that fault reporting and processing in the NFV environment can be realized.
  • In addition, because identical fault comprehensive information is deleted through duplicate-alarm detection, the efficiency and accuracy of fault handling are improved.
  • FIG. 14 is a schematic block diagram of a VIM entity in accordance with another embodiment of the present invention.
  • the VIM entity 1400 of Figure 14 includes a processor 1401 and a memory 1402.
  • the processor 1401 and the memory 1402 are connected by a bus system 1403.
  • The memory 1402 is configured to store instructions that cause the processor 1401 to: obtain first fault information of a network function virtualization infrastructure NFVI entity, where the first fault information includes a fault entity identifier and a fault type and is used to indicate that the first NFVI entity with the fault entity identifier has failed; generate first fault comprehensive information according to the first fault information, where the first fault comprehensive information includes the first fault information and the associated fault information of the first fault information; and perform fault repair or report processing according to the first fault comprehensive information.
  • The VIM entity 1400 acquires the fault information of hardware and/or software entities and comprehensively processes the fault information that has an association relationship, so that fault reporting and processing in the NFV environment can be realized.
  • the VIM entity 1400 can also include a transmitting circuit 1404 and a receiving circuit 1405.
  • the processor 1401 controls the operation of the VIM entity 1400, which may also be referred to as a CPU (Central Processing Unit).
  • Memory 1402 can include read only memory and random access memory and provides instructions and data to processor 1401. A portion of memory 1402 may also include non-volatile random access memory (NVRAM).
  • the various components of the VIM entity 1400 are coupled together by a bus system 1403, which may include, in addition to the data bus, a power bus, a control bus, and a status signal bus. However, for clarity of description, various buses are labeled as bus system 1403 in the figure.
  • the method disclosed in the foregoing embodiment of the present invention may be applied to the processor 1401 or implemented by the processor 1401.
  • the processor 1401 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the foregoing method may be completed by an integrated logic circuit of hardware in the processor 1401 or an instruction in a form of software.
  • The processor 1401 described above may be a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • The processor may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention.
  • The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • The steps of the method disclosed in the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor.
  • The software module may be located in a storage medium that is mature in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers.
  • the storage medium is located in the memory 1402.
  • the processor 1401 reads the information in the memory 1402 and completes the steps of the above method in combination with its hardware.
  • The VIM entity 1400 provided by this embodiment of the present invention acquires fault information of hardware and/or software entities and comprehensively processes fault information that has an association relationship, so that fault reporting and handling can be implemented in the NFV environment.
  • In addition, because associated fault information is processed together and identical fault comprehensive information is deleted through duplicate-alarm detection, the efficiency and accuracy of fault handling are improved.
  • FIG. 15 is a schematic block diagram of a VNFM entity in accordance with another embodiment of the present invention.
  • the VNFM entity 1500 of Figure 15 includes a processor 1501 and a memory 1502.
  • the processor 1501 and the memory 1502 are connected by a bus system 1503.
  • The memory 1502 is configured to store instructions that cause the processor 1501 to: obtain second fault information of a virtual network function (VNF) entity, the second fault information including a fault entity identifier and a fault type and indicating that the first VNF entity having the fault entity identifier has failed; generate second fault comprehensive information according to the second fault information; and perform fault repair or report processing according to the second fault comprehensive information. Fault information that has an association relationship is thus processed comprehensively, so that fault reporting and handling can be implemented in the NFV environment.
  • the VNFM entity 1500 can also include a transmit circuit 1504 and a receive circuit 1505.
  • The processor 1501, which may also be referred to as a central processing unit (CPU), controls the operation of the VNFM entity 1500.
  • Memory 1502 can include read only memory and random access memory and provides instructions and data to processor 1501. Portions of memory 1502 may also include non-volatile random access memory (NVRAM).
  • the various components of the VNFM entity 1500 are coupled together by a bus system 1503, which may include, in addition to the data bus, a power bus, a control bus, and a status signal bus. However, for clarity of description, various buses are labeled as bus system 1503 in the figure.
  • the method disclosed in the foregoing embodiment of the present invention may be applied to the processor 1501 or implemented by the processor 1501.
  • the processor 1501 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the foregoing method may be completed by an integrated logic circuit of hardware in the processor 1501 or an instruction in a form of software.
  • The processor 1501 described above may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • The steps of the method disclosed in the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor.
  • the software module can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the storage medium is located in the memory 1502.
  • the processor 1501 reads the information in the memory 1502 and completes the steps of the above method in combination with hardware.
  • The VNFM entity 1500 provided by this embodiment of the present invention acquires fault information of hardware and/or software entities and comprehensively processes fault information that has an association relationship, so that fault reporting and handling can be implemented in the NFV environment.
  • In addition, because associated fault information is processed together and identical fault comprehensive information is deleted through duplicate-alarm detection, the efficiency and accuracy of fault handling are improved.
  • FIG. 16 is a schematic block diagram of an Orchestrator entity in accordance with another embodiment of the present invention.
  • the Orchestrator entity 1600 of Figure 16 includes a processor 1601 and a memory 1602.
  • the processor 1601 and the memory 1602 are connected by a bus system 1603.
  • The memory 1602 is configured to store instructions that cause the processor 1601 to: receive first fault comprehensive information sent by the virtualization infrastructure manager (VIM), where the first fault comprehensive information includes first fault information, the first fault information includes a fault entity identifier and a fault type, and the first fault information indicates that the first network function virtualization infrastructure (NFVI) entity having the fault entity identifier has failed; and perform fault repair or report processing according to the first fault comprehensive information.
  • The second fault comprehensive information includes second fault information, the second fault information includes a fault entity identifier and a fault type, and the second fault information indicates that the first virtual network function (VNF) entity having the fault entity identifier has failed; fault repair or report processing is performed according to the second fault comprehensive information.
  • the Orchestrator entity 1600 acquires the fault information of the hardware and/or the software entity, and comprehensively processes the fault information with the associated relationship, so that the fault reporting and processing in the NFV environment can be realized.
  • the Orchestrator entity 1600 may also include a transmitting circuit 1604 and a receiving circuit 1605.
  • The processor 1601, which may also be referred to as a central processing unit (CPU), controls the operation of the Orchestrator entity 1600.
  • The memory 1602 may include read-only memory and random access memory, and provides instructions and data to the processor 1601.
  • a portion of the memory 1602 can also include non-volatile random access memory (NVRAM).
  • the various components of the Orchestrator entity 1600 are coupled together by a bus system 1603, which may include, in addition to the data bus, a power bus, a control bus, a status signal bus, and the like. However, for clarity of description, various buses are labeled as bus system 1603 in the figure.
  • the method disclosed in the above embodiments of the present invention may be applied to the processor 1601 or implemented by the processor 1601.
  • the processor 1601 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the above method may be completed by an integrated logic circuit of hardware in the processor 1601 or an instruction in the form of software.
  • The processor 1601 described above may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • The processor may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention.
  • The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • The steps of the method disclosed in the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor.
  • the software modules can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the storage medium is located in the memory 1602, and the processor 1601 reads the information in the memory 1602, and completes the steps of the above method in combination with the hardware thereof.
  • The Orchestrator entity 1600 provided by this embodiment of the present invention acquires fault information of hardware and/or software entities and comprehensively processes fault information that has an association relationship, so that fault reporting and handling can be implemented in the NFV environment.
  • In addition, because associated fault information is processed together and identical fault comprehensive information is deleted through duplicate-alarm detection, the efficiency and accuracy of fault handling are improved.
  • The methods or steps described in connection with the embodiments disclosed herein may be implemented by hardware, by a software program executed by a processor, or by a combination of the two.
  • The software program may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
  • the invention is not limited to this.
  • Various equivalent modifications and alterations to the embodiments of the invention may be made by those skilled in the art without departing from the spirit and scope of the invention.

Abstract

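The present invention provides a fault management method that enables fault reporting and handling in an NFV environment. The method includes: obtaining first fault information of a network function virtualization infrastructure (NFVI) entity, where the first fault information includes a fault entity identifier and a fault type and indicates that a first NFVI entity having the fault entity identifier has failed; generating first fault comprehensive information according to the first fault information, where the first fault comprehensive information includes the first fault information and associated fault information of the first fault information; and performing fault repair or report processing according to the first fault comprehensive information. By obtaining fault information of hardware and/or software entities and comprehensively processing fault information that has an association relationship, the embodiments of the present invention enable fault reporting and handling in an NFV environment.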

Description

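Fault Management Method, Entity, and System
Technical Field
The present invention relates to the communications field, and more specifically, to a fault management method, entity, and system.
Background
Network function virtualization (NFV) aims to implement some network functions in software by using general-purpose, high-performance, large-capacity servers, switches, and storage devices. Compared with a conventional virtualized environment, the NFV end-to-end (E2E) architecture adds many software instances and management entities, such as virtual network function (VNF) instances/entities, virtualization infrastructure manager (VIM) entities, and VNF manager entities, which makes the NFV environment more complex than a conventional virtualized environment. The fault reporting and handling methods used in a conventional virtualized environment cannot be applied to the NFV environment. Therefore, how to report and handle faults in a complex NFV environment needs to be considered.
Summary
Embodiments of the present invention provide a fault management method that enables fault reporting and handling in an NFV environment.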
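According to a first aspect, a fault management method is provided, including: obtaining, by a virtualization infrastructure manager (VIM), first fault information of a network function virtualization infrastructure (NFVI) entity, where the first fault information includes a fault entity identifier and a fault type and indicates that a first NFVI entity having the fault entity identifier has failed; generating, by the VIM, first fault comprehensive information according to the first fault information, where the first fault comprehensive information includes the first fault information and associated fault information of the first fault information; and performing, by the VIM, fault repair or report processing according to the first fault comprehensive information.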
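With reference to the first aspect, in a first implementation thereof, the obtaining, by the VIM, of the first fault information includes: receiving the first fault information sent by the first NFVI entity; or determining that the first NFVI entity has failed, and generating the first fault information according to the fault of the first NFVI entity.
With reference to the first aspect and the foregoing implementation, in a second implementation, the first NFVI entity is any hardware (HW), host operating system (Host OS), hypervisor, or virtual machine (VM) entity among the NFVI entities, and the generating, by the VIM, of the first fault comprehensive information includes: determining that fault information sent by an NFVI entity associated with the first NFVI entity is the associated fault information of the first fault information; and generating the first fault comprehensive information that includes the first fault information and the associated fault information.
With reference to the first aspect and the foregoing implementations, in a third implementation, the performing, by the VIM, of fault repair or report processing includes: determining, according to the fault type of the first fault information or the fault type of the associated fault information in the first fault comprehensive information, whether the VIM contains a fault repair policy corresponding to that fault type; when the VIM contains such a fault repair policy, repairing, according to the fault repair policy, the fault of the first NFVI entity and/or of the NFVI entity associated with the first NFVI entity; or when the VIM does not contain such a fault repair policy, sending the first fault comprehensive information to the VNFM or to the Orchestrator.
With reference to the first aspect and the foregoing implementations, in a fourth implementation, the determining of whether the VIM contains a corresponding fault repair policy includes: determining the NFVI entity with the highest priority among the first NFVI entity and the NFVI entities associated with it, where the priority of HW is higher than that of the Host OS, the priority of the Host OS is higher than that of the hypervisor, and the priority of the hypervisor is higher than that of a VM; determining, according to the fault type of the highest-priority NFVI entity, whether the VIM contains a corresponding fault repair policy; and when the VIM contains a fault repair policy corresponding to the fault type of the highest-priority NFVI entity, repairing the fault of the highest-priority NFVI entity according to the fault repair policy.
With reference to the first aspect and the foregoing implementations, in a fifth implementation, after the repairing of the fault of the first NFVI entity and/or of the NFVI entity associated with the first NFVI entity according to the fault repair policy, the method further includes: when the fault repair succeeds, sending a success indication message to the Orchestrator; and when the fault repair fails, sending the first fault comprehensive information to the VNFM or to the Orchestrator.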
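With reference to the first aspect and the foregoing implementations, in a sixth implementation, after the sending of the first fault comprehensive information to the VNFM, the method further includes: receiving an indication message sent by the VNFM indicating that the VNFM cannot process the first fault comprehensive information; and sending the first fault comprehensive information to the Orchestrator.
With reference to the first aspect and the foregoing implementations, in a seventh implementation, before the sending of the first fault comprehensive information to the Orchestrator, the method further includes: requesting, from the VNFM, fault information of a VNF entity associated with the first NFVI entity; and adding the fault information of the VNF entity associated with the first NFVI entity to the first fault comprehensive information.
With reference to the first aspect and the foregoing implementations, in an eighth implementation, the method further includes: receiving request information sent by the VNFM, where the request information is used to request, from the VIM, fault information of an NFVI entity associated with a failed VNF entity; and sending, to the VNFM, the fault information of the NFVI entity associated with the failed VNF entity.
With reference to the first aspect and the foregoing implementations, in a ninth implementation, after the generating, by the VIM, of the first fault comprehensive information, the method further includes: detecting, according to the first fault comprehensive information, whether the VIM contains fault comprehensive information identical to the first fault comprehensive information; and when the VIM contains identical fault comprehensive information, deleting the first fault comprehensive information.
With reference to the first aspect and the foregoing implementations, in a tenth implementation, the first fault information is further reported to the operation and business support system (OSS/BSS), so that the OSS/BSS can monitor and present the first fault information.
With reference to the first aspect and the foregoing implementations, in an eleventh implementation, the first fault information further includes at least one of the following: a running state and a fault time; and the first fault comprehensive information further includes fault state information, where the fault state is at least one of unprocessed, processing, repaired, and unrepaired.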
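According to a second aspect, a fault management method is provided, including: obtaining, by a virtual network function manager (VNFM), second fault information of a virtual network function (VNF) entity, where the second fault information includes a fault entity identifier and a fault type and indicates that a first VNF entity having the fault entity identifier has failed; generating, by the VNFM, second fault comprehensive information according to the second fault information; and performing, by the VNFM, fault repair or report processing according to the second fault comprehensive information.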
结合第二方面, 在其第一种实现方式中, 所述 VNFM获取 VNF实体的 包含故障实体标识和故障类型的第二故障信息, 包括: 接收所述第一 VNF 实体发送的所述第二故障信息; 或者确定所述第一 VNF实体发生故障, 并 根据所述第一 VNF实体发生的故障生成所述第二故障信息。
结合第二方面及其上述实现方式, 在其第二种实现方式中, 所述 VNFM 根据所述第二故障信息生成第二故障综合信息, 包括: 确定与所述第一 VNF 实体相关联的 VNF实体发送的故障信息为所述第二故障信息的关联故障信 息; 生成包含有所述第二故障信息和所述关联故障信息的第二故障综合信 息。
结合第二方面及其上述实现方式, 在其第三种实现方式中, 所述 VNFM 根据所述第二故障综合信息进行故障修复或上报处理, 包括: 根据所述第二 故障综合信息中的第二故障信息的故障类型或者所述关联故障信息的故障 类型, 确定所述 VNFM是否包含与所述第二故障信息的故障类型或者所述 关联故障信息的故障类型相对应的故障修复策略; 在所述 VNFM 包含与所 述第二故障信息的故障类型或者所述关联故障信息的故障类型相对应的故 障修复策略时, 根据所述故障修复策略修复所述第一 VNF实体和 /或与所述 第一 VNF实体相关联的 VNF实体的故障; 或者在所述 VNFM不包含与所 述第二故障信息的故障类型或者所述关联故障信息的故障类型相对应的故 障修复策略时, 向编排器发送所述第二故障综合信息。
结合第二方面及其上述实现方式, 在其第四种实现方式中, 所述根据所 述故障修复策略修复所述第一 VNF实体和 /或与所述第一 VNF实体相关联的 VNF实体的故障之后, 还包括: 在所述故障修复成功时, 向所述编排器发送 成功指示消息; 在所述故障修复失败时, 向所述编排器发送所述第二故障综 合信息。
结合第二方面及其上述实现方式, 在其第五种实现方式中, 所述向所述 编排器发送所述第二故障综合信息之前, 还包括: 向虚拟化基础设施管理器 VIM请求与所述第一 VNF实体相关联的 NFVI实体的故障信息, 其中所述 NFVI实体为所述 NFVI中的任意一个硬件 HW、 主操作系统 Host OS、 虚拟 机管理器或虚拟机 VM实体; 将所述与所述第一 VNF实体相关联的 NFVI 实体的故障信息加入所述第二故障综合信息。
结合第二方面及其上述实现方式, 在其第六种实现方式中, 所述方法还 包括: 接收 VIM发送的第一故障综合信息, 所述第一故障综合信息包含所 述第一故障信息和所述第一故障信息的关联故障信息, 所述第一故障信息用 于指示第一 NFVI实体发生故障;确定所述 VNFM是否包含与所述第一故障 综合信息中的第一故障信息的故障类型或者所述关联故障信息的故障类型 相对应的故障修复策略; 在所述 VNFM 包含与所述第一故障信息的故障类 型或者所述关联故障信息的故障类型相对应的故障修复策略时,根据所述故 障修复策略修复所述第一 NFVI 实体和 /或与所述第一 NFVI 实体相关联的 NFVI实体的故障; 或者在所述 VNFM不包含与所述第一故障信息的故障类 型或者所述关联故障信息的故障类型相对应的故障修复策略时, 向编排器发 送所述第一故障综合信息, 或者向所述 VIM发送用于指示所述 VNFM无法 处理所述第一故障综合信息的指示消息, 以便于所述 VIM向所述编排器发 送所述第一故障综合信息。
结合第二方面及其上述实现方式, 在其第七种实现方式中, 所述接收 VIM发送的第一故障综合信息之后,还包括: 根据所述第一故障综合信息确 定与所述第一 NFVI实体和 /或与所述第一 NFVI实体相关联的 NFVI实体相 关联的所述第一 VNF实体的故障信息; 将所述第一 VNF实体的故障信息加 入所述第一故障综合信息, 以便于所述所述 VNFM对所述第一故障综合信 息进行修复或上报处理。
结合第二方面及其上述实现方式, 在其第八种实现方式中, 所述 VNFM 根据所述第二故障综合信息进行修复或上报处理之后, 还包括: 根据所述第 二故障综合信息检测所述 VNFM是否包含与所述第二故障综合信息相同的 故障综合信息; 在所述 VNFM 包含与所述第二故障综合信息相同的故障综 合信息时, 删除所述第二故障综合信息。
结合第二方面及其上述实现方式, 在其第九种实现方式中, 所述方法还 包括: 接收所述 VIM发送的请求信息, 所述请求信息用于向所述 VNFM请 求与发生故障的 NFVI实体相关联的 VNF实体的故障信息; 向所述 VIM发 送所述与发生故障的 NFVI实体相关联的 VNF实体的故障信息。
结合第二方面及其上述实现方式, 在其第十种实现方式中, 所述第二故 障信息还被用于向运营和业务支撑系统 OSS/BSS上报,以便于所述 OSS/BSS 监控并呈现所述第二故障信息。
结合第二方面及其上述实现方式, 在其第十一种实现方式中, 所述第二 故障信息还包括以下至少一项: 运行状态、 故障时间; 所述第二故障综合信 息还包括故障状态信息, 所述故障状态包含未处理, 处理中, 已修复和未修 复中的至少一种。
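According to a third aspect, a fault management method is provided, including: receiving, by an Orchestrator, first fault comprehensive information sent by a virtualization infrastructure manager (VIM), where the first fault comprehensive information includes first fault information, the first fault information includes a fault entity identifier and a fault type, and the first fault information indicates that a first network function virtualization infrastructure (NFVI) entity having the fault entity identifier has failed; and performing, by the Orchestrator, fault repair or report processing according to the first fault comprehensive information.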
结合第三方面,在其第一种实现方式中,所述第一故障综合信息还包括: 与所述第一 NFVI 实体相关联的 NFVI 实体的故障信息; 和 /或与所述第一 NFVI实体相关联的虚拟网络功能 VNF实体的故障信息。
结合第三方面及其上述实现方式, 在其第二种实现方式中, 所述编排器 根据所述第一故障综合信息进行故障修复或上报处理, 包括: 根据所述第一 故障综合信息中的故障类型,确定所述编排器是否包含与所述故障类型相对 应的故障修复策略; 在所述编排器包含与所述故障类型相对应的故障修复策 略时, 根据所述故障修复策略修复所述第一 NFVI 实体和 /或与所述第一 NFVI实体相关联的 NFVI实体的故障; 或者在所述编排器不包含与所述故 障类型相对应的故障修复策略时, 向运营和业务支撑系统 OSS/BSS发送所 述第一故障综合信息。
结合第三方面及其上述实现方式, 在其第三种实现方式中, 所述编排器 根据所述第一故障综合信息进行故障修复或上报处理, 包括: 根据所述第一 故障综合信息中的故障类型,确定所述编排器是否包含与所述故障类型相对 应的故障修复策略; 在所述编排器包含与所述故障类型相对应的故障修复策 略时, 根据所述故障修复策略修复所述第一 NFVI实体和与所述第一 NFVI 实体相关联的 NFVI实体的故障和与所述第一 NFVI实体相关联的 VNF实体 的故障; 或者在所述编排器不包含与所述故障类型相对应的故障修复策略 时, 向 OSS/BSS发送所述第一故障综合信息。
结合第三方面及其上述实现方式, 在其第四种实现方式中, 所述编排器 根据所述第一故障综合信息进行故障修复或上报处理之前, 还包括: 根据所 述第一故障综合信息检测所述编排器是否包含与所述第一故障综合信息相 同的故障综合信息; 在所述编排器包含与所述第一故障综合信息相同的故障 综合信息时, 删除所述第一故障综合信息。
结合第三方面及其上述实现方式, 在其第五种实现方式中, 所述第一故 障信息还包括以下至少一项: 运行状态、 故障时间; 所述第一故障综合信息 还包括故障状态信息, 所述故障状态包含未处理, 处理中, 已修复和未修复 中的至少一种。
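According to a fourth aspect, a fault management method is provided, including: receiving, by an Orchestrator, second fault comprehensive information sent by a virtual network function manager (VNFM), where the second fault comprehensive information includes second fault information, the second fault information includes a fault entity identifier and a fault type, and the second fault information indicates that a first virtual network function (VNF) entity having the fault entity identifier has failed; and performing, by the Orchestrator, fault repair or report processing according to the second fault comprehensive information.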
结合第四方面,在其第一种实现方式中,所述第二故障综合信息还包括: 与所述第一 VNF实体相关联的 VNF实体的故障信息;和 /或与所述第一 VNF 实体相关联的虚拟化基础设施管理 NFVI实体的故障信息。
结合第四方面及其上述实现方式, 在其第二种实现方式中, 所述编排器 根据所述第二故障综合信息进行故障修复或上报处理, 包括: 根据所述第二 故障综合信息中的故障类型,确定所述编排器是否包含与所述故障类型相对 应的故障修复策略; 在所述编排器包含与所述故障类型相对应的故障修复策 略时,根据所述故障修复策略修复所述第一 VNF实体和 /或与所述第一 VNF 实体相关联的 VNF实体的故障; 或者在所述编排器不包含与所述故障类型 相对应的故障修复策略时, 向运营和业务支撑系统 OSS/BSS发送所述第二 故障综合信息。
结合第四方面及其上述实现方式, 在其第三种实现方式中, 所述编排器 根据所述第二故障综合信息进行故障修复或上报处理, 包括: 根据所述第二 故障综合信息中的故障类型,确定所述编排器是否包含与所述故障类型相对 应的故障修复策略; 在所述编排器包含与所述故障类型相对应的故障修复策 略时,根据所述故障修复策略修复所述第一 VNF实体和与所述第一 VNF实 体相关联的 VNF实体的故障和与所述第一 VNF实体相关联的 NFVI实体的 故障; 或者在所述编排器不包含与所述故障类型相对应的故障修复策略时, 向 OSS/BSS发送所述第二故障综合信息。
结合第四方面及其上述实现方式, 在其第四种实现方式中, 所述编排器 根据所述第二故障综合信息进行故障修复或上报处理之前, 还包括: 根据所 述第二故障综合信息检测所述编排器是否包含与所述第二故障综合信息相 同的故障综合信息; 在所述编排器包含与所述第二故障综合信息相同的故障 综合信息时, 删除所述第二故障综合信息。
结合第四方面及其上述实现方式, 在其第五种实现方式中, 所述第二故 障信息还包括以下至少一项: 运行状态、 故障时间; 所述第二故障综合信息 还包括故障状态信息, 所述故障状态包含未处理, 处理中, 已修复和未修复 中的至少一种。
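According to a fifth aspect, a virtualization infrastructure manager is provided, including: an obtaining unit, configured to obtain first fault information of a network function virtualization infrastructure (NFVI) entity, where the first fault information includes a fault entity identifier and a fault type and indicates that a first NFVI entity having the fault entity identifier has failed; a generating unit, configured to generate first fault comprehensive information according to the first fault information, where the first fault comprehensive information includes the first fault information and associated fault information of the first fault information; and a processing unit, configured to perform fault repair or report processing according to the first fault comprehensive information.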
结合第五方面, 在其第一种实现方式中, 所述管理器还包括确定单元和 接收单元, 所述获取单元具体用于: 通过所述接收单元接收所述第一 NFVI 实体发送的所述第一故障信息; 或者通过所述确定单元确定所述第一 NFVI 实体发生故障, 并根据所述第一 NFVI实体发生的故障生成所述第一故障信 息。
结合第五方面及其上述实现方式, 在其第二种实现方式中, 所述第一 NFVI实体为所述 NFVI实体中的任意一个硬件 HW、 主操作系统 Host OS、 虚拟机管理器或虚拟机 VM实体, 所述生成单元具体用于: 通过所述确定单 元确定与所述第一 NFVI实体相关联的 NFVI实体发送的故障信息为所述第 一故障信息的关联故障信息; 生成包含有所述第一故障信息和所述关联故障 信息的第一故障综合信息。
结合第五方面及其上述实现方式, 在其第三种实现方式中, 所述处理单 元包括发送单元, 所述处理单元具体用于: 根据所述第一故障综合信息中的 第一故障信息的故障类型或者所述关联故障信息的故障类型,通过所述确定 单元确定所述 VIM是否包含与所述第一故障信息的故障类型或者所述关联 故障信息的故障类型相对应的故障修复策略; 在所述 VIM 包含与所述第一 故障信息的故障类型或者所述关联故障信息的故障类型相对应的故障修复 策略时, 根据所述故障修复策略修复所述第一 NFVI 实体和 /或与所述第一 NFVI实体相关联的 NFVI实体的故障; 或者在所述 VIM不包含与所述第一 故障信息的故障类型或者所述关联故障信息的故障类型相对应的故障修复 策略时, 通过所述发送单元向 VNFM发送所述第一故障综合信息或者向编 排器发送所述第一故障综合信息。
结合第五方面及其上述实现方式, 在其第四种实现方式中, 所述处理单 元具体用于: 通过所述确定单元在所述第一 NFVI实体和与所述第一 NFVI 实体相关联的 NFVI实体中确定优先级最高的 NFVI实体, 其中, HW的优 先级高于 Host OS的优先级, Host OS的优先级高于虚拟机管理器的优先级, 虚拟机管理器的优先级高于 VM的优先级; 根据所述优先级最高的 NFVI实 体的故障类型, 通过所述确定单元确定所述 VIM是否包含相对应的故障修 复策略;在所述 VIM包含与所述优先级最高的 NFVI实体的故障类型相对应 的故障修复策略时, 根据所述故障修复策略修复所述优先级最高的 NFVI实 体的故障。
结合第五方面及其上述实现方式, 在其第五种实现方式中, 所述发送单 元具体用于: 在所述故障修复成功时, 向所述编排器发送成功指示消息; 在 所述故障修复失败时, 向所述 VNFM发送所述第一故障综合信息或者向所 述编排器发送所述第一故障综合信息。
结合第五方面及其上述实现方式, 在其第六种实现方式中, 所述接收单 元还用于: 接收所述 VNFM发送的用于指示所述 VNFM无法处理所述第一 故障综合信息的指示消息; 所述发送单元还用于: 向编排器发送所述第一故 障综合信息。
结合第五方面及其上述实现方式, 在其第七种实现方式中, 所述处理单 元还用于: 向 VNFM请求与所述第一 NFVI实体相关联的 VNF实体的故障 信息; 将所述与所述第一 NFVI实体相关联的 VNF实体的故障信息加入所 述第一故障综合信息。
结合第五方面及其上述实现方式, 在其第八种实现方式中, 所述接收单 元还用于:接收所述 VNFM发送的请求信息,所述请求信息用于向所述 VIM 请求与发生故障的 VNF实体相关联的 NFVI实体的故障信息; 所述发送单 元还用于向所述 VNFM发送所述与发生故障的 VNF实体相关联的 NFVI实 体的故障信息。
结合第五方面及其上述实现方式, 在其第九种实现方式中, 所述管理器 还包括检测单元和删除单元, 所述检测单元具体用于: 根据所述第一故障综 合信息检测所述 VIM是否包含与所述第一故障综合信息相同的故障综合信 息; 所述删除单元具体用于在所述 VIM 包含与所述第一故障综合信息相同 的故障综合信息时, 删除所述第一故障综合信息。
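According to a sixth aspect, a virtual network function manager is provided, including: an obtaining unit, configured to obtain second fault information of a virtual network function (VNF) entity, where the second fault information includes a fault entity identifier and a fault type and indicates that a first VNF entity having the fault entity identifier has failed; a generating unit, configured to generate second fault comprehensive information according to the second fault information; and a processing unit, configured to perform fault repair or report processing according to the second fault comprehensive information.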
结合第六方面, 在其第一种实现方式中, 所述管理器还包括确定单元和 接收单元, 所述获取单元具体用于: 通过所述接收单元接收所述第一 VNF 实体发送的所述第二故障信息; 或者通过所述确定单元确定所述第一 VNF 实体发生故障, 并根据所述第一 VNF实体发生的故障通过所述生成单元生 成所述第二故障信息。
结合第六方面及其上述实现方式, 在其第二种实现方式中, 所述生成单 元具体用于: 通过所述确定单元确定与所述第一 VNF实体相关联的 VNF实 体发送的故障信息为所述第二故障信息的关联故障信息; 生成包含有所述第 二故障信息和所述关联故障信息的第二故障综合信息。
结合第六方面及其上述实现方式, 在其第三种实现方式中, 所述处理单 元包括发送单元, 所述处理单元具体用于: 根据所述第二故障综合信息中的 第二故障信息的故障类型或者所述关联故障信息的故障类型,通过所述确定 单元确定所述 VNFM是否包含与所述第二故障信息的故障类型或者所述关 联故障信息的故障类型相对应的故障修复策略; 在所述 VNFM 包含与所述 第二故障信息的故障类型或者所述关联故障信息的故障类型相对应的故障 修复策略时, 根据所述故障修复策略修复所述第一 VNF实体和 /或与所述第 一 VNF实体相关联的 VNF实体的故障; 或者在所述 VNFM不包含与所述 第二故障信息的故障类型或者所述关联故障信息的故障类型相对应的故障 修复策略时, 通过所述发送单元向编排器发送所述第二故障综合信息。
结合第六方面及其上述实现方式, 在其第四种实现方式中, 所述发送单 元具体用于: 在所述故障修复成功时, 向所述编排器发送成功指示消息; 在 所述故障修复失败时, 向所述编排器发送所述第二故障综合信息。
结合第六方面及其上述实现方式, 在其第五种实现方式中, 所述处理单 元还用于: 向虚拟化基础设施管理器 VIM请求与所述第一 VNF实体相关联 的 NFVI实体的故障信息, 其中所述 NFVI实体为所述 NFVI中的任意一个 硬件 HW、 主操作系统 Host OS、 虚拟机管理器或虚拟机 VM实体; 将所述 与所述第一 VNF实体相关联的 NFVI实体的故障信息加入所述第二故障综 合信息。
结合第六方面及其上述实现方式, 在其第六种实现方式中, 所述处理单 元还用于: 接收 VIM发送的第一故障综合信息, 所述第一故障综合信息包 含所述第一故障信息和所述第一故障信息的关联故障信息, 所述第一故障信 息用于指示第一 NFVI实体发生故障;确定所述 VNFM是否包含与所述第一 故障综合信息中的第一故障信息的故障类型或者所述关联故障信息的故障 类型相对应的故障修复策略; 在所述 VNFM 包含与所述第一故障信息的故 障类型或者所述关联故障信息的故障类型相对应的故障修复策略时,根据所 述故障修复策略修复所述第一 NFVI实体和 /或与所述第一 NFVI实体相关联 的 NFVI实体的故障;或者在所述 VNFM不包含与所述第一故障信息的故障 类型或者所述关联故障信息的故障类型相对应的故障修复策略时, 向编排器 发送所述第一故障综合信息, 或者向所述 VIM发送用于指示所述 VNFM无 法处理所述第一故障综合信息的指示消息, 以便于所述 VIM向所述编排器 发送所述第一故障综合信息。
结合第六方面及其上述实现方式, 在其第七种实现方式中, 所述处理单 元还具体用于: 根据所述第一故障综合信息确定与所述第一 NFVI 实体和 / 或与所述第一 NFVI实体相关联的 NFVI实体相关联的所述第一 VNF实体的 故障信息; 将所述第一 VNF实体的故障信息加入所述第一故障综合信息, 以便于所述所述 VNFM对所述第一故障综合信息进行修复或上报处理。
结合第六方面及其上述实现方式, 在其第八种实现方式中, 所述管理器 还包括检测单元和删除单元, 所述检测单元具体用于: 根据所述第二故障综 合信息检测所述 VNFM是否包含与所述第二故障综合信息相同的故障综合 信息; 所述删除单元具体用于在所述 VNFM 包含与所述第二故障综合信息 相同的故障综合信息时, 删除所述第二故障综合信息。 结合第六方面及其上述实现方式, 在其第九种实现方式中, 所述接收单 元还用于:接收所述 VIM发送的请求信息,所述请求信息用于向所述 VNFM 请求与发生故障的 NFVI实体相关联的 VNF实体的故障信息; 所述发送单 元还用于: 向所述 VIM发送所述与发生故障的 NFVI实体相关联的 VNF实 体的故障信息。
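According to a seventh aspect, an Orchestrator is provided, including: a receiving unit, configured to receive first fault comprehensive information sent by a virtualization infrastructure manager (VIM), where the first fault comprehensive information includes first fault information, the first fault information includes a fault entity identifier and a fault type, and the first fault information indicates that a first network function virtualization infrastructure (NFVI) entity having the fault entity identifier has failed; and a processing unit, configured to perform fault repair or report processing according to the first fault comprehensive information.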
结合第七方面,在其第一种实现方式中,所述第一故障综合信息还包括: 与所述第一 NFVI 实体相关联的 NFVI 实体的故障信息; 和 /或与所述第一 NFVI实体相关联的虚拟网络功能 VNF实体的故障信息。
结合第七方面及其上述实现方式, 在其第二种实现方式中, 所述处理单 元具体用于: 根据所述第一故障综合信息中的故障类型, 确定所述编排器是 否包含与所述故障类型相对应的故障修复策略; 在所述编排器包含与所述故 障类型相对应的故障修复策略时, 根据所述故障修复策略修复所述第一 NFVI实体和 /或与所述第一 NFVI实体相关联的 NFVI实体的故障; 或者在 所述编排器不包含与所述故障类型相对应的故障修复策略时, 向运营和业务 支撑系统 OSS/BSS发送所述第一故障综合信息。
结合第七方面及其上述实现方式, 在其第三种实现方式中, 所述处理单 元具体用于: 根据所述第一故障综合信息中的故障类型, 确定所述编排器是 否包含与所述故障类型相对应的故障修复策略; 在所述编排器包含与所述故 障类型相对应的故障修复策略时, 根据所述故障修复策略修复所述第一 NFVI实体和与所述第一 NFVI实体相关联的 NFVI实体的故障和与所述第一 NFVI实体相关联的 VNF实体的故障;或者在所述编排器不包含与所述故障 类型相对应的故障修复策略时, 向 OSS/BSS发送所述第一故障综合信息。
结合第七方面及其上述实现方式, 在其第四种实现方式中, 所述编排器 还包括检测单元和删除单元, 所述检测单元用于: 根据所述第一故障综合信 息检测所述编排器是否包含与所述第一故障综合信息相同的故障综合信息; 所述删除单元用于在所述编排器包含与所述第一故障综合信息相同的故障 综合信息时, 删除所述第一故障综合信息。
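According to an eighth aspect, an Orchestrator is provided, including: a receiving unit, configured to receive second fault comprehensive information sent by a virtual network function manager (VNFM), where the second fault comprehensive information includes second fault information, the second fault information includes a fault entity identifier and a fault type, and the second fault information indicates that a first virtual network function (VNF) entity having the fault entity identifier has failed; and a processing unit, configured to perform fault repair or report processing according to the second fault comprehensive information.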
结合第八方面,在其第一种实现方式中,所述第二故障综合信息还包括: 与所述第一 VNF实体相关联的 VNF实体的故障信息;和 /或与所述第一 VNF 实体相关联的虚拟化基础设施管理 NFVI实体的故障信息。
结合第八方面及其上述实现方式, 在其第二种实现方式中, 所述处理单 元具体用于: 根据所述第二故障综合信息中的故障类型, 确定所述编排器是 否包含与所述故障类型相对应的故障修复策略; 在所述编排器包含与所述故 障类型相对应的故障修复策略时,根据所述故障修复策略修复所述第一 VNF 实体和 /或与所述第一 VNF实体相关联的 VNF实体的故障;或者在所述编排 器不包含与所述故障类型相对应的故障修复策略时, 向运营和业务支撑系统 OSS/BSS发送所述第二故障综合信息。
结合第八方面及其上述实现方式, 在其第三种实现方式中, 所述处理单 元具体用于: 根据所述第二故障综合信息中的故障类型, 确定所述编排器是 否包含与所述故障类型相对应的故障修复策略; 在所述编排器包含与所述故 障类型相对应的故障修复策略时,根据所述故障修复策略修复所述第一 VNF 实体和与所述第一 VNF实体相关联的 VNF实体的故障和与所述第一 VNF 实体相关联的 NFVI实体的故障; 或者在所述编排器不包含与所述故障类型 相对应的故障修复策略时, 向 OSS/BSS发送所述第二故障综合信息。
结合第八方面及其上述实现方式, 在其第四种实现方式中, 所述编排器 还包括检测单元和删除单元, 所述检测单元用于: 根据所述第二故障综合信 息检测所述编排器是否包含与所述第二故障综合信息相同的故障综合信息; 所述删除单元用于在所述编排器包含与所述第二故障综合信息相同的故障 综合信息时, 删除所述第二故障综合信息。
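Embodiments of the present invention provide a fault management method in which the VIM and the VNFM obtain fault information of hardware and/or software entities and comprehensively process fault information that has an association relationship, so that fault reporting and handling can be implemented in an NFV environment.
Brief Description of Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the following briefly introduces the accompanying drawings required by the embodiments. Apparently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative efforts.
FIG. 1 is a system architecture diagram of network function virtualization (NFV) according to the present invention.
FIG. 2 is a flowchart of a fault management method according to an embodiment of the present invention.
FIG. 3 is a flowchart of a fault management method according to an embodiment of the present invention.
FIG. 4 is a flowchart of a fault management method according to an embodiment of the present invention.
FIG. 5 is a flowchart of a fault management method according to an embodiment of the present invention.
FIG. 6a is an interaction diagram of a fault management method according to an embodiment of the present invention.
FIG. 6b is a schematic diagram of association relationships between entities according to an embodiment of the present invention.
FIG. 7 is an interaction diagram of a fault management method according to another embodiment of the present invention.
FIG. 8 is an interaction diagram of a fault management method according to another embodiment of the present invention.
FIG. 9 is an interaction diagram of a fault management method according to another embodiment of the present invention.
FIG. 10 is an interaction diagram of a fault management method according to another embodiment of the present invention.
FIG. 11 is a schematic block diagram of a virtualization infrastructure manager (VIM) entity according to an embodiment of the present invention.
FIG. 12 is a schematic block diagram of a virtual network function manager (VNFM) entity according to an embodiment of the present invention.
FIG. 13 is a schematic block diagram of an Orchestrator entity according to an embodiment of the present invention.
FIG. 14 is a schematic block diagram of a VIM entity according to another embodiment of the present invention.
FIG. 15 is a schematic block diagram of a VNFM entity according to another embodiment of the present invention.
FIG. 16 is a schematic block diagram of an Orchestrator entity according to another embodiment of the present invention.
Detailed Description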
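The following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings. Apparently, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative efforts shall fall within the protection scope of the present invention.
FIG. 1 is a system architecture diagram of network function virtualization (NFV) according to the present invention.
In the NFV end-to-end (E2E) architecture, the network function virtualization infrastructure (NFVI) includes the underlying hardware (HW) resources, which can be divided into computing hardware, storage hardware, network hardware, and so on. Above the hardware layer is the virtualization layer, which includes the host operating system (Host OS) and the hypervisor (virtual machine manager). Multiple virtual machines (VMs) run on top of the virtualization layer. The HW and the hypervisor are connected to the operation and business support system (OSS/BSS) through an element management system (EMS). Above the NFVI, multiple virtual network function (VNF) instances are connected to the OSS/BSS through a vEMS.
The NFVI is connected to the virtualization infrastructure manager (VIM) through the Nf-Vi interface, the VNF is connected to the VNF manager (VNFM) through the Ve-Vnfm interface, and the VIM and the VNFM are connected through the Vi-Vnfm interface. The NFVI is connected to the Orchestrator through Or-Vi, the VNFM is connected to the Orchestrator through Or-Vnfm, and the Orchestrator is connected to the OSS/BSS through the Os-Ma interface.
The OSS/BSS initiates service requests to the Orchestrator. The Orchestrator orchestrates and manages resources according to the OSS/BSS service requests to implement NFV services, and detects the resources and running-state information of the VNF and the NFVI in real time. The VNFM is responsible for VNF lifecycle management, such as startup, lifetime, and detecting and collecting VNF running-state information. The VIM is responsible for managing and allocating NFVI resources and for detecting and collecting NFVI running-state information.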
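FIG. 2 is a flowchart of a fault management method according to an embodiment of the present invention. The method in FIG. 2 is performed by the VIM.
201. The virtualization infrastructure manager (VIM) obtains first fault information of a network function virtualization infrastructure (NFVI) entity, where the first fault information includes a fault entity identifier and a fault type and indicates that a first NFVI entity having the fault entity identifier has failed.
202. The VIM generates first fault comprehensive information according to the first fault information, where the first fault comprehensive information includes the first fault information and associated fault information of the first fault information.
203. The VIM performs fault repair or report processing according to the first fault comprehensive information.
In the fault management method provided by this embodiment of the present invention, the VIM obtains fault information of hardware and/or software entities and comprehensively processes fault information that has an association relationship, so that fault reporting and handling can be implemented in an NFV environment.
Optionally, in an embodiment, step 201 includes: receiving the first fault information sent by the first NFVI entity; or determining that the first NFVI entity has failed, and generating the first fault information according to the fault of the first NFVI entity. That is, the VIM may passively receive fault information from the failed entity, or may actively generate fault information after detecting the fault.
Optionally, in an embodiment, the first NFVI entity is any hardware (HW), host operating system (Host OS), hypervisor, or virtual machine (VM) entity among the NFVI entities, and step 202 includes: determining that fault information sent by an NFVI entity associated with the first NFVI entity is the associated fault information of the first fault information; and generating the first fault comprehensive information that includes the first fault information and the associated fault information. Because some HW, Host OS, hypervisor, and VM entities are associated with one another, when the first NFVI entity fails, other NFVI entities associated with it may also fail. The VIM can collect all related fault information so that it can be processed together.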
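The correlation performed in step 202 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation described by the patent: the `FaultInfo` class, the `ASSOCIATIONS` map, the entity names, the fault type strings, and the dictionary layout of the result are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaultInfo:
    entity_id: str   # fault entity identifier
    fault_type: str  # fault type (the values used below are made up)


# Hypothetical association map: which NFVI entities are associated with which.
ASSOCIATIONS = {
    "HW1": {"HostOS1", "Hypervisor1", "VM1"},
    "HostOS1": {"HW1", "Hypervisor1", "VM1"},
    "Hypervisor1": {"HW1", "HostOS1", "VM1"},
    "VM1": {"HW1", "HostOS1", "Hypervisor1"},
    "VM2": {"HW2"},
}


def build_comprehensive_info(first, received, associations=ASSOCIATIONS):
    """Bundle the first fault with faults reported by associated entities."""
    related = associations.get(first.entity_id, set())
    associated = [f for f in received if f.entity_id in related]
    return {"first_fault": first, "associated_faults": associated}


first = FaultInfo("HW1", "power_failure")
received = [FaultInfo("VM1", "stopped"), FaultInfo("VM2", "overload")]
info = build_comprehensive_info(first, received)
```

Here only the fault reported by `VM1` is bundled with the `HW1` fault, because `VM2` is not associated with `HW1` in the assumed map.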
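Optionally, in an embodiment, step 203 includes: determining, according to the fault type of the first fault information or the fault type of the associated fault information in the first fault comprehensive information, whether the VIM contains a fault repair policy corresponding to that fault type; when the VIM contains such a fault repair policy, repairing, according to the fault repair policy, the fault of the first NFVI entity and/or of the NFVI entity associated with the first NFVI entity; or when the VIM does not contain such a fault repair policy, sending the first fault comprehensive information to the VNFM or to the Orchestrator.
Specifically, after generating the fault comprehensive information, the VIM first needs to determine whether it can process the information locally. If it can, the VIM repairs the fault of one of the NFVI entities involved in the fault comprehensive information. If it cannot, or if the repair fails, the VIM performs report processing.
Optionally, in an embodiment, the determining of whether the VIM contains a corresponding fault repair policy includes: determining the NFVI entity with the highest priority among the first NFVI entity and the NFVI entities associated with it, where the priority of HW is higher than that of the Host OS, the priority of the Host OS is higher than that of the hypervisor, and the priority of the hypervisor is higher than that of a VM; determining, according to the fault type of the highest-priority NFVI entity, whether the VIM contains a corresponding fault repair policy; and when the VIM contains a fault repair policy corresponding to the fault type of the highest-priority NFVI entity, repairing the fault of the highest-priority NFVI entity according to the fault repair policy.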
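The priority rule above (HW over Host OS over hypervisor over VM), combined with the repair-or-report decision of step 203, might look like the following sketch. The dictionary keys, the fault type strings, and the policy table are assumptions introduced for illustration only.

```python
# Lower number = higher priority, following the order stated in the text.
PRIORITY = {"HW": 0, "HostOS": 1, "Hypervisor": 2, "VM": 3}


def select_highest_priority(faults):
    """Return the fault whose entity kind ranks highest (HW first, VM last)."""
    return min(faults, key=lambda f: PRIORITY[f["kind"]])


def decide(faults, repair_policies):
    """Return ("repair", entity_id, policy) or ("report", entity_id)."""
    top = select_highest_priority(faults)
    policy = repair_policies.get(top["fault_type"])
    if policy is None:
        # No local policy: the comprehensive information is reported upward.
        return ("report", top["entity_id"])
    return ("repair", top["entity_id"], policy)


faults = [
    {"entity_id": "VM1", "kind": "VM", "fault_type": "stopped"},
    {"entity_id": "HostOS1", "kind": "HostOS", "fault_type": "kernel_panic"},
]
policies = {"kernel_panic": "restart_host_os"}  # hypothetical policy table
```

With these inputs, `decide(faults, policies)` selects `HostOS1` because a Host OS outranks a VM, and returns a repair decision; with an empty policy table it returns a report decision for the same entity.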
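Optionally, in an embodiment, after the repairing of the fault of the first NFVI entity and/or of the NFVI entity associated with the first NFVI entity according to the fault repair policy, the method may further include: when the fault repair succeeds, sending a success indication message to the Orchestrator; and when the fault repair fails, sending the first fault comprehensive information to the VNFM or to the Orchestrator. The success indication message may be fault information whose running state is set to "normal", or may be a message in another form that indicates a successful repair. The present invention is not limited in this respect.
Optionally, in an embodiment, after the sending of the first fault comprehensive information to the VNFM, the method further includes: receiving an indication message sent by the VNFM indicating that the VNFM cannot process the first fault comprehensive information; and sending the first fault comprehensive information to the Orchestrator. When the VIM cannot process the first fault comprehensive information and reports it to the VNFM, and the VNFM cannot process it either, the VNFM continues to report the first fault comprehensive information to the Orchestrator.
Optionally, in an embodiment, before the sending of the first fault comprehensive information to the Orchestrator, the method further includes: requesting, from the VNFM, fault information of a VNF entity associated with the first NFVI entity; and adding the fault information of the VNF entity associated with the first NFVI entity to the first fault comprehensive information. When the VIM cannot process the first fault comprehensive information, or the repair fails, the VIM may send a request to the VNFM to obtain the fault information of the VNF entity associated with the failed NFVI entity and report everything together, so that the upper-layer management entity can process it comprehensively.
Optionally, in an embodiment, the method further includes: receiving request information sent by the VNFM, where the request information is used to request, from the VIM, fault information of an NFVI entity associated with a failed VNF entity; and sending, to the VNFM, the fault information of the NFVI entity associated with the failed VNF entity. Specifically, when the VNFM cannot process the fault comprehensive information of a VNF entity, it can likewise request the related NFVI fault information from the VIM and report everything together, so that the upper-layer management entity can process it comprehensively.
Optionally, in an embodiment, after the VIM generates the first fault comprehensive information according to the first fault information, the method further includes: detecting, according to the first fault comprehensive information, whether the VIM contains fault comprehensive information identical to the first fault comprehensive information; and when the VIM contains identical fault comprehensive information, deleting the first fault comprehensive information.
Specifically, when multiple associated NFVI entities suffer correlated faults, the VIM obtains multiple identical pieces of fault comprehensive information. Here, "identical" means that the fault information content in the fault comprehensive information is the same. In this case, the VIM can perform duplicate-alarm detection: fault comprehensive information that is already being processed continues to be processed, and identical fault comprehensive information that has not been processed is deleted.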
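Duplicate-alarm detection as described above can be sketched by comparing records on their fault information content. The `AlarmStore` class, the `content_key` helper, and the record layout are hypothetical names chosen for this example, and treating "same content" as "same set of (entity identifier, fault type) pairs" is an assumption.

```python
def content_key(record):
    """Two records count as identical when their fault information content matches."""
    return frozenset((f["entity_id"], f["fault_type"]) for f in record["faults"])


class AlarmStore:
    def __init__(self):
        self._records = {}  # content key -> record already being handled

    def admit(self, record):
        """Keep the record if it is new; drop it (return False) if it is a duplicate."""
        key = content_key(record)
        if key in self._records:
            return False  # duplicate: the earlier record keeps being processed
        self._records[key] = record
        return True

    def __len__(self):
        return len(self._records)


store = AlarmStore()
a = {"faults": [{"entity_id": "HW1", "fault_type": "power_failure"},
                {"entity_id": "VM1", "fault_type": "stopped"}]}
b = {"faults": list(reversed(a["faults"]))}  # same content, different order
```

Calling `store.admit(a)` and then `store.admit(b)` keeps only the first record: the second is recognised as having the same content even though its fault entries arrive in a different order.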
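Optionally, in an embodiment, the first fault information is further reported to the operation and business support system (OSS/BSS), so that the OSS/BSS can monitor and present the first fault information.
Optionally, in an embodiment, the first fault information further includes at least one of the following: a running state and a fault time; and the first fault comprehensive information further includes fault state information, where the fault state is at least one of unprocessed, processing, repaired, and unrepaired.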
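One possible in-memory representation of the fields listed above is shown below. The class and attribute names are illustrative assumptions, and the timestamp format is not specified by the source.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class FaultState(Enum):
    UNPROCESSED = "unprocessed"
    PROCESSING = "processing"
    REPAIRED = "repaired"
    UNREPAIRED = "unrepaired"


@dataclass
class FaultInfo:
    entity_id: str                       # fault entity identifier (required)
    fault_type: str                      # fault type (required)
    running_state: Optional[str] = None  # optional running state
    fault_time: Optional[str] = None     # optional fault time; format is assumed


@dataclass
class ComprehensiveFaultInfo:
    first_fault: FaultInfo
    associated_faults: List[FaultInfo] = field(default_factory=list)
    state: FaultState = FaultState.UNPROCESSED


record = ComprehensiveFaultInfo(
    first_fault=FaultInfo("HW1", "power_failure", fault_time="2013-09-30T08:00:00Z"),
)
record.state = FaultState.PROCESSING  # the manager has started handling it
```

Keeping the fault state on the comprehensive record rather than on each fault entry mirrors the text, where the state describes how far handling of the whole bundle has progressed.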
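In the fault management method provided by this embodiment of the present invention, the VIM obtains fault information of hardware and/or software entities and comprehensively processes fault information that has an association relationship, so that fault reporting and handling can be implemented in an NFV environment. In addition, because associated fault information is processed together and identical fault comprehensive information is deleted through duplicate-alarm detection, the efficiency and accuracy of fault handling are improved.
FIG. 3 is a flowchart of a fault management method according to an embodiment of the present invention. The method in FIG. 3 is performed by the VNFM.
301. The virtual network function manager (VNFM) obtains second fault information of a virtual network function (VNF) entity, where the second fault information includes a fault entity identifier and a fault type and indicates that a first VNF entity having the fault entity identifier has failed.
302. The VNFM generates second fault comprehensive information according to the second fault information.
303. The VNFM performs fault repair or report processing according to the second fault comprehensive information.
In the fault management method provided by this embodiment of the present invention, the VNFM obtains fault information of hardware and/or software entities and comprehensively processes fault information that has an association relationship, so that fault reporting and handling can be implemented in an NFV environment.
Optionally, in an embodiment, step 301 includes: receiving the second fault information sent by the first VNF entity; or determining that the first VNF entity has failed, and generating the second fault information according to the fault of the first VNF entity. That is, the VNFM may passively receive fault information from the failed entity, or may actively generate fault information after detecting the fault.
Optionally, in an embodiment, step 302 includes: determining that fault information sent by a VNF entity associated with the first VNF entity is the associated fault information of the second fault information; and generating the second fault comprehensive information that includes the second fault information and the associated fault information. Because VNF entities may be associated with one another, when the first VNF entity fails, other VNF entities associated with it may also fail. The VNFM can collect all related fault information so that it can be processed together.
Optionally, in an embodiment, step 303 includes: determining, according to the fault type of the second fault information or the fault type of the associated fault information in the second fault comprehensive information, whether the VNFM contains a fault repair policy corresponding to that fault type; when the VNFM contains such a fault repair policy, repairing, according to the fault repair policy, the fault of the first VNF entity and/or of the VNF entity associated with the first VNF entity; or when the VNFM does not contain such a fault repair policy, sending the second fault comprehensive information to the Orchestrator.
Specifically, after generating the fault comprehensive information, the VNFM first needs to determine whether it can process the information locally. If it can, the VNFM repairs the fault of one of the VNF entities involved in the fault comprehensive information. If it cannot, or if the repair fails, the VNFM performs report processing.
Optionally, in an embodiment, after the repairing of the fault of the first VNF entity and/or of the VNF entity associated with the first VNF entity according to the fault repair policy, the method further includes: when the fault repair succeeds, sending a success indication message to the Orchestrator; and when the fault repair fails, sending the second fault comprehensive information to the Orchestrator. The success indication message may be fault information whose running state is set to "normal", or may be a message in another form that indicates a successful repair. The present invention is not limited in this respect.
Optionally, in an embodiment, before the sending of the second fault comprehensive information to the Orchestrator, the method further includes: requesting, from the virtualization infrastructure manager (VIM), fault information of an NFVI entity associated with the first VNF entity, where the NFVI entity is any hardware (HW), host operating system (Host OS), hypervisor, or virtual machine (VM) entity in the NFVI; and adding the fault information of the NFVI entity associated with the first VNF entity to the second fault comprehensive information. When the VNFM cannot process the second fault comprehensive information, or the repair fails, the VNFM may send a request to the VIM to obtain the fault information of the NFVI entity associated with the failed VNF entity and report everything together, so that the upper-layer management entity can process it comprehensively.
Optionally, in an embodiment, the method further includes: receiving first fault comprehensive information sent by the VIM, where the first fault comprehensive information includes first fault information and associated fault information of the first fault information, and the first fault information indicates that a first NFVI entity has failed; determining whether the VNFM contains a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information in the first fault comprehensive information; when the VNFM contains such a fault repair policy, repairing, according to the fault repair policy, the fault of the first NFVI entity and/or of the NFVI entity associated with the first NFVI entity; or when the VNFM does not contain such a fault repair policy, sending the first fault comprehensive information to the Orchestrator, or sending to the VIM an indication message indicating that the VNFM cannot process the first fault comprehensive information, so that the VIM sends the first fault comprehensive information to the Orchestrator. When the VIM cannot process the first fault comprehensive information of an NFVI entity, or the repair fails, the VIM reports the first fault comprehensive information to the VNFM; if the VNFM cannot process it either, or its repair fails, the VNFM notifies the VIM so that the VIM reports the first fault comprehensive information to the Orchestrator.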
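The escalation path described above can be illustrated with a deliberately simplified sketch. It assumes each manager holds a table of fault repair policies keyed by fault type, and it collapses the "VNFM notifies the VIM, which then reports to the Orchestrator" exchange into a simple ordered walk. The final hand-off to the OSS/BSS follows the Orchestrator behaviour described elsewhere in this document. All names and policy entries are hypothetical.

```python
def escalate(fault_types, managers):
    """Walk the managers in order; return the first one holding a matching policy.

    `managers` is an ordered list of (name, policy_table) pairs. If no manager
    holds a policy for any of the fault types, the information ends up at the
    OSS/BSS.
    """
    for name, policies in managers:
        if any(ft in policies for ft in fault_types):
            return name
    return "OSS/BSS"


# Hypothetical policy tables for the three management entities.
chain = [
    ("VIM", {"vm_stopped": "restart_vm"}),
    ("VNFM", {"vnf_overload": "scale_out"}),
    ("Orchestrator", {"hw_failure": "migrate_service"}),
]
```

For example, `escalate(["hw_failure"], chain)` passes the VIM and the VNFM and lands on the Orchestrator, while a fault type that nobody recognises falls through to the OSS/BSS.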
可选地, 作为一个实施例, 接收 VIM发送的第一故障综合信息之后, 还包括: 根据第一故障综合信息确定与第一 NFVI实体和 /或与第一 NFVI实 体相关联的 NFVI实体相关联的第一 VNF实体的故障信息; 将第一 VNF实 体的故障信息加入第一故障综合信息, 以便于 VNFM对第一故障综合信息 进行修复或上报处理。
可选地, 作为一个实施例, VNFM根据第二故障综合信息进行修复或上 报处理之后, 还包括: 根据第二故障综合信息检测 VNFM是否包含与第二 故障综合信息相同的故障综合信息; 在 VNFM 包含与第二故障综合信息相 同的故障综合信息时, 删除第二故障综合信息。
具体地, 由于多个具有关联关系的 VNF实体发生关联性故障时, VNFM 会获取到多个相同的故障综合信息,这里的相同指的是故障综合信息中的故 障信息内容相同, 此时, VNFM可以进行重复报警检测。 对于正在进行处理 的故障综合信息继续处理, 对于未处理的相同的故障综合信息做删除处理。
可选地, 作为一个实施例, 方法还包括: 接收 VIM发送的请求信息, 请求信息用于向 VNFM请求与发生故障的 NFVI实体相关联的 VNF实体的 故障信息; 向 VIM发送与发生故障的 NFVI实体相关联的 VNF实体的故障 信息。
可选地, 作为一个实施例, 第二故障信息还被用于向运营和业务支撑系 统 OSS/BSS上报, 以便于 OSS/BSS监控并呈现第二故障信息。
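Optionally, as an embodiment, the second fault information further includes at least one of the following: a running state and a fault time. The second comprehensive fault information further includes fault state information, where the fault state includes at least one of: unprocessed, processing, repaired, and unrepaired.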
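The fault management method provided in this embodiment of the present invention obtains fault information of hardware and/or software entities through the VNFM and processes associated fault information comprehensively, so that fault reporting and handling can be implemented in an NFV environment. In addition, because associated fault information is processed comprehensively and identical comprehensive fault information is deleted through duplicate-alarm detection, the efficiency and accuracy of fault handling are improved.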
图 4是本发明一个实施例的故障管理的方法的流程图。 图 4 的方法由 Orchestrator执行。
401 ,编排器接收虚拟化基础设施管理器 VIM发送的第一故障综合信息, 其中, 第一故障综合信息包括第一故障信息, 第一故障信息包含故障实体标 识和故障类型, 第一故障信息用于指示具有故障实体标识的第一网络功能虚 拟化基础设施 NFVI实体发生故障。
402, 编排器根据第一故障综合信息进行故障修复或上报处理。
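The fault management method provided in this embodiment of the present invention obtains fault information of hardware and/or software entities through the Orchestrator and processes associated fault information comprehensively, so that fault reporting and handling can be implemented in an NFV environment.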
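Optionally, as an embodiment, the first comprehensive fault information further includes: fault information of an NFVI entity associated with the first NFVI entity; and/or fault information of a virtualized network function (VNF) entity associated with the first NFVI entity. In other words, the comprehensive fault information that the Orchestrator obtains from the VIM may contain fault information of NFVI entities only, or fault information of NFVI entities together with that of the related VNF entities.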
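Optionally, as an embodiment, step 402 includes: determining, according to the fault type in the first comprehensive fault information, whether the orchestrator contains a fault repair policy corresponding to the fault type; when the orchestrator contains such a policy, repairing the fault of the first NFVI entity and/or of the NFVI entity associated with the first NFVI entity according to the policy; or, when the orchestrator does not contain such a policy, sending the first comprehensive fault information to the operations and business support system (OSS/BSS).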
可选地, 作为一个实施例, 步骤 402包括: 根据第一故障综合信息中的 故障类型, 确定编排器是否包含与故障类型相对应的故障修复策略; 在编排 器包含与故障类型相对应的故障修复策略时, 根据故障修复策略修复第一 NFVI实体和与第一 NFVI实体相关联的 NFVI实体的故障和与第一 NFVI实 体相关联的 VNF实体的故障; 或者在编排器不包含与故障类型相对应的故 障修复策略时, 向 OSS/BSS发送第一故障综合信息。
可选地, 作为一个实施例, 步骤 402之前, 还包括: 根据第一故障综合 信息检测编排器是否包含与第一故障综合信息相同的故障综合信息; 在编排 器包含与第一故障综合信息相同的故障综合信息时, 删除第一故障综合信 息。 具体地, 由于多个具有关联关系的 NFVI实体或 VNF实体发生关联性 故障时, Orchestrator会获取到多个相同的故障综合信息,这里的相同指的是 故障综合信息中的故障信息内容相同,此时, Orchestrator可以进行重复报警 检测。 对于正在进行处理的故障综合信息继续处理, 对于未处理的相同的故 障综合信息做删除处理。
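Optionally, as an embodiment, the first fault information further includes at least one of the following: a running state and a fault time. The first comprehensive fault information further includes fault state information, where the fault state includes at least one of: unprocessed, processing, repaired, and unrepaired.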
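In the fault management method provided in this embodiment of the present invention, the Orchestrator receives the comprehensive fault information reported by the VIM and processes associated fault information comprehensively, so that fault reporting and handling can be implemented in an NFV environment. In addition, because associated fault information is processed comprehensively and identical comprehensive fault information is deleted through duplicate-alarm detection, the efficiency and accuracy of fault handling are improved.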
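FIG. 5 is a flowchart of a fault management method according to an embodiment of the present invention. The method of FIG. 5 is performed by the Orchestrator.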
501 ,编排器接收虚拟网络功能管理器 VNFM发送的第二故障综合信息, 其中, 第二故障综合信息包括第二故障信息, 第二故障信息包含故障实体标 识和故障类型, 第二故障信息用于指示具有故障实体标识的第一虚拟网络功 能 VNF实体发生故障。
502, 编排器根据第二故障综合信息进行故障修复或上报处理。
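The fault management method provided in this embodiment of the present invention obtains fault information of hardware and/or software entities through the Orchestrator and processes associated fault information comprehensively, so that fault reporting and handling can be implemented in an NFV environment.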
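Optionally, as an embodiment, the second comprehensive fault information further includes: fault information of a VNF entity associated with the first VNF entity; and/or fault information of a network functions virtualization infrastructure (NFVI) entity associated with the first VNF entity. In other words, the comprehensive fault information that the Orchestrator obtains from the VNFM may contain fault information of NFVI entities, fault information of VNF entities, or fault information of NFVI entities together with that of the related VNF entities.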
可选地, 作为一个实施例, 步骤 502包括: 根据第二故障综合信息中的 故障类型, 确定编排器是否包含与故障类型相对应的故障修复策略; 在编排 器包含与故障类型相对应的故障修复策略时, 根据故障修复策略修复第一 VNF实体和 /或与第一 VNF实体相关联的 VNF实体的故障; 或者在编排器 不包含与故障类型相对应的故障修复策略时, 向运营和业务支撑系统 OSS/BSS发送第二故障综合信息。
可选地, 作为一个实施例, 步骤 502包括: 根据第二故障综合信息中的 故障类型, 确定编排器是否包含与故障类型相对应的故障修复策略; 在编排 器包含与故障类型相对应的故障修复策略时, 根据故障修复策略修复第一 VNF实体和与第一 VNF实体相关联的 VNF实体的故障和与第一 VNF实体 相关联的 NFVI实体的故障; 或者在编排器不包含与故障类型相对应的故障 修复策略时, 向 0SS/BSS发送第二故障综合信息。
可选地, 作为一个实施例, 步骤 502之前还包括: 根据第二故障综合信 息检测编排器是否包含与第二故障综合信息相同的故障综合信息; 在编排器 包含与第二故障综合信息相同的故障综合信息时, 删除第二故障综合信息。 具体地, 由于多个具有关联关系的 NFVI实体或 VNF实体发生关联性故障 时, Orchestrator会获取到多个相同的故障综合信息,这里的相同指的是故障 综合信息中的故障信息内容相同,此时, Orchestrator可以进行重复报警检测。 对于正在进行处理的故障综合信息继续处理,对于未处理的相同的故障综合 信息做删除处理。
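Optionally, as an embodiment, the second fault information further includes at least one of the following: a running state and a fault time. The second comprehensive fault information further includes fault state information, where the fault state includes at least one of: unprocessed, processing, repaired, and unrepaired.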
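In the fault management method provided in this embodiment of the present invention, the Orchestrator receives the comprehensive fault information reported by the VNFM and processes associated fault information comprehensively, so that fault reporting and handling can be implemented in an NFV environment. In addition, because associated fault information is processed comprehensively and identical comprehensive fault information is deleted through duplicate-alarm detection, the efficiency and accuracy of fault handling are improved.
FIG. 6a is an interaction diagram of a fault management method according to an embodiment of the present invention. The method shown in FIG. 6a may be performed by the NFV system shown in FIG. 1.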
601 , VIM获取故障信息。
当 VIM检测到 NFVI中的任意 HW、 Host OS、 Hypervisor和 VM发生 故障时, VIM获取发生故障的 NFVI实体的故障信息。 具体地, 获取故障信 息可以是由发生故障的 NFVI实体生成并上报给 VIM的, 也可以是 VIM根 据检测到的故障在本地生成的。
VIM检测 NFVI实体发生故障的方法可以有以下几种方法:
为了方便描述, 以下以第一 NFVI实体发生故障为例进行描述, 该第一 NFVI实体可以为 NFVI中的任意 HW、 Host OS、 Hypervisor和 VM实体。 其中, 实体可以包括硬件实体或软件实体。
方法一,
第一 NFVI实体发生故障时, 第一 NFVI实体生成故障信息, 该故障信 息至少包含用于唯一标识第一 NFVI实体的故障实体标识, 通过该标识可以 唯一确定发生故障的第一 NFVI实体的实际位置或在拓朴关系中的位置。 该 故障信息还包含有故障标识, 用于唯一标识一个故障信息。 该故障信息还包 含有故障类型, 用于表示该故障发生的原因, 例如过载、 断电、 内存泄漏、 端口错误、 无故障等。 此外, 故障信息还可以包含运行状态和故障时间, 运 行状态用于标记第一 NFVI实体当前是否能够正常运行, 故障时间可以用于 记录故障发生的时间。 作为一个例子, 故障信息的格式可以如表一所示: 故障信息
[Table 1: fault information. The table is an image in the source (imgf000025_0001, imgf000025_0002); its contents were not recovered.]
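The fields named in the description of Method 1 can be collected into a small record type. This is a sketch under stated assumptions: the class name `FaultInfo`, the attribute names, and the literal `"no_fault"` are chosen here for illustration, and the exact layout of Table 1 is not available in this text.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class FaultInfo:
    fault_id: str                          # uniquely identifies this piece of fault information
    entity_id: str                         # uniquely identifies the faulty entity (location or topology position)
    fault_type: str                        # cause, e.g. "overload", "power_off", "memory_leak", "port_error", "no_fault"
    running_state: Optional[str] = None    # optional: whether the entity can currently run normally
    fault_time: Optional[datetime] = None  # optional: when the fault occurred

    def indicates_fault(self) -> bool:
        """A record with fault type 'no_fault' reports a healthy entity."""
        return self.fault_type != "no_fault"
```

Making the running state and fault time optional mirrors the text, which lists them as fields the fault information "may" also contain.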
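After generating fault information in the above format, the first NFVI entity may send it to the VIM over the Nf-Vi interface. Optionally, it may at the same time send the fault information to the OSS/BSS through the EMS, for management, recording, and presentation.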
方法二,
VIM可以周期性地或者在需要的时候向第一 NFVI实体发送指示消息, 指示第一 NFVI实体进行故障检测, 第一 NFVI实体如果检测到故障可以向 VIM返回与上述表一相类似的故障信息, 如果第一 NFVI没有故障, 可以不 返回任何消息, 也可以返回故障类型为 "无故障", 运行状态为 "正常" 的 如表一所示的故障信息。
方法三,
第一 NFVI实体可以周期性地向 VIM发送表示第一 NFVI实体运行正常 的心跳指示消息。 VIM则周期性地接收到第一 NFVI实体的心跳, 感知到第 一 NFVI实体工作正常,当第一 NFVI实体心跳中断,则 VIM判定第一 NFVI 实体发生故障。 VIM可以生成第一 NFVI的故障信息, 具体格式与上述表一 的故障信息相类似, 此处不再赘述。
当 NFVI实体发生断电等突然性事故而无法上报故障信息时, VIM依然 能够在第一时间感知到第一 NFVI实体发生故障。
方法四,
VIM可以周期性地或者在需要的时候对 NFVI进行故障检测,之后 VIM 根据故障检测结果生成第一 NFVI的故障信息, 具体格式与上述表一的故障 信息相类似, 此处不再赘述。
综上所述, VIM检测 NFVI实体的故障可以通过以上任意一种方法进行, 当然可以通过多种方法结合进行检测, 例如, 可以将方法一和方法三结合, NFVI实体周期性向 VIM发送心跳, 在发生故障时向 VIM发送故障信息, 如果 NFVI实体发生灾难性故障无法上报故障信息,则 VIM可以通过心跳停 止感知到 NFVI实体发生故障。
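The combination of Method 1 and Method 3 described above can be sketched as follows: entities send periodic heartbeats, and the manager generates fault information locally for any entity whose heartbeat has stopped. The class name, the timeout parameter, and the `"heartbeat_lost"` fault type are assumptions made for this illustration and do not appear in the source.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat per entity and flags entities that fall silent."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_seen = {}  # entity_id -> time of the most recent heartbeat

    def heartbeat(self, entity_id: str, now: float) -> None:
        self.last_seen[entity_id] = now

    def detect(self, now: float):
        """Return locally generated fault records for entities whose heartbeat stopped."""
        return [
            {
                "entity_id": entity_id,
                "fault_type": "heartbeat_lost",  # hypothetical fault type
                "running_state": "abnormal",
                "fault_time": now,
            }
            for entity_id in sorted(self.last_seen)
            if now - self.last_seen[entity_id] > self.timeout_s
        ]
```

Timestamps are passed in rather than read from a clock so that the detection logic stays deterministic and easy to exercise.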
602, VIM生成故障综合信息
在 VIM接收到第一 NFVI 实体发送的故障信息, 或者 VIM根据第一 NFVI实体发生的故障生成故障信息后, VIM需要根据收集与第一 NFVI实 体相关联的其他 NFVI实体的故障信息, 以生成故障综合信息, 以便于进行 综合处理。
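Specifically, because HW, Host OS, Hypervisor, and VM entities are associated with one another, when the first NFVI entity fails, some of the entities associated with it may fail as well. FIG. 6b shows an example of the associations among HW, Host OS, Hypervisor, and VM entities. For example, the entities associated with HW1 include Host OS1, Hypervisor1, VM1, and VM2. That is, when HW1 fails, the virtualized entities built on it (Host OS1, Hypervisor1, VM1, and VM2) will also fail. The VIM can then collect the fault information reported by Host OS1, Hypervisor1, VM1, and VM2 and combine it with the fault information of HW1 to generate comprehensive fault information. Specifically, comprehensive fault information as shown in Table 2 may be generated: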
故障综合信息
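[Table 2: comprehensive fault information. The table is an image in the source (imgf000027_0001, imgf000027_0002); its contents were not recovered.]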
其中 HW、 Host OS , Hypervisor和 VM实体的故障信息格式与上述表一 相类似。 故障综合信息标识用于唯一标识一个故障综合信息。 应理解, 表二 所示的故障综合信息为一个具体的例子,故障综合信息具体包含哪些实体的 故障信息根据关联关系而定。其中故障综合信息刚生成时可以将故障状态置 为 "未处理"。
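Generating comprehensive fault information from the association relationship can be sketched as below. The dictionary layout, the function name `build_comprehensive`, and the `ASSOCIATIONS` table are assumptions for illustration; the topology follows the HW1 example in the text.

```python
# Illustrative association table following the HW1 example; not taken from FIG. 6b itself.
ASSOCIATIONS = {"HW1": ["Host OS1", "Hypervisor1", "VM1", "VM2"]}


def build_comprehensive(first_fault, reported, associations, comp_id):
    """Combine the first fault with fault reports from entities associated with it.

    first_fault  -- fault record (dict with 'entity_id') of the first faulty entity
    reported     -- entity_id -> fault record, for entities that reported a fault
    associations -- entity_id -> list of associated entity ids
    """
    related_ids = associations.get(first_fault["entity_id"], [])
    related = [reported[e] for e in related_ids if e in reported]
    return {
        "comp_id": comp_id,          # uniquely identifies this comprehensive record
        "state": "unprocessed",      # initial fault state on generation
        "faults": [first_fault] + related,
    }
```

Only entities that actually reported a fault are included, which matches the statement that the contents depend on the association relationship.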
603 , 报警重复检测
VIM生成故障综合信息后, 可以在 VIM本地检测已生成的故障综合信 息, 确定是否存在相同的信息。 具体地, 由于一个 NFVI实体发生故障后, 与之具有关联关系的发生故障的 NFVI实体都会上报故障信息,因此 VIM很 可能就同一个故障生成多个相同的故障综合信息。 例如, HW1 发生故障, 与 HW1具有关联关系的 Host 0S1、 Hypervisorl、 VM1和 VM2也发生故障 并且与 HW1执行相同的操作, VIM在进行关联故障信息收集后会生成多个 同样的故障综合信息, 此时可以只处理其中的一个故障综合信息, 将其他相 同的故障综合信息丟弃。 应理解, 这里的相同的故障综合信息指的是 HW、 Host OS、 Hypervisor和 VM故障信息部分相同,故障标识和故障状态可以不 同。
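Specifically, whether a piece of comprehensive fault information is kept or discarded can be decided from its fault state. For example, newly generated comprehensive fault information has the fault state "unprocessed". Duplicate-alarm detection is performed on it, and if identical comprehensive fault information whose fault state is "processing" is found, the unprocessed one is discarded. Keeping means continuing to handle the faults in the comprehensive fault information whose fault state is "processing".
604, VIM self-healing determination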
当 VIM中故障综合信息生成, VIM首先可以判断故障综合信息中的故 障类型是否为 VIM能够处理的故障类型。
具体地, VIM中具有故障修复策略,该故障修复策略包括故障实体标识、 故障类型和故障修复方法的映射关系。可以通过判断故障综合信息中的故障 类型是否存在于故障修复策略中而确定是否能够进行处理。 例如, HW1 的 故障类型为 "低性能", 相对应的故障修复方法为 "重启"。
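In addition, when the comprehensive fault information contains fault information of several associated NFVI entities, the VIM may use the priority of the NFVI entities to decide which entity's fault type the self-healing determination is based on. The priority order is: HW is higher than Host OS, which is higher than Hypervisor, which is higher than VM. For example, as shown in Table 2, when the comprehensive fault information contains fault information of HW1, Host OS1, Hypervisor1, VM1, and VM2, the VIM may handle the fault of HW1 first. That is, it determines the fault repair method, for example "restart", from the fault type in the fault information of HW1, for example "low performance".
The fault repair methods may include, but are not limited to, the following: restarting a hardware device, reloading software (Host OS, Hypervisor, and so on), migrating a VM, reloading the VNF installation software, re-instantiating a VNF, adding a VNF instance, migrating a VNF (that is, reallocating resources to the VNF), and re-instantiating the VNF Forwarding Graph.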
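The self-healing determination of step 604 can be sketched as a priority selection followed by a policy lookup. The names `PRIORITY`, `select_repair`, the `kind` field, and the policy keyed by (entity identifier, fault type) are assumptions for this illustration; the source only states that the policy maps fault entity identifier and fault type to a repair method.

```python
# Lower number means higher priority: HW > Host OS > Hypervisor > VM.
PRIORITY = {"HW": 0, "Host OS": 1, "Hypervisor": 2, "VM": 3}


def select_repair(faults, policy):
    """Pick the highest-priority faulty entity and look up its repair method.

    faults -- list of dicts with 'entity_id', 'kind', and 'fault_type'
    policy -- (entity_id, fault_type) -> repair method
    Returns (entity_id, method); method is None when the policy has no entry,
    which is the case where the fault has to be reported upward.
    """
    top = min(faults, key=lambda f: PRIORITY[f["kind"]])
    return top["entity_id"], policy.get((top["entity_id"], top["fault_type"]))
```

A missing policy entry is returned as `None` rather than raised, because "cannot self-heal" is an expected outcome that leads to the reporting branch.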
605a, the VIM is able to perform self-healing
如果 VIM判断能够处理,则根据故障修复方法对 NFVI实体进行故障修 复。如果故障修复成功,并且具有关联关系的 NFVI实体的故障都得到修复, 则通知 Orchestrator修复成功, 并终结该故障修复处理过程。
如果故障综合信息包含多个 NFVI实体, 被优先处理的 NFVI实体的故 障修复成功, 但是其他关联的 NFVI实体的故障依然存在, 则重复进行 604 的步骤, 对余下的依然存在故障的 NFVI实体中优先级最高的 NFVI实体进 行判断, 并修复, 直到该故障综合信息中的所有 NFVI实体的故障都得到修 复, 则通知 Orchestrator修复成功, 并终结该故障修复处理过程。
具体地, 对于能够处理的故障综合信息, VIM可以将修复状态置为 "处 理中"以防止对后续生成的相同的 "未处理"的故障综合信息进行重复处理。
修复成功的 NFVI实体可以通过上报运行状态为 "正常" 的类似于表一 的故障信息来通知 VIM故障修复成功。 当故障综合信息中具有关联关系的 所有的 NFVI实体的故障都得到修复, VIM可以将故障综合信息的故障状态 置为 "已修复" 并通过 Or-Vi接口上报 Orchestrator。 应理解, 修复成功也可 以通过预定义的信令进行上报, 本发明对此不做限定。
此外, 可以将正在进行修复的 NFVI实体进行隔离, 以避免该故障体与 相邻的其他实体交互而导致进一步的故障传染。
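The repeated determination and repair of step 605a can be sketched as a loop that always works on the highest-priority entity that is still faulty. The function and parameter names are assumptions; `repair` and `still_faulty` are hypothetical callbacks standing in for the actual repair action and for the entity's "normal" running-state report.

```python
PRIORITY = {"HW": 0, "Host OS": 1, "Hypervisor": 2, "VM": 3}


def self_heal(faults, policy, repair, still_faulty):
    """Repair faults in priority order until none remain or one cannot be handled.

    faults       -- list of dicts with 'entity_id', 'kind', and 'fault_type'
    policy       -- fault_type -> repair method
    repair       -- callable(entity_id, method) -> bool (True on success)
    still_faulty -- callable(entity_id) -> bool, re-checked after each repair
    Returns the resulting fault state: "repaired" or "unrepaired".
    """
    pending = sorted(faults, key=lambda f: PRIORITY[f["kind"]])
    while pending:
        top = pending.pop(0)
        method = policy.get(top["fault_type"])
        if method is None or not repair(top["entity_id"], method):
            return "unrepaired"  # no matching policy, or the repair failed
        # Repairing one entity may clear associated faults, so re-check the rest.
        pending = [f for f in pending if still_faulty(f["entity_id"])]
    return "repaired"
```

Re-checking the remaining entities after each repair reflects the observation that fixing a higher-priority entity such as HW1 may also clear the faults of the entities built on it.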
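605b, the VIM is unable to perform self-healing
If the fault repair policy in the VIM does not contain the fault type of the NFVI entity to be repaired, the VIM may set the fault state of the comprehensive fault information to "unrepaired" and report it to the Orchestrator over the Or-Vi interface.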
606, Orchestrator自愈判断
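When the Orchestrator receives the comprehensive fault information sent by the VIM, it checks whether it can perform self-healing. Similar to the self-healing determination of the VIM, the Orchestrator queries its local fault repair policy. If it can handle the fault and the repair succeeds, it sets the fault state in the comprehensive fault information to "repaired" and reports to the OSS/BSS. If the Orchestrator cannot perform the repair, or can perform it but the repair fails, it sets the fault state of the NFVI comprehensive fault information to "unrepaired" and reports to the OSS/BSS. It should be understood that, because the Orchestrator is responsible for orchestrating and managing resources and for implementing NFV services, it has high management authority and processing capability and can repair most faults. Only the small number of faults that cannot be handled, or whose repair fails, are reported to the OSS/BSS.
The fault repair methods may include, but are not limited to, the following: restarting a hardware device, reloading software (Host OS, Hypervisor, and so on), migrating a VM, reloading the VNF installation software, re-instantiating a VNF, adding a VNF instance, migrating a VNF (that is, reallocating resources to the VNF), and re-instantiating the VNF Forwarding Graph.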
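607, the OSS/BSS performs fault repair
The OSS/BSS changes the fault state of the received comprehensive fault information to "processing" and then recovers the fault according to the method in its fault repair policy. After the fault is recovered, the OSS/BSS receives a fault recovery notification sent by the NFVI entity and then changes the fault state in its comprehensive fault information to "repaired". The fault repair policy in the OSS/BSS contains handling methods for all fault types by default.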
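The layered handling of steps 604 to 607 (VIM first, then Orchestrator, then OSS/BSS) can be sketched as an escalation chain. The function name `escalate` and the representation of each layer as a `(name, try_repair)` pair are assumptions made for this illustration.

```python
def escalate(comp_fault, handlers):
    """Offer a comprehensive fault to each management layer in turn.

    comp_fault -- dict with a mutable 'state' field
    handlers   -- ordered list of (layer_name, try_repair), lowest layer first;
                  try_repair(comp_fault) returns True when that layer repaired it
    Returns the name of the layer that repaired the fault, or None.
    """
    for name, try_repair in handlers:
        comp_fault["state"] = "processing"
        if try_repair(comp_fault):
            comp_fault["state"] = "repaired"
            return name
        # This layer has no matching policy or its repair failed: report upward.
        comp_fault["state"] = "unrepaired"
    return None
```

Since the OSS/BSS policy is stated to cover all fault types by default, placing it last in `handlers` means the chain would normally end with a repair.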
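The fault management method provided in this embodiment of the present invention obtains fault information of hardware and/or software entities through the VIM and processes associated fault information comprehensively, so that fault reporting and handling can be implemented in an NFV environment. In addition, because associated fault information is processed comprehensively, identical comprehensive fault information is deleted through duplicate-alarm detection, and the faulty entity being handled is isolated, the efficiency and accuracy of fault handling are improved and fault propagation is effectively prevented.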
图 7是本发明另一实施例的故障管理的方法的交互图。 图 7所示的方法 可以由图 1所示的 NFV系统执行。
701 , VNFM获取故障信息。
当 VNFM检测到 VNF中的任意 VNF实体发生故障时, VNFM获取发 生故障的 VNF实体的故障信息。 具体地, 获取故障信息可以是由发生故障 的 VNF实体生成并上报给 VNFM的,也可以是 VNFM根据检测到的故障在 VNFM本地生成的。
VNFM检测 VNF实体发生故障的方法可以有以下几种方法:
为了方便描述, 以下以第一 VNF实体发生故障为例进行描述, 该第一 VNF实体可以为 VNF中的任意 VNF实体。其中, 实体可以包括硬件实体或 软件实体或实例。
方法一,
第一 VNF实体发生故障时, 第一 VNF实体生成故障信息, 该故障信息 至少包含用于唯一标识第一 VNF实体的故障实体标识, 通过该标识可以唯 一确定发生故障的第一 VNF实体的实际位置或在拓朴关系中的位置。 故障 标识用于唯一标识一个故障信息。 该故障信息还包含有故障类型, 用于表示 该故障发生的原因, 例如过载、 断电、 内存泄漏、 端口错误或无故障等。 此 夕卜,故障信息还可以包含运行状态和故障时间,运行状态用于标记第一 VNF 实体当前是否能够正常运行, 故障时间可以用于记录故障发生的时间。 作为 一个例子, 故障信息的格式可以如表三所示:
故障信息
[Table 3: fault information. The table is an image in the source (imgf000030_0001); its contents were not recovered.]
第一 VNF生成上述格式的故障信息后, 可以通过 Ve-Vnfm接口发送给 VNFM, 可选地, 还可以同时通过 vEMS将故障信息发送给 OSS/BSS以供 管理、 记录、 呈现。
方法二,
VNFM可以周期性地或者在需要的时候向第一 VNF实体发送指示消息, 指示第一 VNF 实体进行故障检测, 第一 VNF 实体如果检测到故障可以向 VNFM返回与上述表三相类似的故障信息, 如果第一 VNF没有故障, 可以 不返回任何消息, 也可以返回故障类型为 "无故障", 运行状态为 "正常" 的如表三所示的故障信息。
方法三,
第一 VNF实体可以周期性地向 VNFM发送表示第一 VNF实体运行正 常的心跳指示消息。 VNFM则周期性地接收到第一 VNF实体的心跳, 感知 到第一 VNF实体工作正常, 当第一 VNF实体心跳中断, 则 VNFM判定第 一 VNF实体发生故障。 VNFM可以生成第一 VNF的故障信息, 具体格式与 上述表三的故障信息相类似, 此处不再赘述。
当 VNF实体发生突然性故障而无法上报故障信息时, VNFM依然能够 在第一时间感知到第一 VNF实体发生故障。
方法四,
VNFM 可以周期性地或者在需要的时候对 VNF 进行故障检测, 之后 VNFM根据故障检测结果生成第一 VNF的故障信息, 具体格式与上述表三 的故障信息相类似, 此处不再赘述。
综上所述, VNFM检测 VNF实体的故障可以通过以上任意一种方法进 行, 当然可以通过多种方法结合进行检测, 例如, 可以将方法一和方法三结 合, VNF实体周期性向 VNFM发送心跳, 在发生故障时向 VNFM发送故障 信息, 如果 VNF实体发生灾难性故障无法上报故障信息, 则 VNFM可以通 过心跳停止感知到 VNF实体发生故障。
702, VNFM生成故障综合信息
在 VNFM接收到第一 VNF实体发送的故障信息,或者 VNFM根据第一 VNF实体发生的故障生成故障信息后, VNFM可以根据第一 VNF的故障信 息生成故障综合信息。 可选地, VNFM可以收集与第一 VNF实体相关联的 其他 VNF实体的故障信息, 以生成故障综合信息, 以便于进行综合处理。
具体地, 由于 VNF实体之间存在有关联关系, 因此当第一 VNF实体发 生故障时,往往与第一 VNF实体有关联关系的其他 VNF实体也会发生故障。 图 6b示例性地示出了 VNF实体之间的关联关系。 例如, VNF1与 VNF2都 基于 VM1 , 即 VNF1与 VNF2之间具有关联关系。 当 VNF1发生了故障, VNF2有可能也发生了故障。
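The association in the example, where VNF1 and VNF2 are related because both run on VM1, can be derived from a VNF-to-VM mapping. This is an illustrative sketch; the mapping and the function name `associated_vnfs` are assumptions, and real associations may be defined in other ways.

```python
def associated_vnfs(vnf_to_vm, vnf_id):
    """Return the other VNF entities hosted on the same VM as vnf_id."""
    vm = vnf_to_vm[vnf_id]
    return sorted(v for v, host in vnf_to_vm.items() if host == vm and v != vnf_id)
```

With this list, the VNFM would know which other VNF entities to collect fault information from when VNF1 reports a fault.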
此时, VNFM可以收集 VNF1上报的故障信息, 结合 VNF2的故障信息 生成故障综合信息。 具体地, 可以生成如表四所示的故障综合信息:
故障综合信息
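[Table 4: comprehensive fault information. The table is an image in the source (imgf000032_0001); its contents were not recovered.]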
其中 VNF1 , VNF2实体的故障信息格式与上述表三相类似。 应理解, 表四所示的故障综合信息为一个具体的例子,故障综合信息具体包含哪些实 体的故障信息根据关联关系而定。其中故障综合信息刚生成时可以将故障状 态置为 "未处理"。
703, 报警重复检测
VNFM生成故障综合信息后, 可以在 VNFM本地检测已生成的故障综 合信息, 确定是否存在相同的信息。 具体地, 由于一个 VNF实体发生故障 后, 与之具有关联关系的发生故障的 VNF 实体都会上报故障信息, 因此 VNFM很可能就同一个故障生成多个相同的故障综合信息。 例如, VNF1发 生故障, 与 VNF1具有关联关系的 VNF2也发生故障并且与 VNF1执行相同 的操作, VNFM 在进行关联故障信息收集后会生成多个同样的故障综合信 息, 此时可以只处理其中的一个故障综合信息, 将其他相同的故障综合信息 丟弃。 应理解, 这里的相同的故障综合信息指的是 VNF状态信息部分相同, 故障状态可以不同。
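Specifically, whether a piece of comprehensive fault information is kept or discarded can be decided from its fault state. For example, newly generated comprehensive fault information has the fault state "unprocessed". Duplicate-alarm detection is performed on it, and if identical comprehensive fault information whose fault state is "processing" is found, the unprocessed one is discarded.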
704, VNFM自愈判断
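When comprehensive fault information has been generated at the VNFM, the VNFM may first determine whether the fault type in the comprehensive fault information is one that the VNFM can handle.
Specifically, the VNFM holds a fault repair policy that maps fault entity identifiers and fault types to fault repair methods. Whether a fault can be handled can be determined by checking whether the fault type in the comprehensive fault information exists in the fault repair policy. For example, the fault type of VNF1 is "low performance" and the corresponding fault repair method is "add a VNF instance".
The fault repair methods may include, but are not limited to, the following: restarting a hardware device, reloading software (Host OS, Hypervisor, and so on), migrating a VM, reloading the VNF installation software, re-instantiating a VNF, adding a VNF instance, migrating a VNF (that is, reallocating resources to the VNF), and re-instantiating the VNF Forwarding Graph.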
705aVNFM能够进行自愈处理
如果 VNFM判断能够处理, 则根据故障修复方法对 VNF实体进行故障 修复。 如果故障修复成功, 并且具有关联关系的 VNF实体的故障都得到修 复, 则通知 Orchestrator修复成功, 并终结该故障修复处理过程。
如果故障综合信息包含多个 VNF实体,被优先处理的 VNF实体的故障 修复成功, 但是其他关联的 VNF实体的故障依然存在, 则重复进行 704的 步骤, 对余下的依然存在故障的 VNF实体进行判断, 并修复, 直到该故障 综合信息中的所有 VNF实体的故障都得到修复, 则通知 Orchestrator修复成 功, 并终结该故障修复处理过程。
具体地,对于能够处理的故障综合信息, VNFM可以将修复状态置为 "处 理中"以防止对后续生成的相同的 "未处理"的故障综合信息进行重复处理。
修复成功的 VNF实体可以通过上报运行状态为 "正常" 的类似于表三 的故障信息来通知 VNFM故障修复成功。 当故障综合信息中具有关联关系 的所有的 VNF实体的故障都得到修复, VNFM可以将故障综合信息的故障 状态置为 "已修复" 并通过 Or-Vnfm接口上报 Orchestrator。 应理解, 修复 成功也可以通过预定义的信令进行上报, 本发明对此不做限定。
此外, 可以将正在进行修复的 VNF实体进行隔离, 以避免该故障体与 相邻的其他实体交互而导致进一步的故障传染。
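705b, the VNFM is unable to perform self-healing
If the fault repair policy in the VNFM does not contain the fault type of the VNF entity to be repaired, the VNFM may set the fault state of the comprehensive fault information to "unrepaired" and report it to the Orchestrator over the Or-Vnfm interface.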
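706, Orchestrator self-healing determination
When the Orchestrator receives the comprehensive fault information sent by the VNFM, it checks whether it can perform self-healing. Similar to the self-healing determination of the VNFM, the Orchestrator queries its local fault repair policy. If it can handle the fault and the repair succeeds, it sets the fault state in the comprehensive fault information to "repaired" and reports to the OSS/BSS. If the Orchestrator cannot perform the repair, or can perform it but the repair fails, it sets the fault state of the VNF comprehensive fault information to "unrepaired" and reports to the OSS/BSS. It should be understood that, because the Orchestrator is responsible for orchestrating and managing resources and for implementing NFV services, it has high management authority and processing capability and can repair most faults. Only the small number of faults that cannot be handled, or whose repair fails, are reported to the OSS/BSS.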
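707, the OSS/BSS performs fault repair
The OSS/BSS changes the fault state of the received comprehensive fault information to "processing" and then recovers the fault according to the method in its fault repair policy. After the fault is recovered, the OSS/BSS receives a fault recovery notification sent by the VNF entity and then changes the fault state in its comprehensive fault information to "repaired". The fault repair policy in the OSS/BSS contains handling methods for all fault types by default.
The fault repair methods may include, but are not limited to, the following: restarting a hardware device, reloading software (Host OS, Hypervisor, and so on), migrating a VM, reloading the VNF installation software, re-instantiating a VNF, adding a VNF instance, migrating a VNF (that is, reallocating resources to the VNF), and re-instantiating the VNF Forwarding Graph.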
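The fault management method provided in this embodiment of the present invention obtains fault information of hardware and/or software entities through the VNFM and processes associated fault information comprehensively, so that fault reporting and handling can be implemented in an NFV environment. In addition, because associated fault information is processed comprehensively, identical comprehensive fault information is deleted through duplicate-alarm detection, and the faulty entity being handled is isolated, the efficiency and accuracy of fault handling are improved and fault propagation is effectively prevented.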
图 8是本发明另一实施例的故障管理的方法的交互图。 图 8所示的方法 可以由图 1所示的 NFV系统执行。
801 , VIM获取故障信息。
当 VIM检测到 NFVI中的任意 HW、 Host 0S、 Hypervisor和 VM实体 发生故障时, VIM获取发生故障的 NFVI实体的故障信息。 具体地, 获取故 障信息可以是由发生故障的 NFVI实体生成并上报给 VIM的,也可以是 VIM 根据检测到的故障在本地生成的。 具体地, VIM检测 NFVI实体发生故障的 方法与上述图 6步骤 601所述的方法相类似, 此处不再赘述。
802, VIM生成故障综合信息
在 VIM接收到第一 NFVI 实体发送的故障信息, 或者 VIM根据第一 NFVI实体发生的故障生成故障信息后, VIM需要根据收集与第一 NFVI实 体相关联的其他 NFVI实体的故障信息, 以生成故障综合信息, 以便于进行 综合处理。具体地,与上述图 6步骤 602所述的方法相类似,此处不再赘述。
803, 报警重复检测
VIM生成故障综合信息后, 可以在 VIM本地检测已生成的故障综合信 息, 确定是否存在相同的信息。 具体检测方法与上述图 6步骤 603所述的方 法相类似, 此处不再赘述。
804, VIM自愈判断
当 VIM有故障综合信息生成, VIM首先可以判断故障综合信息中的故 障类型是否为 VIM能够处理的故障类型。具体判断方法与上述图 6步骤 604 所述的方法相类似, 此处不再赘述。
805aVIM能够进行自愈处理
如果 VIM判断能够处理,则根据故障修复方法对 NFVI实体进行故障修 复。如果故障修复成功,并且具有关联关系的 NFVI实体的故障都得到修复, 则通知 Orchestrator修复成功, 并终结该故障修复处理过程。
如果故障综合信息包含多个 NFVI实体, 被优先处理的 NFVI实体的故 障修复成功, 但是其他关联的 NFVI实体的故障依然存在, 则重复进行 804 的步骤, 对余下的依然存在故障的 NFVI实体中优先级最高的 NFVI实体进 行判断, 并修复, 直到该故障综合信息中的所有 NFVI实体的故障都得到修 复, 则通知 Orchestrator修复成功, 并终结该故障修复处理过程。 具体方法 与上述图 6步骤 605a所述的方法相类似, 此处不再赘述。
此外, 可以将正在进行修复的 NFVI实体进行隔离, 以避免该故障体与 相邻的其他实体交互而导致进一步的故障传染。
805bVIM不能够进行自愈处理则上报 VNFM
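If the fault repair policy in the VIM does not contain the fault type of the NFVI entity to be repaired, the VIM may set the fault state of the comprehensive fault information to "unrepaired" and report it to the VNFM over the Vi-Vnfm interface.
When the VNFM receives the comprehensive fault information sent by the VIM, it checks whether it can perform self-healing. Similar to the self-healing determination of the VIM, the VNFM queries its local fault repair policy. If it can handle the fault and the repair succeeds, it sets the fault state in the comprehensive fault information to "repaired" and reports to the Orchestrator. If the VNFM cannot perform the repair, or can perform it but the repair fails, it sets the fault state of the NFVI comprehensive fault information to "unrepaired" and reports to the Orchestrator.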
806, Orchestrator自愈判断
当 Orchestrator接收到 VNFM发送的 NFVI的故障综合信息, Orchestrator 检测是否能够进行自愈处理, 与 VIM的自愈判断相类似, Orchestrator查询 本地故障修复策略, 如果能够进行处理且修复成功, 则将故障综合信息中的 故障状态置为 "已修复" 并向 0SS/BSS上报。 如果 Orchestrator不能够进行 修复处理或者能进行修复处理但是修复失败, 则将 NFVI的故障综合信息的 故障状态置为 "未修复" 并向 0SS/BSS上报。 应理解, 由于 Orchestrator负 责编排管理资源, 并实现 NFV服务, 因此 Orchestrator具有较高的管理权限 以及处理能力, 能够修复大部分的故障。 只有极少数的无法处理或者修复失 败的故障才会被上报的 0SS/BSS。
807, OSS/BSS进行故障修复
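The OSS/BSS changes the fault state of the received comprehensive fault information to "processing" and then recovers the fault according to the method in its fault repair policy. After the fault is recovered, the OSS/BSS receives a fault recovery notification sent by the NFVI entity and then changes the fault state in its comprehensive fault information to "repaired". The fault repair policy in the OSS/BSS contains handling methods for all fault types by default.
The fault repair methods may include, but are not limited to, the following: restarting a hardware device, reloading software (Host OS, Hypervisor, and so on), migrating a VM, reloading the VNF installation software, re-instantiating a VNF, adding a VNF instance, migrating a VNF (that is, reallocating resources to the VNF), and re-instantiating the VNF Forwarding Graph.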
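The fault management method provided in this embodiment of the present invention obtains fault information of hardware and/or software entities through the VIM and processes associated fault information comprehensively, so that fault reporting and handling can be implemented in an NFV environment. In addition, because associated fault information is processed comprehensively, identical comprehensive fault information is deleted through duplicate-alarm detection, and the faulty entity being handled is isolated, the efficiency and accuracy of fault handling are improved and fault propagation is effectively prevented.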
图 9是本发明另一实施例的故障管理的方法的交互图。 图 9所示的方法 可以由图 1所示的 NFV系统执行。
901 , VIM获取故障信息。
当 VIM检测到 NFVI中的任意 HW、 Host OS、 Hypervisor和 VM发生 故障时, VIM获取发生故障的 NFVI实体的故障信息。 具体地, 获取故障信 息可以是由发生故障的 NFVI实体生成并上报给 VIM的, 也可以是 VIM根 据检测到的故障在本地生成的。 具体地, VIM检测 NFVI实体发生故障的方 法与上述图 6步骤 601所述的方法相类似, 此处不再赘述。
902, VIM生成故障综合信息
在 VIM接收到第一 NFVI 实体发送的故障信息, 或者 VIM根据第一 NFVI实体发生的故障生成故障信息后, VIM需要根据收集与第一 NFVI实 体相关联的其他 NFVI实体的故障信息, 以生成故障综合信息, 以便于进行 综合处理。具体地,与上述图 6步骤 602所述的方法相类似,此处不再赘述。
903, 报警重复检测
VIM生成故障综合信息后, 可以在 VIM本地检测已生成的故障综合信 息, 确定是否存在相同的信息。 具体检测方法与上述图 6步骤 603所述的方 法相类似, 此处不再赘述。
904, VIM自愈判断
当 VIM有故障综合信息生成, VIM首先可以判断故障综合信息中的故 障类型是否为 VIM能够处理的故障类型。具体判断方法与上述图 6步骤 604 所述的方法相类似, 此处不再赘述。
905aVIM能够进行自愈处理
如果 VIM判断能够处理,则根据故障修复方法对 NFVI实体进行故障修 复。如果故障修复成功,并且具有关联关系的 NFVI实体的故障都得到修复, 则通知 Orchestrator修复成功, 并终结该故障修复处理过程。
如果故障综合信息包含多个 NFVI实体, 被优先处理的 NFVI实体的故 障修复成功, 但是其他关联的 NFVI实体的故障依然存在, 则重复进行 904 的步骤, 对余下的依然存在故障的 NFVI实体中优先级最高的 NFVI实体进 行判断, 并修复, 直到该故障综合信息中的所有 NFVI实体的故障都得到修 复, 则通知 Orchestrator修复成功, 并终结该故障修复处理过程。 具体方法 与上述图 6步骤 605a所述的方法相类似, 此处不再赘述。
此外, 可以将正在进行修复的 NFVI实体进行隔离, 以避免该故障体与 相邻的其他实体交互而导致进一步的故障传染。
905bVIM不能够进行自愈处理则上报 VNFM
如果 VIM中的故障修复策略中不包含待修复的 NFVI实体的故障类型, 则 VIM可以将故障综合信息的故障状态置为 "未修复" 并通过 Vi-Vnfm接 口上报 VNFM。
当 VNFM接收到 VIM发送的故障综合信息, VNFM检测是否能够进行 自愈处理, 与 VIM的自愈判断相类似, VNFM查询本地故障修复策略, 如 果能够进行处理且修复成功,则将故障综合信息中的故障状态置为 "已修复" 并向 Orchestrator上报。 如果 VNFM不能够进行修复处理或者能进行修复处 理但是修复失败, 则将 NFVI的故障综合信息的故障状态置为 "未修复" 并 将故障综合信息返回给 VIM。
906, Orchestrator自愈判断
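The VIM then reports the NFVI comprehensive fault information to the Orchestrator over the Or-Vi interface. The Orchestrator checks whether it can perform self-healing. Similar to the self-healing determination of the VIM, the Orchestrator queries its local fault repair policy. If it can handle the fault and the repair succeeds, it sets the fault state in the comprehensive fault information to "repaired" and reports to the OSS/BSS. If the Orchestrator cannot perform the repair, or can perform it but the repair fails, it sets the fault state of the NFVI comprehensive fault information to "unrepaired" and reports to the OSS/BSS. It should be understood that, because the Orchestrator is responsible for orchestrating and managing resources and for implementing NFV services, it has high management authority and processing capability and can repair most faults. Only the small number of faults that cannot be handled, or whose repair fails, are reported to the OSS/BSS.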
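907, the OSS/BSS performs fault repair
The OSS/BSS changes the fault state of the received comprehensive fault information to "processing" and then recovers the fault according to the method in its fault repair policy. After the fault is recovered, the OSS/BSS receives a fault recovery notification sent by the NFVI entity and then changes the fault state in its comprehensive fault information to "repaired". The fault repair policy in the OSS/BSS contains handling methods for all fault types by default.
The fault repair methods may include, but are not limited to, the following: restarting a hardware device, reloading software (Host OS, Hypervisor, and so on), migrating a VM, reloading the VNF installation software, re-instantiating a VNF, adding a VNF instance, migrating a VNF (that is, reallocating resources to the VNF), and re-instantiating the VNF Forwarding Graph.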
应理解, 图 6、 图 8和图 9为 VIM对 NFVI实体的故障的修复和管理过 程, 图 7为 VNFM对 VNF实体的故障的修复和管理过程。 VIM对 NFVI实 体、 VNFM对 VNF实体的修复和管理这两个过程可以为相对独立的两个过 程, 也可以为同时进行的两个过程, 本发明对此不做限定。
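The fault management method provided in this embodiment of the present invention obtains fault information of hardware and/or software entities through the VIM and processes associated fault information comprehensively, so that fault reporting and handling can be implemented in an NFV environment. In addition, because associated fault information is processed comprehensively, identical comprehensive fault information is deleted through duplicate-alarm detection, and the faulty entity being handled is isolated, the efficiency and accuracy of fault handling are improved and fault propagation is effectively prevented.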
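FIG. 10 is an interaction diagram of a fault management method according to another embodiment of the present invention. The method shown in FIG. 10 may be performed by the NFV system shown in FIG. 1.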
1001a, VIM获取故障信息。
当 VIM检测到 NFVI中的任意 HW、 Host OS、 Hypervisor和 VM发生 故障时, VIM获取发生故障的 NFVI实体的故障信息。 具体地, 获取故障信 息可以是由发生故障的 NFVI实体生成并上报给 VIM的, 也可以是 VIM根 据检测到的故障在本地生成的。
VIM检测 NFVI实体发生故障的方法可以有以下几种方法:
为了方便描述, 以下以第一 NFVI实体发生故障为例进行描述, 该第一 NFVI实体可以为 NFVI中的任意 HW、 Host OS、 Hypervisor和 VM实体。 其中, 实体可以包括硬件实体或软件实体。
方法一,
第一 NFVI实体发生故障时, 第一 NFVI实体生成故障信息, 该故障信 息至少包含用于唯一标识第一 NFVI实体的故障实体标识, 通过该标识可以 唯一确定发生故障的第一 NFVI实体的实际位置或在拓朴关系中的位置。 故 障信息还包含故障标识, 用于唯一标识一个故障信息。 该故障信息还包含有 故障类型, 用于表示该故障发生的原因, 例如断电、 过载、 无故障等。 此外, 故障信息还可以包含运行状态和故障时间, 运行状态用于标记第一 NFVI实 体当前是否能够正常运行, 故障时间可以用于记录故障发生的时间。 作为一 个例子, 故障信息的格式可以如上述表一所示。
第一 NFVI 生成上述格式的故障信息后, 可以通过 Nf-Vi接口发送给 VIM,可选地,还可以同时通过 EMS将故障信息发送给 OSS/BSS以供管理、 记录、 呈现。
方法二,
VIM可以周期性地或者在需要的时候向第一 NFVI实体发送指示消息, 指示第一 NFVI实体进行故障检测, 第一 NFVI实体如果检测到故障可以向 VIM返回与上述表一相类似的故障信息, 如果第一 NFVI没有故障, 可以不 返回任何消息, 也可以返回故障类型为 "无故障", 运行状态为 "正常" 的 如表一所示的故障信息。
方法三,
第一 NFVI实体可以周期性地向 VIM发送表示第一 NFVI实体运行正常 的心跳指示消息。 VIM则周期性地接收到第一 NFVI实体的心跳, 感知到第 一 NFVI实体工作正常,当第一 NFVI实体心跳中断,则 VIM判定第一 NFVI 实体发生故障。 VIM可以生成第一 NFVI的故障信息, 具体格式与上述表一 的故障信息相类似, 此处不再赘述。
当 NFVI实体发生断电等突然性事故而无法上报故障信息时, VIM依然 能够在第一时间感知到第一 NFVI实体发生故障。
方法四,
VIM可以周期性地或者在需要的时候对 NFVI进行故障检测,之后 VIM 根据故障检测结果生成第一 NFVI的故障信息, 具体格式与上述表一的故障 信息相类似, 此处不再赘述。
综上所述, VIM检测 NFVI实体的故障可以通过以上任意一种方法进行, 当然可以通过多种方法结合进行检测, 例如, 可以将方法一和方法三结合, NFVI实体周期性向 VIM发送心跳, 在发生故障时向 VIM发送故障信息, 如果 NFVI实体发生灾难性故障无法上报故障信息,则 VIM可以通过心跳停 止感知到 NFVI实体发生故障。
1001b, VNFM获取故障信息。
当 VNFM检测到 VNF中的任意 VNF实体发生故障时, VNFM获取发 生故障的 VNF实体的故障信息。 具体地, 获取故障信息可以是由发生故障 的 VNF实体生成并上报给 VNFM的,也可以是 VNFM根据检测到的故障在 VNFM本地生成的。
VNFM检测 VNF实体发生故障的方法可以有以下几种方法:
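For ease of description, the following uses a fault of a first VNF entity as an example. The first VNF entity may be any VNF entity in the VNF. An entity here may be a hardware entity, a software entity, or an instance.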
方法一,
第一 VNF实体发生故障时, 第一 VNF实体生成故障信息, 该故障信息 至少包含用于唯一标识第一 VNF实体的故障实体标识, 通过该标识可以唯 一确定发生故障的第一 VNF实体的实际位置或在拓朴关系中的位置。 该故 障信息还包含有故障类型, 用于表示该故障发生的原因或者结果。 此外, 故 障信息还可以包含运行状态和故障时间, 运行状态用于标记第一 VNF实体 当前是否能够正常运行, 故障时间可以用于记录故障发生的时间。 作为一个 例子, 故障信息的格式可以如上述表三所示。
第一 VNF生成上述格式的故障信息后, 可以通过 Ve-Vnfm接口发送给 VNFM, 可选地, 还可以同时通过 vEMS将故障信息发送给 OSS/BSS以供 管理、 记录、 呈现。
方法二,
VNFM可以周期性地或者在需要的时候向第一 VNF实体发送指示消息, 指示第一 VNF 实体进行故障检测, 第一 VNF 实体如果检测到故障可以向 VNFM返回与上述表三相类似的故障信息, 如果第一 VNF没有故障, 可以 不返回任何消息, 也可以返回故障类型为 "无故障", 运行状态为 "正常" 的如表三所示的故障信息。
方法三,
第一 VNF实体可以周期性地向 VNFM发送表示第一 VNF实体运行正 常的心跳指示消息。 VNFM则周期性地接收到第一 VNF实体的心跳, 感知 到第一 VNF实体工作正常, 当第一 VNF实体心跳中断, 则 VNFM判定第 一 VNF实体发生故障。 VNFM可以生成第一 VNF的故障信息, 具体格式与 上述表三的故障信息相类似, 此处不再赘述。
当 VNF实体发生突然性故障而无法上报故障信息时, VNFM依然能够 在第一时间感知到第一 VNF实体发生故障。
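Method 4:
The VNFM may perform fault detection on the VNF periodically or when needed, and then generate fault information of the first VNF from the detection result. The format is similar to the fault information in Table 3 above and is not described again here.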
综上所述, VNFM检测 VNF实体的故障可以通过以上任意一种方法进 行, 当然可以通过多种方法结合进行检测, 例如, 可以将方法一和方法三结 合, VNF实体周期性向 VNFM发送心跳, 在发生故障时向 VNFM发送故障 信息, 如果 VNF实体发生灾难性故障无法上报故障信息, 则 VNFM可以通 过心跳停止感知到 VNF实体发生故障。
应理解, 步骤 1001a和 1001b可以为两个相对独立的过程, 也可以为两 个相关的过程, 在本发明实施例当中可以理解为基本同时发生的两个过程, 也就是说, 本发明实施例是在 NFVI和 VNF发生关联性故障的情况下进行 故障管理和修复的具体描述。
1002a, VIM生成故障综合信息
在 VIM接收到第一 NFVI 实体发送的故障信息, 或者 VIM根据第一 NFVI实体发生的故障生成故障信息, 即步骤 1001a后, VIM需要根据收集 与第一 NFVI实体相关联的其他 NFVI实体的故障信息, 以生成故障综合信 息, 以便于进行综合处理。
具体地, 由于 HW、 Host OS、 Hypervisor, VM实体之间存在有关联关 系, 因此当第一 NFVI实体发生故障时, 往往与第一 NFVI实体有关联关系 的实体也会发生故障。 图 6b示例性地示出了 HW、 Host OS、 Hypervisor, VM 实体之间的关联关系。 例如, 与 HW1 有关联关系的包括 Host 0S1、 Hypervisorl , VM1和 VM2。 也就是说, 当 HW1发生故障时, 建立在其上 的虚拟化实体 Host 0S1、 Hypervisorl、 VM1和 VM2都会发生故障。
此时, VIM可以收集 Host 0S1、 Hypervisorl , VM1和 VM2上报的故 障信息, 结合 HW1的故障信息生成故障综合信息。 具体地, 可以生成如上 述表二所示的故障综合信息, 其中 HW、 Host OS、 Hypervisor和 VM实体的 故障信息格式与上述表一相类似。 应理解, 表二所示的故障综合信息为一个 具体的例子, 故障综合信息具体包含哪些实体的故障信息根据关联关系而 定。 其中故障综合信息刚生成时可以将故障状态置为 "未处理"。
1002b, VNFM生成故障综合信息
在 VNFM接收到第一 VNF实体发送的故障信息,或者 VNFM根据第一 VNF实体发生的故障生成故障信息后, 即步骤 1001b后, VNFM可以根据 第一 VNF的故障信息生成故障综合信息。 可选地, VNFM可以收集与第一 VNF实体相关联的其他 VNF实体的故障信息, 以生成故障综合信息, 以便 于进行综合处理。
具体地, 由于 VNF实体之间存在有关联关系, 因此当第一 VNF实体发 生故障时,往往与第一 VNF实体有关联关系的其他 VNF实体也会发生故障。 图 7b示例性地示出了 VNF实体之间的关联关系。 例如, VNF1与 VNF2都 基于 VM1 , 即 VNF1与 VNF2之间具有关联关系。 当 VNF1发生了故障, VNF2有可能也发生了故障。
此时, VNFM可以收集 VNF1上报的故障信息, 结合 VNF2的故障信息 生成故障综合信息。 具体地, 可以生成如上述表四所示的故障综合信息。
其中 VNF1 , VNF2实体的故障信息格式与上述表三相类似。 应理解, 表四所示的故障综合信息为一个具体的例子,故障综合信息具体包含哪些实 体的故障信息根据关联关系而定。其中故障综合信息刚生成时可以将故障状 态置为 "未处理"。
同样地, 步骤 1002a和 1002b可以为两个相对独立的过程, 也可以为两 个相关的过程, 在本发明实施例当中可以理解为基本同时发生的两个过程。
1003a, VIM报警重复检测
VIM生成故障综合信息后, 可以在 VIM本地检测已生成的故障综合信 息, 确定是否存在相同的信息。 具体地, 由于一个 NFVI实体发生故障后, 与之具有关联关系的发生故障的 NFVI实体都会上报故障信息,因此 VIM很 可能就同一个故障生成多个相同的故障综合信息。 例如, HW1 发生故障, 与 HW1具有关联关系的 Host 0S1、 Hypervisorl , VM1和 VM2也发生故障 并且与 HW1执行相同的操作, VIM在进行关联故障信息收集后会生成多个 同样的故障综合信息, 此时可以只处理其中的一个故障综合信息, 将其他相 同的故障综合信息丟弃。 应理解, 这里的相同的故障综合信息指的是 HW、 Host OS、 Hypervisor和 VM故障信息部分相同, 故障状态可以不同。
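Specifically, whether a piece of comprehensive fault information is kept or discarded can be decided from its fault state. For example, newly generated comprehensive fault information has the fault state "unprocessed". Duplicate-alarm detection is performed on it, and if identical comprehensive fault information whose fault state is "processing" is found, the unprocessed one is discarded. Keeping means continuing to handle the faults in the comprehensive fault information whose fault state is "processing".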
1003b, VNFM报警重复检测
VNFM生成故障综合信息后, 可以在 VNFM本地检测已生成的故障综 合信息, 确定是否存在相同的信息。 具体地, 由于一个 VNF实体发生故障 后, 与之具有关联关系的发生故障的 VNF 实体都会上报故障信息, 因此 VNFM很可能就同一个故障生成多个相同的故障综合信息。 例如, VNF1发 生故障, 与 VNF1具有关联关系的 VNF2也发生故障并且与 VNF1执行相同 的操作, VNFM 在进行关联故障信息收集后会生成多个同样的故障综合信 息, 此时可以只处理其中的一个故障综合信息, 将其他相同的故障综合信息 丟弃。 应理解, 这里的相同的故障综合信息指的是 VNF状态信息部分相同, 故障状态可以不同。
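Specifically, whether a piece of comprehensive fault information is kept or discarded can be decided from its fault state. For example, newly generated comprehensive fault information has the fault state "unprocessed". Duplicate-alarm detection is performed on it, and if identical comprehensive fault information whose fault state is "processing" is found, the unprocessed one is discarded. Keeping means continuing to handle the faults in the comprehensive fault information whose fault state is "processing".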
1004a, VIM自愈判断
当 VIM有故障综合信息生成, VIM首先可以判断故障综合信息中的故 障类型是否为 VIM能够处理的故障类型。
具体地, VIM中具有故障修复策略,该故障修复策略包括故障实体标识、 故障类型和故障修复方法的映射关系。可以通过判断故障综合信息中的故障 类型是否存在于故障修复策略中而确定是否能够进行处理。 例如, HW1 的 故障类型为 "低性能", 相对应的故障修复方法为 "重启"。
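In addition, when the comprehensive fault information contains fault information of several associated NFVI entities, the VIM may use the priority of the NFVI entities to decide which entity's fault type the self-healing determination is based on. The priority order is: HW is higher than Host OS, which is higher than Hypervisor, which is higher than VM. For example, as shown in Table 2, when the comprehensive fault information contains fault information of HW1, Host OS1, Hypervisor1, VM1, and VM2, the VIM may handle the fault of HW1 first. That is, it determines the fault repair method, for example "restart", from the fault type in the fault information of HW1, for example "low performance".
1004b, VNFM self-healing determination
When comprehensive fault information has been generated at the VNFM, the VNFM may first determine whether the fault type in the comprehensive fault information is one that the VNFM can handle.
Specifically, the VNFM holds a fault repair policy that maps fault entity identifiers and fault types to fault repair methods. Whether a fault can be handled can be determined by checking whether the fault type in the comprehensive fault information exists in the fault repair policy. For example, the fault type of VNF1 is "low performance" and the corresponding fault repair method is "add a VNF instance".
The fault repair methods may include, but are not limited to, the following: restarting a hardware device, reloading software (Host OS, Hypervisor, and so on), migrating a VM, reloading the VNF installation software, re-instantiating a VNF, adding a VNF instance, migrating a VNF (that is, reallocating resources to the VNF), and re-instantiating the VNF Forwarding Graph.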
1005aVIM能够进行自愈处理
如果 VIM判断能够处理,则根据故障修复方法对 NFVI实体进行故障修 复。如果故障修复成功,并且具有关联关系的 NFVI实体的故障都得到修复, 则通知 Orchestrator修复成功, 并终结该故障修复处理过程。
如果故障综合信息包含多个 NFVI实体, 被优先处理的 NFVI实体的故 障修复成功,但是其他关联的 NFVI实体的故障依然存在,则重复进行 1004a 的步骤, 对余下的依然存在故障的 NFVI实体中优先级最高的 NFVI实体进 行判断, 并修复, 直到该故障综合信息中的所有 NFVI实体的故障都得到修 复, 则通知 Orchestrator修复成功, 并终结该故障修复处理过程。
具体地, 对于能够处理的故障综合信息, VIM可以将修复状态置为 "处 理中"以防止对后续生成的相同的 "未处理"的故障综合信息进行重复处理。
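An NFVI entity that has been repaired successfully can notify the VIM by reporting fault information similar to Table 1 with the running state "normal". When the faults of all the associated NFVI entities in the comprehensive fault information have been repaired, the VIM may set the fault state of the comprehensive fault information to "repaired" and report it to the Orchestrator over the Or-Vi interface. It should be understood that a successful repair may also be reported through predefined signaling, which is not limited in the present invention.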
此外, 可以将正在进行修复的 NFVI实体进行隔离, 以避免该故障体与 相邻的其他实体交互而导致进一步的故障传染。
1005bVNFM能够进行自愈处理
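If the VNFM determines that it can handle the fault, it repairs the VNF entity according to the fault repair method. If the repair succeeds and the faults of all associated VNF entities have been repaired, the VNFM notifies the Orchestrator that the repair succeeded and ends the fault repair process.
If the comprehensive fault information contains several VNF entities and the fault of the VNF entity handled first has been repaired but the faults of other associated VNF entities remain, step 1004b is repeated: the remaining faulty VNF entities are assessed and repaired until the faults of all VNF entities in the comprehensive fault information have been repaired. The VNFM then notifies the Orchestrator that the repair succeeded and ends the fault repair process.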
具体地,对于能够处理的故障综合信息, VNFM可以将修复状态置为 "处 理中"以防止对后续生成的相同的 "未处理"的故障综合信息进行重复处理。
修复成功的 VNF实体可以通过上报运行状态为 "正常" 的类似于表三 的故障信息来通知 VNFM故障修复成功。 当故障综合信息中具有关联关系 的所有的 VNF实体的故障都得到修复, VNFM可以将故障综合信息的故障 状态置为 "已修复" 并通过 Or-Vnfm接口上报 Orchestrator。 应理解, 修复 成功也可以通过预定义的信令进行上报, 本发明对此不做限定。
此外, 可以将正在进行修复的 VNF实体进行隔离, 以避免该故障体与 相邻的其他实体交互而导致进一步的故障传染。
1005c VIM不能够进行自愈处理
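After the determination in step 1005a, if the fault repair policy in the VIM does not contain the fault type of the NFVI entity to be repaired, the VIM requests from the VNFM the fault information of the VNF entities associated with the first NFVI entity. The VIM then receives from the VNFM the fault information of the VNF entities associated with the first NFVI entity, adds it to the existing NFVI comprehensive fault information, and reports the combined comprehensive fault information to the Orchestrator over the Or-Vi interface. For example, with the associations shown in FIG. 6a above, the NFVI entities associated with HW1 are Host OS1, Hypervisor1, VM1, and VM2. Extending the association to VNFs, VNF1 and VNF2 are also associated with HW1. If VNF1 has also failed, that is, the VNFM holds fault information of VNF1, the VNFM sends the fault information of VNF1 to the VIM over the Vi-Vnfm interface so that the VIM can combine it and report.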
1005d VNFM不能够进行自愈处理
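After the determination in step 1005b, if the fault repair policy in the VNFM does not contain the fault type of the VNF entity to be repaired, the VNFM requests from the VIM the fault information of the NFVI entities associated with the first VNF entity. The VNFM then receives from the VIM the fault information of the NFVI entities associated with the first VNF entity, adds it to the existing VNF comprehensive fault information, and reports the combined comprehensive fault information to the Orchestrator over the Or-Vnfm interface. For example, with the associations shown in FIG. 6a above, the NFVI entities associated with VNF1 are VM1, Host OS1, Hypervisor1, HW1, and HW2. If VM1, Host OS1, Hypervisor1, and HW1 have also failed, the VIM sends the fault information of VM1, Host OS1, Hypervisor1, and HW1 to the VNFM over the Vi-Vnfm interface so that the VNFM can combine it and report.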
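The cross-layer enrichment of step 1005d (and, symmetrically, step 1005c) can be sketched as follows. The function name, the dictionary layout, and the `vim_faults` lookup that stands in for the request over the Vi-Vnfm interface are assumptions for this illustration.

```python
def enrich_with_nfvi_faults(vnf_comp, first_vnf_id, vnf_to_nfvi, vim_faults):
    """Add fault information of associated NFVI entities to a VNF comprehensive fault.

    vnf_comp     -- comprehensive fault dict with a 'faults' list
    first_vnf_id -- identifier of the first faulty VNF entity
    vnf_to_nfvi  -- vnf_id -> list of associated NFVI entity ids
    vim_faults   -- entity_id -> fault record held by the VIM (hypothetical stand-in
                    for the VIM's reply to the VNFM's request)
    Returns a new dict; the input record is left unchanged.
    """
    related = vnf_to_nfvi.get(first_vnf_id, [])
    extra = [vim_faults[e] for e in related if e in vim_faults]
    merged = dict(vnf_comp)
    merged["faults"] = list(vnf_comp["faults"]) + extra
    return merged
```

Returning a new record keeps the locally held comprehensive fault intact until the combined version has actually been reported to the Orchestrator.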
1006, Orchestrator自愈判断
Orchestrator接收到 VNFM或 VIM上报的经过综合处理的故障综合信息 ( 1005c或 1005d ), Orchestrator检测是否能够对该故障综合信息进行自愈处 理, 与 VIM的自愈判断相类似, Orchestrator查询本地故障修复策略, 如果 能够进行处理且修复成功, 则将故障综合信息中的故障状态置为 "已修复" 并向 OSS/BSS上报。 如果 Orchestrator不能够进行修复处理或者能进行修复 处理但是修复失败, 则将 NFVI的故障综合信息的故障状态置为 "未修复" 并向 0SS/BSS上报。 应理解, 由于 Orchestrator负责编排管理资源, 并实现 NFV服务, 因此 Orchestrator具有较高的管理权限以及处理能力, 能够修复 大部分的故障。 只有极少数的无法处理或者修复失败的故障才会被上报的 0SS/BSS。
1007, OSS/BSS进行故障修复
0SS/BSS将该接收到的故障综合信息的故障状态改为 "处理中"。 然后 OSS/BSS根据故障修复策略中的方法进行故障恢复。 故障恢复后, OSS/BSS 会收到 NFVI实体发送的故障恢复通知,之后将 OSS/BSS故障综合信息中的 故障状态修改为 "已修复"。 其中 OSS/BSS中的故障修复策略默认包含所有 故障类型的处理方法。
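The fault management method provided in this embodiment of the present invention obtains fault information of hardware and/or software entities through the VIM and processes associated fault information comprehensively, so that fault reporting and handling can be implemented in an NFV environment. In addition, because associated fault information is processed comprehensively, identical comprehensive fault information is deleted through duplicate-alarm detection, and the faulty entity being handled is isolated, the efficiency and accuracy of fault handling are improved and fault propagation is effectively prevented.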
图 11 是本发明一个实施例的虚拟化基础设施管理 VIM 实体的示意框 图。 图 11所示的 VIM实体 1100包括获取单元 1101、 生成单元 1102和处理 单元 1103。
获取单元 1101获取网络功能虚拟化基础设施 NFVI实体的包含故障实体 标识和故障类型的第一故障信息, 第一故障信息用于指示具有故障实体标识 的第一 NFVI实体发生故障。
生成单元 1102根据获取单元 1101获取的第一故障信息生成第一故障综 合信息, 第一故障综合信息包含第一故障信息和第一故障信息的关联故障信 息;
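The processing unit 1103 performs fault repair or reporting according to the first comprehensive fault information generated by the generating unit 1102.
The VIM entity 1100 provided in this embodiment of the present invention obtains fault information of hardware and/or software entities and processes associated fault information comprehensively, so that fault reporting and handling can be implemented in an NFV environment.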
可选地,作为一个实施例, VIM实体 1100还包括确定单元和接收单元, 获取单元具体用于:通过接收单元接收第一 NFVI实体发送的第一故障信息; 或者通过确定单元确定第一 NFVI实体发生故障, 并根据第一 NFVI实体发 生的故障生成第一故障信息。
可选地, 作为一个实施例, 第一 NFVI实体为 NFVI实体中的任意一个 硬件 HW、 主操作系统 Host OS、 虚拟机管理器或虚拟机 VM实体, 生成单 元 1102具体用于:通过确定单元确定与第一 NFVI实体相关联的 NFVI实体 发送的故障信息为第一故障信息的关联故障信息; 生成包含有第一故障信息 和关联故障信息的第一故障综合信息。
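Optionally, as an embodiment, the processing unit 1103 includes a sending unit, and the processing unit 1103 is specifically configured to: determine, through the determining unit and according to the fault type of the first fault information or the fault type of the associated fault information in the first comprehensive fault information, whether the VIM entity 1100 contains a fault repair policy corresponding to that fault type; when the VIM entity 1100 contains such a fault repair policy, repair the fault of the first NFVI entity and/or of the NFVI entity associated with the first NFVI entity according to the policy; or, when the VIM entity 1100 does not contain such a fault repair policy, send the first comprehensive fault information to the VNFM or to the orchestrator through the sending unit.
Optionally, as an embodiment, the processing unit 1103 is specifically configured to: determine, through the determining unit, the NFVI entity with the highest priority among the first NFVI entity and the NFVI entities associated with it, where HW has a higher priority than Host OS, Host OS has a higher priority than the virtual machine manager (Hypervisor), and the Hypervisor has a higher priority than VM; determine, through the determining unit and according to the fault type of the highest-priority NFVI entity, whether the VIM entity 1100 contains a corresponding fault repair policy; and, when the VIM entity 1100 contains a fault repair policy corresponding to the fault type of the highest-priority NFVI entity, repair the fault of that entity according to the policy.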
可选地, 作为一个实施例, 发送单元具体用于: 在故障修复成功时, 向 编排器发送成功指示消息; 在故障修复失败时, 向 VNFM发送第一故障综 合信息或者向编排器发送第一故障综合信息。
可选地, 作为一个实施例, 接收单元还用于: 接收 VNFM发送的用于 指示 VNFM无法处理第一故障综合信息的指示消息; 发送单元还用于: 向 编排器发送第一故障综合信息。
可选地, 作为一个实施例, 处理单元 1103还用于: 向 VNFM请求与第 一 NFVI实体相关联的 VNF实体的故障信息;将与第一 NFVI实体相关联的 VNF实体的故障信息加入第一故障综合信息。
可选地, 作为一个实施例, 接收单元还用于: 接收 VNFM发送的请求 信息, 请求信息用于向 VIM实体 1100请求与发生故障的 VNF实体相关联 的 NFVI实体的故障信息;发送单元还用于向 VNFM发送与发生故障的 VNF 实体相关联的 NFVI实体的故障信息。
可选地,作为一个实施例, VIM实体 1100还包括检测单元和删除单元, 检测单元具体用于: 根据第一故障综合信息检测 VIM实体 1100是否包含与 第一故障综合信息相同的故障综合信息;删除单元具体用于在 VIM实体 1100 包含与第一故障综合信息相同的故障综合信息时, 删除第一故障综合信息。
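The VIM entity 1100 provided in this embodiment of the present invention obtains fault information of hardware and/or software entities and processes associated fault information comprehensively, so that fault reporting and handling can be implemented in an NFV environment. In addition, because associated fault information is processed comprehensively and identical comprehensive fault information is deleted through duplicate-alarm detection, the efficiency and accuracy of fault handling are improved.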
图 12是本发明一个实施例的虚拟网络功能管理 VNFM 实体的示意框 图。 图 12所示的 VNFM实体 1200包括获取单元 1201、 生成单元 1202和处 理单元 1203。
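The obtaining unit 1201 obtains second fault information of a virtualized network function (VNF) entity, where the second fault information contains a fault entity identifier and a fault type and indicates that the first VNF entity having the fault entity identifier has failed. The generating unit 1202 generates second comprehensive fault information according to the second fault information. The processing unit 1203 performs fault repair or reporting according to the second comprehensive fault information.
The VNFM entity 1200 provided in this embodiment of the present invention obtains fault information of hardware and/or software entities and processes associated fault information comprehensively, so that fault reporting and handling can be implemented in an NFV environment.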
可选地, 作为一个实施例, VNFM实体 1200还包括确定单元和接收单 元, 获取单元具体用于: 通过接收单元接收第一 VNF实体发送的第二故障 信息; 或者通过确定单元确定第一 VNF实体发生故障, 并根据第一 VNF实 体发生的故障通过生成单元生成第二故障信息。
可选地, 作为一个实施例, 生成单元 1202具体用于: 通过确定单元确 定与第一 VNF实体相关联的 VNF实体发送的故障信息为第二故障信息的关 联故障信息; 生成包含有第二故障信息和关联故障信息的第二故障综合信 息。
可选地, 作为一个实施例, 处理单元 1203 包括发送单元, 处理单元具 体用于: 根据第二故障综合信息中的第二故障信息的故障类型或者关联故障 信息的故障类型, 通过确定单元确定 VNFM实体 1200是否包含与第二故障 信息的故障类型或者关联故障信息的故障类型相对应的故障修复策略; 在 VNFM实体 1200包含与第二故障信息的故障类型或者关联故障信息的故障 类型相对应的故障修复策略时, 根据故障修复策略修复第一 VNF实体和 /或 与第一 VNF实体相关联的 VNF实体的故障; 或者在 VNFM实体 1200不包 含与第二故障信息的故障类型或者关联故障信息的故障类型相对应的故障 修复策略时, 通过发送单元向编排器发送第二故障综合信息。
可选地, 作为一个实施例, 发送单元具体用于: 在故障修复成功时, 向 编排器发送成功指示消息; 在故障修复失败时, 向编排器发送第二故障综合 信息。
可选地, 作为一个实施例, 处理单元 1203还用于: 向虚拟化基础设施 管理器 VIM请求与第一 VNF实体相关联的 NFVI实体的故障信息, 其中 NFVI实体为 NFVI中的任意一个硬件 HW、 主操作系统 Host OS、 虚拟机管 理器或虚拟机 VM实体; 将与第一 VNF实体相关联的 NFVI实体的故障信 息加入第二故障综合信息。
可选地, 作为一个实施例, 处理单元 1203还用于: 接收 VIM发送的第 一故障综合信息, 第一故障综合信息包含第一故障信息和第一故障信息的关 联故障信息, 第一故障信息用于指示第一 NFVI实体发生故障; 确定 VNFM 实体 1200是否包含与第一故障综合信息中的第一故障信息的故障类型或者 关联故障信息的故障类型相对应的故障修复策略; 在 VNFM实体 1200包含 与第一故障信息的故障类型或者关联故障信息的故障类型相对应的故障修 复策略时, 根据故障修复策略修复第一 NFVI实体和 /或与第一 NFVI实体相 关联的 NFVI实体的故障; 或者在 VNFM实体 1200不包含与第一故障信息 的故障类型或者关联故障信息的故障类型相对应的故障修复策略时, 向编排 器发送第一故障综合信息, 或者向 VIM发送用于指示 VNFM实体 1200无 法处理第一故障综合信息的指示消息, 以便于 VIM向编排器发送第一故障 综合信息。
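Optionally, as an embodiment, the processing unit 1203 is further specifically configured to: determine, according to the first comprehensive fault information, fault information of the first VNF entity associated with the first NFVI entity and/or with the NFVI entity associated with the first NFVI entity; and add the fault information of the first VNF entity to the first comprehensive fault information, so that the VNFM entity 1200 can repair or report the first comprehensive fault information.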
可选地, 作为一个实施例, VNFM实体 1200还包括检测单元和删除单 元, 检测单元具体用于: 根据第二故障综合信息检测 VNFM实体 1200是否 包含与第二故障综合信息相同的故障综合信息;删除单元具体用于在 VNFM 实体 1200 包含与第二故障综合信息相同的故障综合信息时, 删除第二故障 综合信息。
可选地, 作为一个实施例, 接收单元还用于: 接收 VIM发送的请求信 息, 请求信息用于向 VNFM实体 1200请求与发生故障的 NFVI实体相关联 的 VNF实体的故障信息;发送单元还用于:向 VIM发送与发生故障的 NFVI 实体相关联的 VNF实体的故障信息。
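The VNFM entity 1200 provided in this embodiment of the present invention obtains fault information of hardware and/or software entities and processes associated fault information comprehensively, so that fault reporting and handling can be implemented in an NFV environment. In addition, because associated fault information is processed comprehensively and identical comprehensive fault information is deleted through duplicate-alarm detection, the efficiency and accuracy of fault handling are improved.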
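FIG. 13 is a schematic block diagram of an Orchestrator entity according to an embodiment of the present invention. The Orchestrator entity 1300 shown in FIG. 13 includes a receiving unit 1301 and a processing unit 1302.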
接收单元 1301接收虚拟化基础设施管理器 VIM发送的第一故障综合信 息, 其中, 第一故障综合信息包括第一故障信息, 第一故障信息包含故障实 体标识和故障类型, 第一故障信息用于指示具有故障实体标识的第一网络功 能虚拟化基础设施 NFVI实体发生故障。 处理单元 1302根据第一故障综合 信息进行故障修复或上报处理。
或者
接收单元 1301接收虚拟网络功能管理器 VNFM发送的第二故障综合信 息, 其中, 第二故障综合信息包括第二故障信息, 第二故障信息包含故障实 体标识和故障类型, 第二故障信息用于指示具有故障实体标识的第一虚拟网 络功能 VNF实体发生故障。处理单元 1302根据第二故障综合信息进行故障 修复或上报处理。
本发明实施例提供的 Orchestrator实体 1300从 VNFM或 VIM获取硬件 和 /或软件实体的故障信息,对具有关联关系的故障信息进行综合处理,从而 能够实现能够实现 NFV环境下的故障上报及处理。
可选地, 作为一个实施例, 第一故障综合信息还包括: 与第一 NFVI实 体相关联的 NFVI实体的故障信息; 和 /或与第一 NFVI实体相关联的虚拟网 络功能 VNF实体的故障信息。
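Optionally, as an embodiment, the second comprehensive fault information further includes: fault information of a VNF entity associated with the first VNF entity; and/or fault information of a network functions virtualization infrastructure (NFVI) entity associated with the first VNF entity.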
可选地, 作为一个实施例, 处理单元 1302具体用于: 根据第一故障综 合信息中的故障类型, 确定 Orchestrator实体 1300是否包含与故障类型相对 应的故障修复策略; 在 Orchestrator实体 1300包含与故障类型相对应的故障 修复策略时, 根据故障修复策略修复第一 NFVI实体和 /或与第一 NFVI实体 相关联的 NFVI实体的故障; 或者在 Orchestrator实体 1300不包含与故障类 型相对应的故障修复策略时, 向运营和业务支撑系统 0SS/BSS发送第一故 障综合信息。
可选地, 作为一个实施例, 处理单元 1302具体用于: 根据第二故障综 合信息中的故障类型, 确定 Orchestrator实体 1300是否包含与故障类型相对 应的故障修复策略; 在 Orchestrator实体 1300包含与故障类型相对应的故障 修复策略时,根据故障修复策略修复第一 VNF实体和 /或与第一 VNF实体相 关联的 VNF实体的故障; 或者在 Orchestrator实体 1300不包含与故障类型 相对应的故障修复策略时, 向运营和业务支撑系统 0SS/BSS发送第二故障 综合信息。
可选地, 作为一个实施例, 处理单元 1302具体用于: 根据第一故障综 合信息中的故障类型, 确定 Orchestrator实体 1300是否包含与故障类型相对 应的故障修复策略; 在 Orchestrator实体 1300包含与故障类型相对应的故障 修复策略时, 根据故障修复策略修复第一 NFVI实体和与第一 NFVI实体相 关联的 NFVI实体的故障和与第一 NFVI实体相关联的 VNF实体的故障;或 者在 Orchestrator实体 1300不包含与故障类型相对应的故障修复策略时, 向 0SS/BSS发送第一故障综合信息。
可选地, 作为一个实施例, 处理单元 1302具体用于: 根据第二故障综 合信息中的故障类型, 确定 Orchestrator实体 1300是否包含与故障类型相对 应的故障修复策略; 在 Orchestrator实体 1300包含与故障类型相对应的故障 修复策略时,根据故障修复策略修复第一 VNF实体和与第一 VNF实体相关 联的 VNF实体的故障和与第一 VNF实体相关联的 NFVI实体的故障; 或者 在 Orchestrator实体 1300不包含与故障类型相对应的故障修复策略时, 向 0SS/BSS发送第二故障综合信息。
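Optionally, in an embodiment, the Orchestrator entity 1300 further includes a detection unit and a deletion unit. The detection unit is configured to detect, according to the first or second comprehensive fault information, whether the Orchestrator entity 1300 already contains comprehensive fault information identical to the first or second comprehensive fault information. The deletion unit is configured to delete the first or second comprehensive fault information when the Orchestrator entity 1300 already contains identical comprehensive fault information.
The Orchestrator entity 1300 provided in this embodiment of the present invention obtains fault information of hardware and/or software entities from the VIM or the VNFM and performs comprehensive processing on fault information that is associated, so that fault reporting and handling can be implemented in an NFV environment. In addition, because associated fault information is processed together and identical comprehensive fault information is deleted through duplicate-alarm detection, the efficiency and accuracy of fault handling are improved.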
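The duplicate-alarm detection and deletion described above can be sketched as a set of already-seen messages. The text does not say which fields make two messages "identical"; the key used below (entity identifier, fault type, and the set of associated faults) is an assumption, as are the class and method names.

```python
class AlarmTable:
    """Hypothetical record of comprehensive fault information already held by an entity."""

    def __init__(self):
        self._seen = set()

    @staticmethod
    def _key(info):
        # Assumed notion of "identical": same entity, same fault type,
        # and the same set of associated (entity_id, fault_type) pairs.
        return (
            info["entity_id"],
            info["fault_type"],
            frozenset(info.get("associated", ())),
        )

    def accept(self, info):
        """Return True for a new message; return False (message dropped) for a duplicate."""
        key = self._key(info)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

A caller would process a message only when `accept` returns True, which corresponds to deleting the repeated message before repair or reporting.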
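FIG. 14 is a schematic block diagram of a VIM entity according to another embodiment of the present invention. The VIM entity 1400 in FIG. 14 includes a processor 1401 and a memory 1402. The processor 1401 and the memory 1402 are connected by a bus system 1403.
The memory 1402 is configured to store instructions that cause the processor 1401 to perform the following operations: obtaining first fault information of a network functions virtualization infrastructure (NFVI) entity, where the first fault information includes a faulty entity identifier and a fault type and is used to indicate that a first NFVI entity having the faulty entity identifier is faulty; generating first comprehensive fault information according to the first fault information, where the first comprehensive fault information includes the first fault information and associated fault information of the first fault information; and performing fault repair or reporting processing according to the first comprehensive fault information.
The VIM entity 1400 provided in this embodiment of the present invention obtains fault information of hardware and/or software entities and performs comprehensive processing on fault information that is associated, so that fault reporting and handling can be implemented in an NFV environment.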
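The generation of the first comprehensive fault information can be sketched as collecting the faults reported by associated NFVI entities, and the priority ordering recited in the claims (HW above Host OS, Host OS above the hypervisor, the hypervisor above VM) can then pick the fault to repair first. The dictionary layout, the layer labels, and the association table below are assumptions made for the example.

```python
# Lower number means higher priority; the order follows the claims, the labels are illustrative.
PRIORITY = {"HW": 0, "HostOS": 1, "Hypervisor": 2, "VM": 3}


def build_comprehensive(first, reported, associations):
    """Combine the first fault with faults reported by associated NFVI entities.

    first        -- dict with "entity_id", "layer", "fault_type"
    reported     -- list of fault dicts received from other NFVI entities
    associations -- dict mapping an entity id to the ids of its associated entities
    """
    related_ids = associations.get(first["entity_id"], set())
    associated = [f for f in reported if f["entity_id"] in related_ids]
    return {"first": first, "associated": associated}


def highest_priority_fault(info):
    """Pick the fault of the highest-priority NFVI entity in the message."""
    candidates = [info["first"], *info["associated"]]
    return min(candidates, key=lambda f: PRIORITY[f["layer"]])
```

Repairing the highest-priority entity first reflects the layering of the infrastructure: a hardware fault usually explains the VM faults reported above it, so fixing the VM alone would not help.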
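In addition, the VIM entity 1400 may further include a transmitter circuit 1404 and a receiver circuit 1405. The processor 1401 controls operation of the VIM entity 1400 and may also be referred to as a CPU (Central Processing Unit). The memory 1402 may include a read-only memory and a random access memory, and provides instructions and data to the processor 1401. A part of the memory 1402 may further include a non-volatile random access memory (NVRAM). The components of the VIM entity 1400 are coupled together by the bus system 1403, which, in addition to a data bus, may include a power bus, a control bus, a status signal bus, and the like. For clarity, however, the various buses are all labeled as the bus system 1403 in the figure.
The methods disclosed in the foregoing embodiments of the present invention may be applied to the processor 1401 or implemented by the processor 1401. The processor 1401 may be an integrated circuit chip with signal processing capability. In an implementation process, the steps of the foregoing methods may be completed by an integrated logic circuit of hardware in the processor 1401 or by instructions in the form of software. The processor 1401 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed with reference to the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 1402; the processor 1401 reads information from the memory 1402 and completes the steps of the foregoing methods in combination with its hardware.
The VIM entity 1400 provided in this embodiment of the present invention obtains fault information of hardware and/or software entities and performs comprehensive processing on fault information that is associated, so that fault reporting and handling can be implemented in an NFV environment. In addition, because associated fault information is processed together and identical comprehensive fault information is deleted through duplicate-alarm detection, the efficiency and accuracy of fault handling are improved.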
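FIG. 15 is a schematic block diagram of a VNFM entity according to another embodiment of the present invention. The VNFM entity 1500 in FIG. 15 includes a processor 1501 and a memory 1502. The processor 1501 and the memory 1502 are connected by a bus system 1503.
The memory 1502 is configured to store instructions that cause the processor 1501 to perform the following operations: obtaining second fault information of a virtualized network function (VNF) entity, where the second fault information includes a faulty entity identifier and a fault type and is used to indicate that a first VNF entity having the faulty entity identifier is faulty; generating second comprehensive fault information according to the second fault information; and performing fault repair or reporting processing according to the second comprehensive fault information.
The VNFM entity 1500 provided in this embodiment of the present invention obtains fault information of hardware and/or software entities and performs comprehensive processing on fault information that is associated, so that fault reporting and handling can be implemented in an NFV environment.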
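In addition, the VNFM entity 1500 may further include a transmitter circuit 1504 and a receiver circuit 1505. The processor 1501 controls operation of the VNFM entity 1500 and may also be referred to as a CPU (Central Processing Unit). The memory 1502 may include a read-only memory and a random access memory, and provides instructions and data to the processor 1501. A part of the memory 1502 may further include a non-volatile random access memory (NVRAM). The components of the VNFM entity 1500 are coupled together by the bus system 1503, which, in addition to a data bus, may include a power bus, a control bus, a status signal bus, and the like. For clarity, however, the various buses are all labeled as the bus system 1503 in the figure.
The methods disclosed in the foregoing embodiments of the present invention may be applied to the processor 1501 or implemented by the processor 1501. The processor 1501 may be an integrated circuit chip with signal processing capability. In an implementation process, the steps of the foregoing methods may be completed by an integrated logic circuit of hardware in the processor 1501 or by instructions in the form of software. The processor 1501 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed with reference to the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 1502; the processor 1501 reads information from the memory 1502 and completes the steps of the foregoing methods in combination with its hardware.
The VNFM entity 1500 provided in this embodiment of the present invention obtains fault information of hardware and/or software entities and performs comprehensive processing on fault information that is associated, so that fault reporting and handling can be implemented in an NFV environment. In addition, because associated fault information is processed together and identical comprehensive fault information is deleted through duplicate-alarm detection, the efficiency and accuracy of fault handling are improved.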
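FIG. 16 is a schematic block diagram of an Orchestrator entity according to another embodiment of the present invention. The Orchestrator entity 1600 in FIG. 16 includes a processor 1601 and a memory 1602. The processor 1601 and the memory 1602 are connected by a bus system 1603.
The memory 1602 is configured to store instructions that cause the processor 1601 to perform the following operations: receiving first comprehensive fault information sent by a virtualized infrastructure manager (VIM), where the first comprehensive fault information includes first fault information, the first fault information includes a faulty entity identifier and a fault type, and the first fault information is used to indicate that a first network functions virtualization infrastructure (NFVI) entity having the faulty entity identifier is faulty; and performing fault repair or reporting processing according to the first comprehensive fault information.
Alternatively: receiving second comprehensive fault information sent by a virtualized network function manager (VNFM), where the second comprehensive fault information includes second fault information, the second fault information includes a faulty entity identifier and a fault type, and the second fault information is used to indicate that a first virtualized network function (VNF) entity having the faulty entity identifier is faulty; and performing fault repair or reporting processing according to the second comprehensive fault information.
The Orchestrator entity 1600 provided in this embodiment of the present invention obtains fault information of hardware and/or software entities and performs comprehensive processing on fault information that is associated, so that fault reporting and handling can be implemented in an NFV environment.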
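In addition, the Orchestrator entity 1600 may further include a transmitter circuit 1604 and a receiver circuit 1605. The processor 1601 controls operation of the Orchestrator entity 1600 and may also be referred to as a CPU (Central Processing Unit). The memory 1602 may include a read-only memory and a random access memory, and provides instructions and data to the processor 1601. A part of the memory 1602 may further include a non-volatile random access memory (NVRAM). The components of the Orchestrator entity 1600 are coupled together by the bus system 1603, which, in addition to a data bus, may include a power bus, a control bus, a status signal bus, and the like. For clarity, however, the various buses are all labeled as the bus system 1603 in the figure.
The methods disclosed in the foregoing embodiments of the present invention may be applied to the processor 1601 or implemented by the processor 1601. The processor 1601 may be an integrated circuit chip with signal processing capability. In an implementation process, the steps of the foregoing methods may be completed by an integrated logic circuit of hardware in the processor 1601 or by instructions in the form of software. The processor 1601 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed with reference to the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 1602; the processor 1601 reads information from the memory 1602 and completes the steps of the foregoing methods in combination with its hardware.
The Orchestrator entity 1600 provided in this embodiment of the present invention obtains fault information of hardware and/or software entities and performs comprehensive processing on fault information that is associated, so that fault reporting and handling can be implemented in an NFV environment. In addition, because associated fault information is processed together and identical comprehensive fault information is deleted through duplicate-alarm detection, the efficiency and accuracy of fault handling are improved.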
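A person of ordinary skill in the art may be aware that the method steps and units described with reference to the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the steps and composition of each embodiment have been described above generally in terms of function. Whether these functions are performed by hardware or software depends on the particular application and the design constraints of the technical solution. A person of ordinary skill in the art may use different methods to implement the described functions for each particular application, but such implementation should not be considered to go beyond the scope of the present invention.
The methods or steps described with reference to the embodiments disclosed herein may be implemented by hardware, by a software program executed by a processor, or by a combination of the two. The software program may reside in a random access memory (RAM), a memory, a read-only memory (ROM), an electrically programmable ROM, an electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
However, the present invention is not limited thereto. Without departing from the spirit and essence of the present invention, a person of ordinary skill in the art may make various equivalent modifications or replacements to the embodiments of the present invention, and all such modifications or replacements shall fall within the scope of the present invention.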

Claims

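1. A fault management method, comprising:
obtaining, by a virtualized infrastructure manager (VIM), first fault information of a network functions virtualization infrastructure (NFVI) entity, wherein the first fault information comprises a faulty entity identifier and a fault type and is used to indicate that a first NFVI entity having the faulty entity identifier is faulty;
generating, by the VIM, first comprehensive fault information according to the first fault information, wherein the first comprehensive fault information comprises the first fault information and associated fault information of the first fault information; and
performing, by the VIM, fault repair or reporting processing according to the first comprehensive fault information.
2. The method according to claim 1, wherein the obtaining, by the VIM, of the first fault information of the NFVI entity comprises:
receiving the first fault information sent by the first NFVI entity; or
determining that the first NFVI entity is faulty, and generating the first fault information according to the fault of the first NFVI entity.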
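3. The method according to claim 1 or 2, wherein the first NFVI entity is any one of a hardware (HW) entity, a host operating system (Host OS) entity, a hypervisor entity, or a virtual machine (VM) entity among the NFVI entities, and the generating, by the VIM, of the first comprehensive fault information according to the first fault information comprises:
determining that fault information sent by an NFVI entity associated with the first NFVI entity is the associated fault information of the first fault information; and
generating the first comprehensive fault information that comprises the first fault information and the associated fault information.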
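4. The method according to claim 1 or 2, wherein the performing, by the VIM, of fault repair or reporting processing according to the first comprehensive fault information comprises:
determining, according to the fault type of the first fault information or the fault type of the associated fault information in the first comprehensive fault information, whether the VIM comprises a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information; and
when the VIM comprises a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information, repairing, according to the fault repair policy, the fault of the first NFVI entity and/or of an NFVI entity associated with the first NFVI entity; or
when the VIM does not comprise a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information, sending the first comprehensive fault information to a VNFM or sending the first comprehensive fault information to an orchestrator.
5. The method according to claim 4, wherein the determining, according to the fault type of the first fault information or the fault type of the associated fault information in the first comprehensive fault information, whether the VIM comprises a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information comprises:
determining, among the first NFVI entity and the NFVI entities associated with the first NFVI entity, the NFVI entity with the highest priority, wherein the priority of HW is higher than that of Host OS, the priority of Host OS is higher than that of the hypervisor, and the priority of the hypervisor is higher than that of VM;
determining, according to the fault type of the highest-priority NFVI entity, whether the VIM comprises a corresponding fault repair policy; and
when the VIM comprises a fault repair policy corresponding to the fault type of the highest-priority NFVI entity, repairing the fault of the highest-priority NFVI entity according to the fault repair policy.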
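6. The method according to claim 4, wherein after the repairing, according to the fault repair policy, of the fault of the first NFVI entity and/or of an NFVI entity associated with the first NFVI entity, the method further comprises:
when the fault is repaired successfully, sending a success indication message to the orchestrator; and
when the fault repair fails, sending the first comprehensive fault information to the VNFM or sending the first comprehensive fault information to the orchestrator.
7. The method according to claim 6, wherein after the sending of the first comprehensive fault information to the VNFM, the method further comprises:
receiving an indication message sent by the VNFM and used to indicate that the VNFM cannot process the first comprehensive fault information; and
sending the first comprehensive fault information to the orchestrator.
8. The method according to claim 6, wherein before the sending of the first comprehensive fault information to the orchestrator, the method further comprises:
requesting, from the VNFM, fault information of a VNF entity associated with the first NFVI entity; and
adding the fault information of the VNF entity associated with the first NFVI entity to the first comprehensive fault information.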
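9. The method according to any one of claims 1 to 8, further comprising:
receiving request information sent by the VNFM, wherein the request information is used to request, from the VIM, fault information of an NFVI entity associated with a faulty VNF entity; and
sending, to the VNFM, the fault information of the NFVI entity associated with the faulty VNF entity.
10. The method according to any one of claims 1 to 8, wherein after the generating, by the VIM, of the first comprehensive fault information according to the first fault information, the method further comprises:
detecting, according to the first comprehensive fault information, whether the VIM already contains comprehensive fault information identical to the first comprehensive fault information; and
when the VIM already contains comprehensive fault information identical to the first comprehensive fault information, deleting the first comprehensive fault information.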
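11. The method according to any one of claims 1 to 8, wherein the first fault information is further reported to an operations support system and business support system (OSS/BSS), so that the OSS/BSS monitors and presents the first fault information.
12. The method according to any one of claims 1 to 11, wherein:
the first fault information further comprises at least one of the following: a running state and a fault time; and
the first comprehensive fault information further comprises fault state information, wherein the fault state comprises at least one of unprocessed, being processed, repaired, and unrepaired.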
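13. A fault management method, comprising:
obtaining, by a virtualized network function manager (VNFM), second fault information of a virtualized network function (VNF) entity, wherein the second fault information comprises a faulty entity identifier and a fault type and is used to indicate that a first VNF entity having the faulty entity identifier is faulty;
generating, by the VNFM, second comprehensive fault information according to the second fault information; and
performing, by the VNFM, fault repair or reporting processing according to the second comprehensive fault information.
14. The method according to claim 13, wherein the obtaining, by the VNFM, of the second fault information of the VNF entity comprises:
receiving the second fault information sent by the first VNF entity; or
determining that the first VNF entity is faulty, and generating the second fault information according to the fault of the first VNF entity.
15. The method according to claim 13 or 14, wherein the generating, by the VNFM, of the second comprehensive fault information according to the second fault information comprises:
determining that fault information sent by a VNF entity associated with the first VNF entity is associated fault information of the second fault information; and
generating the second comprehensive fault information that comprises the second fault information and the associated fault information.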
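16. The method according to claim 13 or 14, wherein the performing, by the VNFM, of fault repair or reporting processing according to the second comprehensive fault information comprises:
determining, according to the fault type of the second fault information or the fault type of the associated fault information in the second comprehensive fault information, whether the VNFM comprises a fault repair policy corresponding to the fault type of the second fault information or the fault type of the associated fault information; and
when the VNFM comprises a fault repair policy corresponding to the fault type of the second fault information or the fault type of the associated fault information, repairing, according to the fault repair policy, the fault of the first VNF entity and/or of a VNF entity associated with the first VNF entity; or
when the VNFM does not comprise a fault repair policy corresponding to the fault type of the second fault information or the fault type of the associated fault information, sending the second comprehensive fault information to an orchestrator.
17. The method according to claim 16, wherein after the repairing, according to the fault repair policy, of the fault of the first VNF entity and/or of a VNF entity associated with the first VNF entity, the method further comprises:
when the fault is repaired successfully, sending a success indication message to the orchestrator; and
when the fault repair fails, sending the second comprehensive fault information to the orchestrator.
18. The method according to claim 17, wherein before the sending of the second comprehensive fault information to the orchestrator, the method further comprises:
requesting, from a virtualized infrastructure manager (VIM), fault information of an NFVI entity associated with the first VNF entity, wherein the NFVI entity is any one of a hardware (HW) entity, a host operating system (Host OS) entity, a hypervisor entity, or a virtual machine (VM) entity in the NFVI; and
adding the fault information of the NFVI entity associated with the first VNF entity to the second comprehensive fault information.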
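19. The method according to claim 13, further comprising:
receiving first comprehensive fault information sent by a VIM, wherein the first comprehensive fault information comprises first fault information and associated fault information of the first fault information, and the first fault information is used to indicate that a first NFVI entity is faulty;
determining whether the VNFM comprises a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information in the first comprehensive fault information; and
when the VNFM comprises a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information, repairing, according to the fault repair policy, the fault of the first NFVI entity and/or of an NFVI entity associated with the first NFVI entity; or
when the VNFM does not comprise a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information, sending the first comprehensive fault information to an orchestrator, or sending, to the VIM, an indication message used to indicate that the VNFM cannot process the first comprehensive fault information, so that the VIM sends the first comprehensive fault information to the orchestrator.
20. The method according to claim 19, wherein after the receiving of the first comprehensive fault information sent by the VIM, the method further comprises:
determining, according to the first comprehensive fault information, fault information of the first VNF entity that is associated with the first NFVI entity and/or with an NFVI entity associated with the first NFVI entity; and
adding the fault information of the first VNF entity to the first comprehensive fault information, so that the VNFM performs repair or reporting processing on the first comprehensive fault information.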
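21. The method according to any one of claims 13 to 20, wherein after the VNFM performs repair or reporting processing according to the second comprehensive fault information, the method further comprises:
detecting, according to the second comprehensive fault information, whether the VNFM already contains comprehensive fault information identical to the second comprehensive fault information; and
when the VNFM already contains comprehensive fault information identical to the second comprehensive fault information, deleting the second comprehensive fault information.
22. The method according to any one of claims 13 to 20, further comprising:
receiving request information sent by the VIM, wherein the request information is used to request, from the VNFM, fault information of a VNF entity associated with a faulty NFVI entity; and
sending, to the VIM, the fault information of the VNF entity associated with the faulty NFVI entity.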
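23. The method according to any one of claims 13 to 20, wherein the second fault information is further reported to an operations support system and business support system (OSS/BSS), so that the OSS/BSS monitors and presents the second fault information.
24. The method according to any one of claims 13 to 23, wherein:
the second fault information further comprises at least one of the following: a running state and a fault time; and
the second comprehensive fault information further comprises fault state information, wherein the fault state comprises at least one of unprocessed, being processed, repaired, and unrepaired.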
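25. A fault management method, comprising:
receiving, by an orchestrator, first comprehensive fault information sent by a virtualized infrastructure manager (VIM), wherein the first comprehensive fault information comprises first fault information, the first fault information comprises a faulty entity identifier and a fault type, and the first fault information is used to indicate that a first network functions virtualization infrastructure (NFVI) entity having the faulty entity identifier is faulty; and
performing, by the orchestrator, fault repair or reporting processing according to the first comprehensive fault information.
26. The method according to claim 25, wherein the first comprehensive fault information further comprises:
fault information of an NFVI entity associated with the first NFVI entity; and/or
fault information of a virtualized network function (VNF) entity associated with the first NFVI entity.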
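27. The method according to claim 25 or 26, wherein the performing, by the orchestrator, of fault repair or reporting processing according to the first comprehensive fault information comprises:
determining, according to the fault type in the first comprehensive fault information, whether the orchestrator comprises a fault repair policy corresponding to the fault type; and
when the orchestrator comprises a fault repair policy corresponding to the fault type, repairing, according to the fault repair policy, the fault of the first NFVI entity and/or of an NFVI entity associated with the first NFVI entity; or
when the orchestrator does not comprise a fault repair policy corresponding to the fault type, sending the first comprehensive fault information to an operations support system and business support system (OSS/BSS).
28. The method according to claim 25 or 26, wherein the performing, by the orchestrator, of fault repair or reporting processing according to the first comprehensive fault information comprises:
determining, according to the fault type in the first comprehensive fault information, whether the orchestrator comprises a fault repair policy corresponding to the fault type; and
when the orchestrator comprises a fault repair policy corresponding to the fault type, repairing, according to the fault repair policy, the fault of the first NFVI entity, the fault of an NFVI entity associated with the first NFVI entity, and the fault of a VNF entity associated with the first NFVI entity; or
when the orchestrator does not comprise a fault repair policy corresponding to the fault type, sending the first comprehensive fault information to the OSS/BSS.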
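29. The method according to any one of claims 25 to 28, wherein before the orchestrator performs fault repair or reporting processing according to the first comprehensive fault information, the method further comprises:
detecting, according to the first comprehensive fault information, whether the orchestrator already contains comprehensive fault information identical to the first comprehensive fault information; and
when the orchestrator already contains comprehensive fault information identical to the first comprehensive fault information, deleting the first comprehensive fault information.
30. The method according to any one of claims 25 to 29, wherein:
the first fault information further comprises at least one of the following: a running state and a fault time; and
the first comprehensive fault information further comprises fault state information, wherein the fault state comprises at least one of unprocessed, being processed, repaired, and unrepaired.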
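31. A fault management method, comprising:
receiving, by an orchestrator, second comprehensive fault information sent by a virtualized network function manager (VNFM), wherein the second comprehensive fault information comprises second fault information, the second fault information comprises a faulty entity identifier and a fault type, and the second fault information is used to indicate that a first virtualized network function (VNF) entity having the faulty entity identifier is faulty; and
performing, by the orchestrator, fault repair or reporting processing according to the second comprehensive fault information.
32. The method according to claim 31, wherein the second comprehensive fault information further comprises:
fault information of a VNF entity associated with the first VNF entity; and/or
fault information of a network functions virtualization infrastructure (NFVI) entity associated with the first VNF entity.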
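33. The method according to claim 31 or 32, wherein the performing, by the orchestrator, of fault repair or reporting processing according to the second comprehensive fault information comprises:
determining, according to the fault type in the second comprehensive fault information, whether the orchestrator comprises a fault repair policy corresponding to the fault type; and
when the orchestrator comprises a fault repair policy corresponding to the fault type, repairing, according to the fault repair policy, the fault of the first VNF entity and/or of a VNF entity associated with the first VNF entity; or
when the orchestrator does not comprise a fault repair policy corresponding to the fault type, sending the second comprehensive fault information to an operations support system and business support system (OSS/BSS).
34. The method according to claim 31 or 32, wherein the performing, by the orchestrator, of fault repair or reporting processing according to the second comprehensive fault information comprises:
determining, according to the fault type in the second comprehensive fault information, whether the orchestrator comprises a fault repair policy corresponding to the fault type; and
when the orchestrator comprises a fault repair policy corresponding to the fault type, repairing, according to the fault repair policy, the fault of the first VNF entity, the fault of a VNF entity associated with the first VNF entity, and the fault of an NFVI entity associated with the first VNF entity; or
when the orchestrator does not comprise a fault repair policy corresponding to the fault type, sending the second comprehensive fault information to the OSS/BSS.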
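35. The method according to any one of claims 31 to 34, wherein before the orchestrator performs fault repair or reporting processing according to the second comprehensive fault information, the method further comprises:
detecting, according to the second comprehensive fault information, whether the orchestrator already contains comprehensive fault information identical to the second comprehensive fault information; and
when the orchestrator already contains comprehensive fault information identical to the second comprehensive fault information, deleting the second comprehensive fault information.
36. The method according to any one of claims 31 to 35, wherein:
the second fault information further comprises at least one of the following: a running state and a fault time; and
the second comprehensive fault information further comprises fault state information, wherein the fault state comprises at least one of unprocessed, being processed, repaired, and unrepaired.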
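37. A virtualized infrastructure manager, comprising:
an obtaining unit, configured to obtain first fault information of a network functions virtualization infrastructure (NFVI) entity, wherein the first fault information comprises a faulty entity identifier and a fault type and is used to indicate that a first NFVI entity having the faulty entity identifier is faulty;
a generating unit, configured to generate first comprehensive fault information according to the first fault information, wherein the first comprehensive fault information comprises the first fault information and associated fault information of the first fault information; and
a processing unit, configured to perform fault repair or reporting processing according to the first comprehensive fault information.
38. The manager according to claim 37, further comprising a determining unit and a receiving unit, wherein the obtaining unit is specifically configured to:
receive, through the receiving unit, the first fault information sent by the first NFVI entity; or
determine, through the determining unit, that the first NFVI entity is faulty, and generate the first fault information according to the fault of the first NFVI entity.
39. The manager according to claim 37 or 38, wherein the first NFVI entity is any one of a hardware (HW) entity, a host operating system (Host OS) entity, a hypervisor entity, or a virtual machine (VM) entity among the NFVI entities, and the generating unit is specifically configured to:
determine, through the determining unit, that fault information sent by an NFVI entity associated with the first NFVI entity is the associated fault information of the first fault information; and
generate the first comprehensive fault information that comprises the first fault information and the associated fault information.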
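40. The manager according to claim 37 or 38, wherein the processing unit comprises a sending unit, and the processing unit is specifically configured to:
determine, through the determining unit and according to the fault type of the first fault information or the fault type of the associated fault information in the first comprehensive fault information, whether the VIM comprises a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information; and
when the VIM comprises a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information, repair, according to the fault repair policy, the fault of the first NFVI entity and/or of an NFVI entity associated with the first NFVI entity; or
when the VIM does not comprise a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information, send, through the sending unit, the first comprehensive fault information to a VNFM or to an orchestrator.
41. The manager according to claim 40, wherein the processing unit is specifically configured to:
determine, through the determining unit, the NFVI entity with the highest priority among the first NFVI entity and the NFVI entities associated with the first NFVI entity, wherein the priority of HW is higher than that of Host OS, the priority of Host OS is higher than that of the hypervisor, and the priority of the hypervisor is higher than that of VM;
determine, through the determining unit and according to the fault type of the highest-priority NFVI entity, whether the VIM comprises a corresponding fault repair policy; and
when the VIM comprises a fault repair policy corresponding to the fault type of the highest-priority NFVI entity, repair the fault of the highest-priority NFVI entity according to the fault repair policy.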
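42. The manager according to claim 40, wherein the sending unit is specifically configured to:
when the fault is repaired successfully, send a success indication message to the orchestrator; and
when the fault repair fails, send the first comprehensive fault information to the VNFM or send the first comprehensive fault information to the orchestrator.
43. The manager according to claim 42, wherein the receiving unit is further configured to:
receive an indication message sent by the VNFM and used to indicate that the VNFM cannot process the first comprehensive fault information; and
the sending unit is further configured to send the first comprehensive fault information to the orchestrator.
44. The manager according to claim 42, wherein the processing unit is further configured to:
request, from the VNFM, fault information of a VNF entity associated with the first NFVI entity; and
add the fault information of the VNF entity associated with the first NFVI entity to the first comprehensive fault information.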
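45. The manager according to any one of claims 37 to 44, wherein the receiving unit is further configured to:
receive request information sent by the VNFM, wherein the request information is used to request, from the VIM, fault information of an NFVI entity associated with a faulty VNF entity; and
the sending unit is further configured to send, to the VNFM, the fault information of the NFVI entity associated with the faulty VNF entity.
46. The manager according to any one of claims 37 to 44, further comprising a detection unit and a deletion unit, wherein the detection unit is specifically configured to:
detect, according to the first comprehensive fault information, whether the VIM already contains comprehensive fault information identical to the first comprehensive fault information; and
the deletion unit is specifically configured to delete the first comprehensive fault information when the VIM already contains comprehensive fault information identical to the first comprehensive fault information.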
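47. A virtualized network function manager, comprising:
an obtaining unit, configured to obtain second fault information of a virtualized network function (VNF) entity, wherein the second fault information comprises a faulty entity identifier and a fault type and is used to indicate that a first VNF entity having the faulty entity identifier is faulty;
a generating unit, configured to generate second comprehensive fault information according to the second fault information; and
a processing unit, configured to perform fault repair or reporting processing according to the second comprehensive fault information.
48. The manager according to claim 47, further comprising a determining unit and a receiving unit, wherein the obtaining unit is specifically configured to:
receive, through the receiving unit, the second fault information sent by the first VNF entity; or
determine, through the determining unit, that the first VNF entity is faulty, and generate, through the generating unit, the second fault information according to the fault of the first VNF entity.
49. The manager according to claim 47 or 48, wherein the generating unit is specifically configured to:
determine, through the determining unit, that fault information sent by a VNF entity associated with the first VNF entity is associated fault information of the second fault information; and
generate the second comprehensive fault information that comprises the second fault information and the associated fault information.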
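50. The manager according to claim 47 or 48, wherein the processing unit comprises a sending unit, and the processing unit is specifically configured to:
determine, through the determining unit and according to the fault type of the second fault information or the fault type of the associated fault information in the second comprehensive fault information, whether the VNFM comprises a fault repair policy corresponding to the fault type of the second fault information or the fault type of the associated fault information; and
when the VNFM comprises a fault repair policy corresponding to the fault type of the second fault information or the fault type of the associated fault information, repair, according to the fault repair policy, the fault of the first VNF entity and/or of a VNF entity associated with the first VNF entity; or
when the VNFM does not comprise a fault repair policy corresponding to the fault type of the second fault information or the fault type of the associated fault information, send, through the sending unit, the second comprehensive fault information to an orchestrator.
51. The manager according to claim 50, wherein the sending unit is specifically configured to:
when the fault is repaired successfully, send a success indication message to the orchestrator; and
when the fault repair fails, send the second comprehensive fault information to the orchestrator.
52. The manager according to claim 51, wherein the processing unit is further configured to:
request, from a virtualized infrastructure manager (VIM), fault information of an NFVI entity associated with the first VNF entity, wherein the NFVI entity is any one of a hardware (HW) entity, a host operating system (Host OS) entity, a hypervisor entity, or a virtual machine (VM) entity in the NFVI; and
add the fault information of the NFVI entity associated with the first VNF entity to the second comprehensive fault information.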
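53. The manager according to claim 47, wherein the processing unit is further configured to:
receive first comprehensive fault information sent by a VIM, wherein the first comprehensive fault information comprises first fault information and associated fault information of the first fault information, and the first fault information is used to indicate that a first NFVI entity is faulty;
determine whether the VNFM comprises a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information in the first comprehensive fault information; and
when the VNFM comprises a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information, repair, according to the fault repair policy, the fault of the first NFVI entity and/or of an NFVI entity associated with the first NFVI entity; or
when the VNFM does not comprise a fault repair policy corresponding to the fault type of the first fault information or the fault type of the associated fault information, send the first comprehensive fault information to an orchestrator, or send, to the VIM, an indication message used to indicate that the VNFM cannot process the first comprehensive fault information, so that the VIM sends the first comprehensive fault information to the orchestrator.
54. The manager according to claim 53, wherein the processing unit is further specifically configured to:
determine, according to the first comprehensive fault information, fault information of the first VNF entity that is associated with the first NFVI entity and/or with an NFVI entity associated with the first NFVI entity; and
add the fault information of the first VNF entity to the first comprehensive fault information, so that the VNFM performs repair or reporting processing on the first comprehensive fault information.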
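55. The manager according to any one of claims 47 to 54, further comprising a detection unit and a deletion unit, wherein the detection unit is specifically configured to:
detect, according to the second comprehensive fault information, whether the VNFM already contains comprehensive fault information identical to the second comprehensive fault information; and
the deletion unit is specifically configured to delete the second comprehensive fault information when the VNFM already contains comprehensive fault information identical to the second comprehensive fault information.
56. The manager according to any one of claims 47 to 54, wherein the receiving unit is further configured to:
receive request information sent by the VIM, wherein the request information is used to request, from the VNFM, fault information of a VNF entity associated with a faulty NFVI entity; and
the sending unit is further configured to send, to the VIM, the fault information of the VNF entity associated with the faulty NFVI entity.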
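57. An orchestrator, comprising:
a receiving unit, configured to receive first comprehensive fault information sent by a virtualized infrastructure manager (VIM), wherein the first comprehensive fault information comprises first fault information, the first fault information comprises a faulty entity identifier and a fault type, and the first fault information is used to indicate that a first network functions virtualization infrastructure (NFVI) entity having the faulty entity identifier is faulty; and
a processing unit, configured to perform fault repair or reporting processing according to the first comprehensive fault information.
58. The orchestrator according to claim 57, wherein the first comprehensive fault information further comprises:
fault information of an NFVI entity associated with the first NFVI entity; and/or
fault information of a virtualized network function (VNF) entity associated with the first NFVI entity.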
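59. The orchestrator according to claim 57 or 58, wherein the processing unit is specifically configured to:
determine, according to the fault type in the first comprehensive fault information, whether the orchestrator comprises a fault repair policy corresponding to the fault type; and
when the orchestrator comprises a fault repair policy corresponding to the fault type, repair, according to the fault repair policy, the fault of the first NFVI entity and/or of an NFVI entity associated with the first NFVI entity; or
when the orchestrator does not comprise a fault repair policy corresponding to the fault type, send the first comprehensive fault information to an operations support system and business support system (OSS/BSS).
60. The orchestrator according to claim 57 or 58, wherein the processing unit is specifically configured to:
determine, according to the fault type in the first comprehensive fault information, whether the orchestrator comprises a fault repair policy corresponding to the fault type; and
when the orchestrator comprises a fault repair policy corresponding to the fault type, repair, according to the fault repair policy, the fault of the first NFVI entity, the fault of an NFVI entity associated with the first NFVI entity, and the fault of a VNF entity associated with the first NFVI entity; or
when the orchestrator does not comprise a fault repair policy corresponding to the fault type, send the first comprehensive fault information to the OSS/BSS.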
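61. The orchestrator according to any one of claims 57 to 60, further comprising a detection unit and a deletion unit, wherein the detection unit is configured to:
detect, according to the first comprehensive fault information, whether the orchestrator already contains comprehensive fault information identical to the first comprehensive fault information; and
the deletion unit is configured to delete the first comprehensive fault information when the orchestrator already contains comprehensive fault information identical to the first comprehensive fault information.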
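62. An orchestrator, comprising:
a receiving unit, configured to receive second comprehensive fault information sent by a virtualized network function manager (VNFM), wherein the second comprehensive fault information comprises second fault information, the second fault information comprises a faulty entity identifier and a fault type, and the second fault information is used to indicate that a first virtualized network function (VNF) entity having the faulty entity identifier is faulty; and
a processing unit, configured to perform fault repair or reporting processing according to the second comprehensive fault information.
63. The orchestrator according to claim 62, wherein the second comprehensive fault information further comprises:
fault information of a VNF entity associated with the first VNF entity; and/or
fault information of a network functions virtualization infrastructure (NFVI) entity associated with the first VNF entity.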
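64. The orchestrator according to claim 62 or 63, wherein the processing unit is specifically configured to:
determine, according to the fault type in the second comprehensive fault information, whether the orchestrator comprises a fault repair policy corresponding to the fault type; and
when the orchestrator comprises a fault repair policy corresponding to the fault type, repair, according to the fault repair policy, the fault of the first VNF entity and/or of a VNF entity associated with the first VNF entity; or
when the orchestrator does not comprise a fault repair policy corresponding to the fault type, send the second comprehensive fault information to an operations support system and business support system (OSS/BSS).
65. The orchestrator according to claim 62 or 63, wherein the processing unit is specifically configured to:
determine, according to the fault type in the second comprehensive fault information, whether the orchestrator comprises a fault repair policy corresponding to the fault type; and
when the orchestrator comprises a fault repair policy corresponding to the fault type, repair, according to the fault repair policy, the fault of the first VNF entity, the fault of a VNF entity associated with the first VNF entity, and the fault of an NFVI entity associated with the first VNF entity; or
when the orchestrator does not comprise a fault repair policy corresponding to the fault type, send the second comprehensive fault information to the OSS/BSS.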
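66. The orchestrator according to any one of claims 62 to 65, further comprising a detection unit and a deletion unit, wherein the detection unit is configured to:
detect, according to the second comprehensive fault information, whether the orchestrator already contains comprehensive fault information identical to the second comprehensive fault information; and
the deletion unit is configured to delete the second comprehensive fault information when the orchestrator already contains comprehensive fault information identical to the second comprehensive fault information.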
Cited By (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106464533A (zh) * 2015-04-09 2017-02-22 华为技术有限公司 基于网络功能虚拟化的故障处理方法和装置
US10608871B2 (en) 2015-04-09 2020-03-31 Huawei Technologies Co., Ltd. Network functions virtualization based fault processing method and apparatus
CN106301828A (zh) * 2015-05-21 2017-01-04 中兴通讯股份有限公司 一种虚拟化网络功能业务故障的处理方法及装置
WO2016184021A1 (zh) * 2015-05-21 2016-11-24 中兴通讯股份有限公司 一种虚拟化网络功能业务故障的处理方法及装置
CN106330501A (zh) * 2015-06-26 2017-01-11 中兴通讯股份有限公司 一种故障关联方法和装置
US10644952B2 (en) 2015-06-30 2020-05-05 Huawei Technologies Co., Ltd. VNF failover method and apparatus
US10701139B2 (en) 2015-07-20 2020-06-30 Huawei Technologies Co., Ltd. Life cycle management method and apparatus
WO2017012381A1 (zh) * 2015-07-20 2017-01-26 华为技术有限公司 一种生命周期管理方法及装置
CN108141375B (zh) * 2015-08-10 2021-10-29 诺基亚通信公司 云部署中的自动征兆数据收集
JP2018525929A (ja) * 2015-08-10 2018-09-06 ノキア ソリューションズ アンド ネットワークス オサケユキチュア クラウド展開における自動兆候データ収集
CN108141375A (zh) * 2015-08-10 2018-06-08 诺基亚通信公司 云部署中的自动征兆数据收集
CN106533714A (zh) * 2015-09-09 2017-03-22 中兴通讯股份有限公司 重新实例化虚拟网络功能的方法和装置
JP2018533280A (ja) * 2015-09-22 2018-11-08 華為技術有限公司Huawei Technologies Co.,Ltd. トラブルシューティング方法及び装置
US10601643B2 (en) 2015-09-22 2020-03-24 Huawei Technologies Co., Ltd. Troubleshooting method and apparatus using key performance indicator information
US11709737B2 (en) 2015-11-02 2023-07-25 Apple Inc. Restoring virtual network function (VNF) performance via VNF reset of lifecycle management
TWI743052B (zh) * 2015-11-02 2021-10-21 美商蘋果公司 透過生命週期管理之虛擬網路功能(vnf)重設以恢復vnf之性能
EP3726385A1 (en) * 2015-11-02 2020-10-21 Intel IP Corporation Restoring virtual network function (vnf) performance via vnf reset of lifecycle management
WO2017078790A1 (en) * 2015-11-02 2017-05-11 Intel IP Corporation Restoring virtual network function (vnf) performance via vnf reset of lifecycle management
TWI806198B (zh) * 2015-11-02 2023-06-21 美商蘋果公司 透過生命週期管理之虛擬網路功能(vnf)重設以恢復vnf之性能
EP4050486A1 (en) * 2015-11-02 2022-08-31 Apple Inc. Restoring virtual network function (vnf) performance via vnf reset of lifecycle management
US11340994B2 (en) 2015-11-02 2022-05-24 Apple Inc. Restoring virtual network function (VNF) performance via VNF reset of lifecycle management
US10901852B2 (en) 2015-11-02 2021-01-26 Apple Inc. Restoring virtual network function (VNF) performance via VNF reset of lifecycle management
CN106878096B (zh) * 2015-12-10 2019-12-06 中国电信股份有限公司 Vnf状态检测通告方法、装置以及系统
CN106878096A (zh) * 2015-12-10 2017-06-20 中国电信股份有限公司 Vnf状态检测通告方法、装置以及系统
EP3386170A4 (en) * 2015-12-31 2018-12-26 Huawei Technologies Co., Ltd. Fault processing method, device and system
US11032130B2 (en) 2015-12-31 2021-06-08 Huawei Technologies Co., Ltd. Troubleshooting method, apparatus, and system
WO2017157903A1 (en) * 2016-03-14 2017-09-21 Nokia Solutions And Networks Oy End-to-end virtualized network function healing
CN109074287A (zh) * 2016-05-04 2018-12-21 阿尔卡特朗讯公司 基础设施资源状态
CN109074287B (zh) * 2016-05-04 2023-06-27 阿尔卡特朗讯公司 基础设施资源状态
US10083098B1 (en) 2016-06-07 2018-09-25 Sprint Communications Company L.P. Network function virtualization (NFV) virtual network function (VNF) crash recovery
JP2018025968A (ja) * 2016-08-10 2018-02-15 日本電信電話株式会社 復旧制御システム及び方法
CN109565446B (zh) * 2016-08-31 2020-12-25 华为技术有限公司 一种告警信息上报方法及装置
US10735253B2 (en) 2016-08-31 2020-08-04 Huawei Technologies Co., Ltd. Alarm information reporting method and apparatus
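CN109565446A (zh) * 2016-08-31 2019-04-02 Huawei Technologies Co., Ltd. Alarm information reporting method and apparatus
WO2018040042A1 (zh) * 2016-08-31 2018-03-08 Huawei Technologies Co., Ltd. Alarm information reporting method and apparatus
CN107623596A (zh) * 2017-09-15 2018-01-23 Zhengzhou Yunhai Information Technology Co., Ltd. Method for locating and troubleshooting faults by starting a test network element in an NFV platform
CN109995568B (zh) * 2018-01-02 2022-03-29 China Mobile Communications Co., Ltd. Research Institute Fault linkage processing method, network element, and storage medium
CN109995568A (zh) * 2018-01-02 2019-07-09 China Mobile Communications Co., Ltd. Research Institute Fault linkage processing method, network element, and storage medium
CN109995569B (zh) * 2018-01-02 2022-06-03 China Mobile Communications Co., Ltd. Research Institute Fault linkage processing method, network element, and storage medium
CN109995569A (zh) * 2018-01-02 2019-07-09 China Mobile Communications Co., Ltd. Research Institute Fault linkage processing method, network element, and storage medium
CN110601905A (zh) * 2019-09-29 2019-12-20 Suzhou Inspur Intelligent Technology Co., Ltd. Fault detection method and apparatus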

Also Published As

Publication number Publication date
RU2016117218A (ru) 2017-11-14
EP3024174A1 (en) 2016-05-25
JP2016533655A (ja) 2016-10-27
JP6212207B2 (ja) 2017-10-11
BR112016006902B1 (pt) 2022-10-04
EP3024174A4 (en) 2016-08-17
BR112016006902A2 (zh) 2017-09-19
CN104685830B (zh) 2018-03-06
KR101908465B1 (ko) 2018-12-10
CN108418711B (zh) 2021-05-18
CN108418711A (zh) 2018-08-17
US20160224409A1 (en) 2016-08-04
EP3322125B1 (en) 2019-11-13
EP3322125A1 (en) 2018-05-16
CN104685830A (zh) 2015-06-03
EP3024174B1 (en) 2017-11-22
US10073729B2 (en) 2018-09-11
KR20160060741A (ko) 2016-05-30
RU2644146C2 (ru) 2018-02-07

Similar Documents

Publication Publication Date Title
WO2015042937A1 (zh) Fault management method, entity, and system
US11729044B2 (en) Service resiliency using a recovery controller
JP6443895B2 (ja) Fault management method, virtualized network function manager (VNFM), and program
US11321197B2 (en) File service auto-remediation in storage systems
CN108039964B (zh) Fault handling method, apparatus, and system based on network function virtualization
US9652326B1 (en) Instance migration for rapid recovery from correlated failures
US8615676B2 (en) Providing first field data capture in a virtual input/output server (VIOS) cluster environment with cluster-aware VIOSes
US9450700B1 (en) Efficient network fleet monitoring
WO2017114325A1 (zh) Fault processing method, apparatus, and system
US9489230B1 (en) Handling of virtual machine migration while performing clustering operations
WO2017050130A1 (zh) Fault recovery method and apparatus
CN106936616B (zh) Backup communication method and apparatus
US20120066678A1 (en) Cluster-aware virtual input/output server
US11706080B2 (en) Providing dynamic serviceability for software-defined data centers
WO2015109955A1 (zh) Method and apparatus for handling abnormal events in a telecom cloud
US20070174723A1 (en) Sub-second, zero-packet loss adapter failover
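JP5642725B2 (ja) Performance analysis apparatus, performance analysis method, and performance analysis program
WO2017041671A1 (zh) Fault recovery method and apparatus
WO2017107014A1 (zh) Network sub-health diagnosis method and apparatus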
WO2012164418A1 (en) Facilitating processing in a communications environment using stop signaling
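WO2022218346A1 (zh) Fault handling method and apparatus
RU2672184C1 (ru) Method, apparatus, and system for fault handling management
JP4945774B2 (ja) Disk array apparatus and method for collecting fault information data of a transport control processor core
WO2024067051A1 (zh) Multi-AZ arbitration system and method
CN105515667A (zh) High-availability computer system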

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 13894185

Country of ref document: EP

Kind code of ref document: A1

REEP Request for entry into the european phase

Ref document number: 2013894185

Country of ref document: EP

WWE WIPO information: entry into national phase

Ref document number: 2013894185

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2016517300

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112016006902

Country of ref document: BR

ENP Entry into the national phase

Ref document number: 20167010730

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2016117218

Country of ref document: RU

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 112016006902

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20160329