WO2016145653A1 - Fault processing method and device based on network function virtualization - Google Patents

Fault processing method and device based on network function virtualization

Info

Publication number
WO2016145653A1
Authority
WO
WIPO (PCT)
Prior art keywords
fault
processing
information
policy
function
Prior art date
Application number
PCT/CN2015/074580
Other languages
English (en)
French (fr)
Inventor
刘建宁
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2015/074580 (WO2016145653A1)
Priority to CN201580035071.8A (CN106464541B)
Publication of WO2016145653A1
Priority to US15/708,388 (US10565047B2)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/079Root cause analysis, i.e. error or fault diagnosis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0709Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0712Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a virtual computing platform, e.g. logically partitioned systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793Remedial or corrective actions
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0631Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0654Management of faults, events, alarms or notifications using network fault recovery
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0677Localisation of faults
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0681Configuration of triggering conditions
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08Configuration management of networks or network elements
    • H04L41/0893Assignment of logical groups to network elements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08Configuration management of networks or network elements
    • H04L41/0894Policy-based network configuration management
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/40Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using virtualisation of network functions or resources, e.g. SDN or NFV entities

Definitions

  • the present invention relates to the field of communications technologies, and in particular, to a fault processing method and device based on Network Function Virtualization (NFV).
  • Network Function Virtualization (NFV) was initiated by 13 major telecom operators around the world, joined by many equipment vendors and information technology (IT) vendors, as an organization that defines the requirements for virtualizing carrier network functions and produces related technical reports. It aims to borrow from IT virtualization technology, using general-purpose, high-performance, large-capacity servers, switches, and storage to implement some network functions in software.
  • various types of network devices such as servers, routers, storage devices, content delivery networks (CDNs), switches, etc.
  • The existing NFV network architecture includes: Network Function Virtualization Infrastructure (NFVI), Virtualized Network Function (VNF), Element Management (EM), Virtual Network Function Manager (VNF Manager, VNFM), Virtual Infrastructure Manager (VIM), NFV Orchestrator (NFVO), Operations Support System (OSS), and Business Support System (BSS).
  • the VNF runs on the NFVI
  • one EM corresponds to one or more VNFs.
  • a VNF usually includes a primary VNF and a standby VNF.
  • When service transmission starts, the primary VNF is used; when the primary VNF fails, the service on the primary VNF is switched to the standby VNF to ensure normal operation of the service.
  • However, a fault on the primary VNF may also have other causes, such as other VNFs associated with the primary VNF. When the fault is not caused by the primary VNF itself, starting the standby VNF in its place cannot eliminate the fault at its root; it only wastes valuable resources and increases the service processing burden of the management entity.
  • The existing VNF fault association method mainly works as follows: the VNF entity sends the fault information to the VNFM, and the VNFM associates the VNF with the other VNFs in its management scope. If the root cause of the fault is found, the VNFM is triggered to recover from the fault. If the root cause is not found, the VNFM sends the fault information to the NFVO, and the NFVO performs root cause analysis for the entire network service (Network Service, NS for short). If necessary, the fault information can be sent on to the OSS, which performs further fault correlation. Finally, the fault correlation result is returned to the VNFM, and the VNFM performs fault recovery.
  • Multiple function nodes may therefore be needed to perform fault association, which takes a long time. If the carried service has strict delay requirements, fault association may seriously affect service continuity, or even interrupt the service, and degrade the user experience.
  • the present invention provides a fault processing method and device based on network function virtualization to ensure that services are not interrupted during fault processing and improve user experience.
  • a first aspect of the present invention provides a fault processing method based on network function virtualization, which may include:
  • the first function management entity acquires fault information of a functional entity;
  • the first function management entity triggers fault correlation processing according to the fault information, and formulates a fault processing policy according to the fault correlation processing result;
  • when the fault processing time arrives, if the fault processing policy has been formulated, the first function management entity processes the fault according to the fault processing policy; if the fault processing policy has not been formulated, the first function management entity processes the fault according to a preset fault processing policy, where the preset fault processing policy is a policy formulated for a fault caused by the functional entity itself.
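The deadline rule of the first aspect can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patent's implementation: the `Policy` type, its fields, the example action names, and `select_policy_at_deadline` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Policy:
    # Hypothetical representation of a fault processing policy.
    action: str   # illustrative value, e.g. "switch_to_standby"
    origin: str   # "correlation" or "preset"


def select_policy_at_deadline(correlated: Optional[Policy], preset: Policy) -> Policy:
    """Pick the policy to apply when the fault processing time arrives.

    `correlated` is None if fault correlation has not yet produced a policy;
    in that case the preset policy (for a fault caused by the functional
    entity itself) is used so that the service is not left waiting.
    """
    if correlated is not None:
        return correlated
    return preset


preset = Policy(action="switch_to_standby", origin="preset")
chosen = select_policy_at_deadline(None, preset)  # correlation not finished
print(chosen.origin)
```

The point of the rule is that the decision at the deadline never blocks: one of the two policies is always available.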
  • Triggering the fault association processing and formulating the fault processing policy according to the fault association processing result may include: the first function management entity acquires fault information of other functional entities in its management scope, and performs local fault association processing on the functional entity and the other functional entities in the management scope to obtain local fault association information; the first function management entity formulates the fault processing policy according to the local fault association information.
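The source does not say how local fault association is computed. As one possible reading, the sketch below assumes a hypothetical dependency map between entities in the management scope and treats a faulty entity as a root cause when none of the entities it depends on is faulty. All names, and the policy strings, are illustrative assumptions.

```python
from typing import Dict, List, Set


def local_root_causes(faulty: Set[str], depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return faulty entities whose fault is not explained by a faulty dependency.

    `depends_on` maps an entity ID to the IDs it relies on. This input is an
    assumption; the source does not say how associations are recorded.
    """
    roots = set()
    for entity in faulty:
        deps = depends_on.get(entity, [])
        if not any(dep in faulty for dep in deps):
            roots.add(entity)
    return roots


def policy_from_local_association(entity: str, roots: Set[str]) -> str:
    """Hypothetical policy choice based on the local association result."""
    if entity in roots:
        return f"recover:{entity}"  # fault caused by the entity itself
    if roots:
        return "recover:" + ",".join(sorted(roots))  # address the root cause instead
    return "escalate"  # no local root cause found: hand off for wider correlation


deps = {"vnf-a": ["vnf-b"], "vnf-b": []}
roots = local_root_causes({"vnf-a", "vnf-b"}, deps)
print(policy_from_local_association("vnf-a", roots))
```

Here `vnf-a` depends on the faulty `vnf-b`, so the policy targets `vnf-b` rather than starting a standby for `vnf-a`.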
  • Triggering the fault association processing and formulating the fault processing policy according to the fault association processing result may include: the first function management entity acquires fault information of other functional entities in its management scope, and performs local fault association processing on the functional entity and the other functional entities in the management scope to obtain local fault association information; the first function management entity sends the fault information and the local fault association information to a second function management entity, so that the second function management entity acquires fault information of the network function virtualization infrastructure (NFVI) and fault information of other functional entities under the network service (NS), and performs external fault association processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault association information; the first function management entity receives the external fault association information sent by the second function management entity, and formulates the fault processing policy according to the external fault association information.
  • Formulating the fault processing policy according to the external fault association information includes: the first function management entity formulates the fault processing policy according to the preset fault processing policy for processing the fault and the external fault association information.
  • Triggering the fault association processing and formulating the fault processing policy according to the fault association processing result may include: the first function management entity acquires fault information of other functional entities in its management scope, and performs local fault association processing on the functional entity and the other functional entities in the management scope to obtain local fault association information; the first function management entity sends the fault information and the local fault association information to the second function management entity, so that the second function management entity acquires fault information of the network function virtualization infrastructure (NFVI) and fault information of other functional entities under the network service (NS), performs external fault association processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault association information, and formulates the fault processing policy based on the external fault association information; the first function management entity receives the fault processing policy sent by the second function management entity.
  • Formulating the fault processing policy based on the external fault association information includes: the first function management entity sends a timeout notification message to the second function management entity, where the timeout notification message carries the preset fault processing policy for processing the fault, so that the second function management entity formulates the fault processing policy according to the preset fault processing policy and the external fault association information.
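The timeout notification exchange can be illustrated as follows. The source only states that the message carries the preset fault processing policy; the field names, the message layout, and the way the second function management entity combines the two inputs are assumptions made for this sketch.

```python
from typing import Any, Dict, Set


def build_timeout_notification(fault_id: str, preset_policy: str) -> Dict[str, Any]:
    """Message sent by the first function management entity at the deadline.

    Field names are illustrative only.
    """
    return {
        "type": "timeout_notification",
        "fault_id": fault_id,
        "preset_policy": preset_policy,
    }


def formulate_policy(notification: Dict[str, Any], external_roots: Set[str]) -> Dict[str, Any]:
    """Second function management entity: combine both inputs into one policy.

    Hypothetical combination rule: record which action has already been
    applied under the preset policy, and list the root causes found by
    external fault association that still need recovery.
    """
    return {
        "fault_id": notification["fault_id"],
        "already_applied": notification["preset_policy"],
        "remaining_targets": sorted(external_roots),
    }


note = build_timeout_notification("fault-1", "switch_to_standby")
print(formulate_policy(note, {"nfvi-host-2", "vnf-b"}))
```

Carrying the preset policy in the notification lets the second entity avoid repeating work that the first entity has already done.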
  • The method further includes: after the fault has been processed according to the preset fault processing policy, if formulation of the fault processing policy is completed, the first function management entity determines whether the fault processing policy is the same as the preset fault processing policy; if not, the first function management entity further processes the fault according to the fault processing policy.
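The follow-up check after the preset policy has been applied amounts to a comparison. A minimal sketch, assuming policies can be compared as plain strings (a simplification; the function name is hypothetical):

```python
from typing import Optional


def needs_reprocessing(preset_policy: str, late_policy: Optional[str]) -> bool:
    """Decide whether the fault must be processed again.

    `late_policy` is the policy produced by fault correlation after the
    preset policy was already applied, or None if none has arrived yet.
    """
    if late_policy is None:
        return False  # nothing new to apply
    return late_policy != preset_policy  # reprocess only if the policies differ


print(needs_reprocessing("switch_to_standby", "recover:vnf-b"))
```

If the two policies are identical, the fault has already been handled correctly and no second pass is needed.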
  • a second aspect of the present invention provides a function manager, which may include:
  • an acquiring unit, configured to acquire fault information of a functional entity; and
  • a processing unit, configured to: trigger fault correlation processing according to the fault information acquired by the acquiring unit, and formulate a fault processing policy according to the fault correlation processing result; when the fault processing time arrives, if the fault processing policy has been formulated, process the fault according to the fault processing policy; and if the fault processing policy has not been formulated, process the fault according to a preset fault processing policy, where the preset fault processing policy is a policy formulated for a fault caused by the functional entity itself.
  • The acquiring unit is specifically configured to acquire fault information of other functional entities in the management scope; the processing unit is specifically configured to: perform, according to the fault information and the fault information of the other functional entities in the management scope, local fault association processing on the functional entity and the other functional entities in the management scope to obtain local fault association information; and formulate the fault processing policy according to the local fault association information.
  • The acquiring unit is specifically configured to acquire fault information of other functional entities in the management scope; the processing unit is specifically configured to: perform local fault association processing on the functional entity and the other functional entities in the management scope to obtain local fault association information; send the fault information and the local fault association information to a second function management entity, so that the second function management entity acquires fault information of the network function virtualization infrastructure (NFVI) and fault information of other functional entities under the network service (NS), and performs external fault association processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault association information; and receive the external fault association information sent by the second function management entity, and formulate the fault processing policy according to the external fault association information.
  • The processing unit is specifically configured to: when the fault processing time arrives and the fault processing policy has not been formulated, formulate the fault processing policy according to the preset fault processing policy for processing the fault and the external fault association information.
  • The acquiring unit is specifically configured to acquire fault information of other functional entities in the management scope; the processing unit is specifically configured to: perform local fault association processing on the functional entity and the other functional entities in the management scope to obtain local fault association information; send the fault information and the local fault association information to the second function management entity, so that the second function management entity acquires fault information of the NFVI and fault information of other functional entities under the network service (NS), performs external fault association processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault association information, and formulates the fault processing policy based on the external fault association information; and receive the fault processing policy sent by the second function management entity.
  • The processing unit is specifically configured to: when the fault processing time arrives and the fault processing policy has not been formulated, send a timeout notification message to the second function management entity, where the timeout notification message carries the preset fault processing policy for processing the fault, so that the second function management entity formulates the fault processing policy according to the preset fault processing policy and the external fault association information.
  • The function manager further includes a determining unit, configured to: after the fault has been processed according to the preset fault processing policy, if formulation of the fault processing policy is completed, determine whether the fault processing policy is the same as the preset fault processing policy; the processing unit is further configured to: if the determining unit determines that the fault processing policy differs from the preset fault processing policy, process the fault again according to the fault processing policy.
  • a third aspect of the present invention provides a function manager, including: at least one processor and at least one memory, wherein the processor is connected to the memory through at least one bus;
  • the processor is configured to: acquire fault information of a functional entity, trigger fault association processing according to the fault information, and formulate a fault processing policy according to the fault association processing result; when the fault processing time arrives, if the fault processing policy has been formulated, process the fault according to the fault processing policy; and if the fault processing policy has not been formulated, process the fault according to a preset fault processing policy, where the preset fault processing policy is a policy formulated for a fault caused by the functional entity itself;
  • the memory is configured to store the preset fault processing policy and the fault processing policy.
  • The processor is specifically configured to: acquire fault information of other functional entities in the management scope; perform local fault association processing on the functional entity and the other functional entities in the management scope to obtain local fault association information; and formulate the fault processing policy according to the local fault association information.
  • The function manager is specifically configured to: acquire fault information of other functional entities in the management scope; perform local fault association processing on the functional entity and the other functional entities in the management scope to obtain local fault association information; send the fault information and the local fault association information to a second function management entity, so that the second function management entity acquires fault information of the network function virtualization infrastructure (NFVI) and fault information of other functional entities under the network service (NS), and performs external fault association processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault association information; and receive the external fault association information sent by the second function management entity, and formulate the fault processing policy according to the external fault association information.
  • The function manager is specifically configured to formulate the fault processing policy according to the preset fault processing policy for processing the fault and the external fault association information.
  • The function manager is specifically configured to: acquire fault information of other functional entities in the management scope; perform local fault association processing on the functional entity and the other functional entities in the management scope to obtain local fault association information; send the fault information and the local fault association information to the second function management entity, so that the second function management entity acquires fault information of the network function virtualization infrastructure (NFVI) and fault information of other functional entities under the network service (NS), performs external fault association processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault association information, and formulates the fault processing policy based on the external fault association information; and receive the fault processing policy sent by the second function management entity.
  • The function manager is specifically configured to send a timeout notification message to the second function management entity, where the timeout notification message carries the preset fault processing policy for processing the fault, so that the second function management entity formulates the fault processing policy according to the preset fault processing policy and the external fault association information.
  • The processor is further configured to: if formulation of the fault processing policy is completed, determine whether the fault processing policy is the same as the preset fault processing policy; and if not, process the fault again according to the fault processing policy.
  • In the embodiments of the present invention, after receiving the fault information of the functional entity, the first function management entity triggers fault association processing and formulates a fault processing policy according to the fault association processing result.
  • When the fault processing time arrives, if the fault processing policy has been formulated, the fault is processed according to that policy. If it has not been formulated, the fault is processed according to the preset fault processing policy, which is only a policy formulated for a fault caused by the functional entity itself. Setting a fault processing time therefore ensures that the service is not interrupted during fault handling, which improves the user experience.
  • FIG. 1 is a schematic structural diagram of an NFV network according to an embodiment of the present invention.
  • FIG. 2 is a schematic flowchart of a fault processing method based on network function virtualization according to an embodiment of the present invention
  • FIG. 3 is a schematic flowchart of a fault processing method based on network function virtualization according to another embodiment of the present invention.
  • FIG. 4 is a schematic flowchart of a fault processing method based on network function virtualization according to another embodiment of the present invention.
  • FIG. 5 is a schematic flowchart of a fault processing method based on network function virtualization according to another embodiment of the present invention.
  • FIG. 6 is a schematic flowchart diagram of a fault processing method based on network function virtualization according to another embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of a function manager according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a function manager according to another embodiment of the present invention.
  • the network function virtualization-based fault processing method provided by the embodiment of the invention ensures that the service is not interrupted during the fault processing process, thereby improving the user experience.
  • the embodiment of the present invention further provides a function manager.
  • The Business Support System (BSS)/Operations Support System (OSS) initiates service requests to the NFVO, together with the resources the service requires, and is responsible for fault handling.
  • The network function virtualization orchestrator (NFVO) receives the service requests sent by the BSS/OSS, allocates and manages resources, and monitors VNF and NFVI resources and running status information (for example, fault information) in real time.
  • The virtual network function manager (VNFM) is responsible for VNF lifecycle management, such as startup and lifetime, and for VNF operational status information, such as VNF fault information.
  • The Virtual Infrastructure Manager (VIM) is responsible for managing and allocating NFVI resources, and for monitoring and collecting NFVI operational status information such as fault information.
  • The element management (EM) is responsible for managing the VNF, including performance monitoring and service configuration of the VNF.
  • Virtual network function (VNF): for example, a Mobility Management Entity (MME), a virtual Packet Data Network Gateway (PGW), a virtual network switch (vSwitch), a firewall, etc.
  • The network service (NS) catalog stores all uploaded NSs and supports the creation and management of deployment templates such as the Network Service Descriptor (NSD), VLD, and VNFFGD.
  • The virtual network function (VNF) catalog stores all uploaded VNF packages and supports the creation and management of the virtual network function descriptor (VNF Descriptor, VNFD for short), software images, and other lists.
  • NFV instances: stores information about all VNF entities and NS entities.
  • NFVI resources: stores information about available, reserved, and allocated NFVI resources.
  • VNF lifecycle management and exchange of configuration information.
  • VNF lifecycle management: requesting resources, sending configuration information, and collecting state information.
  • Nf-Vi: specific resource allocation, exchange of virtual resource status information, and hardware resource configuration.
  • Os-Ma: VNF lifecycle management, service graph lifecycle management, policy management, etc.
  • Vn-Nf: used by the NFVI to provide the actual execution environment to the VNF.
  • A fault processing method based on network function virtualization may include: a first function management entity (for example, a VNFM) acquires fault information of a functional entity (for example, a VNF); the first function management entity triggers fault correlation processing according to the fault information, and formulates a fault processing policy according to the fault correlation processing result; when the fault processing time arrives, if the fault processing policy has been formulated, the first function management entity processes the fault according to the fault processing policy; if the fault processing policy has not been formulated, the first function management entity processes the fault according to a preset fault processing policy, where the preset fault processing policy is a policy formulated for a fault caused by the functional entity itself.
  • FIG. 2 is a schematic flowchart diagram of a fault processing method based on network function virtualization according to an embodiment of the present invention.
  • the fault processing method based on network function virtualization provided by an embodiment of the present invention may include:
  • the first function management entity acquires fault information of the function entity.
  • The first function management entity may acquire the fault information of the functional entity in several ways. For example, it may directly receive the fault information sent by the functional entity, or it may receive fault information about the functional entity sent by another function management entity that is able to monitor faults of the functional entity.
  • the fault information includes at least: an identifier (ID), a fault type, and fault data.
  • the first function management entity may be a VNFM and the functional entity is a VNF.
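The three mandatory fields of the fault information (ID, fault type, fault data) can be pictured as a small record. The following is a minimal sketch only; the class name `FaultInfo`, the field names, the `source` field and the dictionary message format are assumptions made for illustration and do not come from the description.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class FaultInfo:
    """Minimal fault report; all names here are illustrative assumptions."""
    fault_id: str                                              # identifier (ID)
    fault_type: str                                            # fault type
    fault_data: Dict[str, Any] = field(default_factory=dict)   # fault data
    source: str = "VNF"                                        # hypothetical: reporting entity


def parse_fault_info(message: Dict[str, Any]) -> FaultInfo:
    """Build a FaultInfo, rejecting messages that lack a mandatory field."""
    required = ("fault_id", "fault_type", "fault_data")
    missing = [key for key in required if key not in message]
    if missing:
        raise ValueError(f"fault information is missing fields: {missing}")
    return FaultInfo(
        message["fault_id"],
        message["fault_type"],
        dict(message["fault_data"]),
        message.get("source", "VNF"),
    )
```

A receiver such as the VNFM could call `parse_fault_info` on each incoming report, whether it arrives from the functional entity directly or from another function management entity.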
  • the first function management entity triggers a fault correlation process according to the fault information, and formulates a fault processing policy according to the fault association processing result.
  • The first function management entity triggers the fault correlation processing and the formulation of the fault handling policy. It may itself perform the fault correlation processing and formulate the fault handling policy; alternatively, another function management entity may do so, or the first function management entity may cooperate with other function management entities to do so.
  • When the fault processing time arrives, if the fault handling policy has been formulated, the first function management entity processes the fault according to the fault handling policy; if the fault handling policy has not been formulated, the first function management entity processes the fault according to the preset fault handling policy, which is a policy formulated for faults caused by the functional entity itself.
  • The fault processing time in the present invention is determined based on the minimum latency requirement of service continuity, to ensure that the service is not interrupted within the fault processing time.
  • Whether the fault processing time has been reached can be determined by means of a timer.
  • For example, a timer can be set in the first function management entity. After receiving the fault information, the first function management entity triggers formulation of the fault handling policy and starts the timer, and then determines whether the fault processing time has been reached by checking whether the timer value has reached the fault processing time.
  • Alternatively, a timer is set in another function management entity that is capable of monitoring faults of the functional entity. When a fault of the functional entity is detected, that entity starts the timer; when the timer value reaches the fault processing time, it sends a timeout notification to the first function management entity. After receiving the timeout notification, the first function management entity determines whether the fault handling policy has been formulated and executes the corresponding policy.
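The timer check and the choice made at the deadline can be sketched as follows. This is an illustrative sketch under stated assumptions: `FaultTimer`, `policy_at_deadline` and the injectable `clock` parameter are hypothetical names, and the policies are represented as plain strings.

```python
import time
from typing import Callable, Optional


class FaultTimer:
    """Counts up from the moment it is started (hypothetical helper)."""

    def __init__(self, processing_time_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.processing_time_s = processing_time_s
        self._clock = clock
        self._started_at: Optional[float] = None

    def start(self) -> None:
        # Called when fault information is received or a fault is detected.
        self._started_at = self._clock()

    def expired(self) -> bool:
        # True once the counted value has reached the fault processing time.
        if self._started_at is None:
            return False
        return self._clock() - self._started_at >= self.processing_time_s


def policy_at_deadline(formulated_policy: Optional[str], preset_policy: str) -> str:
    """Use the formulated policy if it is ready, otherwise the preset one."""
    return formulated_policy if formulated_policy is not None else preset_policy
```

Passing the clock in as a parameter is a design choice of this sketch: it lets the same logic run whether the timer lives in the first function management entity or in another entity that only sends a timeout notification.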
  • the preset fault handling strategy is mainly a strategy formulated for faults generated by the functional entity itself.
  • the preset fault processing policy is pre-stored in the first function management entity, and may be stored in a list manner, and the list includes at least: a fault ID, a fault type, and a preset fault handling policy, as shown in Table 1:
  • a matching preset fault handling strategy is found from the list by the fault ID and/or the fault type included in the fault information.
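Matching by fault ID and/or fault type can be sketched as a lookup over rows of the list. The rows below are placeholders invented for this sketch (the contents of Table 1 are not reproduced here), and `find_preset_policy` is a hypothetical helper name.

```python
from typing import List, Optional, Tuple

# Rows of (fault ID, fault type, preset fault handling policy).
# Every row is an illustrative placeholder, not data from the description.
PRESET_POLICIES: List[Tuple[str, str, str]] = [
    ("ID1", "type-a", "restart the VNF"),
    ("ID2", "type-b", "create a new VNF; migrate the service to the new VNF"),
]


def find_preset_policy(fault_id: Optional[str] = None,
                       fault_type: Optional[str] = None) -> Optional[str]:
    """Return the policy of the first row matching every supplied criterion."""
    if fault_id is None and fault_type is None:
        return None
    for row_id, row_type, policy in PRESET_POLICIES:
        if fault_id is not None and row_id != fault_id:
            continue
        if fault_type is not None and row_type != fault_type:
            continue
        return policy
    return None
```

Supplying only one of the two arguments corresponds to matching on the fault ID or the fault type alone; supplying both requires the row to match on both.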
  • the fault handling policy may also be stored in a list manner.
  • the fault processing policy is a policy that is formulated according to the fault association processing result after the fault association is performed. It is primarily a strategy for association failures caused by other functional entities, NFVI, hardware, and the like. As shown in Table 2 below:
  • After acquiring the fault information of the functional entity, the first function management entity triggers fault correlation processing and obtains the fault correlation processing result. If the fault handling policy has been formulated when the fault processing time arrives, that policy is used to process the fault, which handles the fault comprehensively and resolves it at the root cause. If the fault handling policy has not been formulated, the fault is processed using the corresponding preset fault handling policy for faults caused by the functional entity itself, so as to meet the minimum latency requirement of service continuity. This ensures that the service is not interrupted during fault handling and improves the user experience.
  • The first function management entity triggering formulation of the fault handling policy may include: the first function management entity acquires fault information of other functional entities within its management scope, and performs local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault association information; the first function management entity then formulates the fault handling policy according to the local fault association information.
  • The first function management entity manages a plurality of functional entities and may receive fault information sent by several of them at the same time.
  • When fault correlation is performed for the functional entity, it is determined whether other functional entities within the management scope have also failed. If so, their fault information is obtained, and local fault correlation processing is then performed on the functional entity and the other failed functional entities according to all of the fault information. If the local fault correlation processing finds that the fault was caused by a fault of another functional entity, the fault handling policy can be formulated according to the resulting local fault association information.
  • Fault correlation is performed between a given functional entity and the other functional entities within the management scope. If it is determined that the fault was caused by another functional entity, the fault originates in that other functional entity itself. In this case, once the fault of the other functional entity has been processed using the preset fault handling policy, the functional entity returns to normal; that is, the functional entity itself may not need fault processing.
  • For example, fault correlation between functional entity A and functional entity B shows that the fault of functional entity A was caused by a fault of functional entity B. After functional entity B is processed using the preset fault handling policy, functional entity A returns to normal.
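The A/B example can be illustrated with a small correlation routine. The description does not say how correlation is computed; this sketch assumes a known dependency map between functional entities (a hypothetical input) and treats a faulty entity as a root fault when none of the entities it depends on is also faulty.

```python
from typing import Dict, Set


def local_fault_correlation(faulty: Set[str],
                            depends_on: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each faulty entity to the faulty entities it depends on.

    An empty set means no other faulty entity in the management scope
    explains the fault, so the entity is treated as a root fault.
    """
    return {entity: depends_on.get(entity, set()) & faulty
            for entity in sorted(faulty)}


def root_faults(association: Dict[str, Set[str]]) -> Set[str]:
    """Entities whose fault is not attributed to another faulty entity."""
    return {entity for entity, causes in association.items() if not causes}
```

With `faulty = {"A", "B"}` and `depends_on = {"A": {"B"}}`, only B is reported as a root fault, which matches the case where handling B with its preset policy lets A return to normal. Circular dependencies are not handled by this sketch.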
  • Triggering formulation of the fault handling policy may include: the first function management entity acquires fault information of other functional entities within its management scope, and performs local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault association information; the first function management entity sends the fault information and the local fault association information to a second function management entity, so that the second function management entity acquires fault information of the NFVI and fault information of other functional entities under the NS, and performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault association information; the first function management entity receives the external fault association information sent by the second function management entity, and formulates the fault handling policy according to the external fault association information.
  • The first function management entity performs local fault correlation processing on the functional entity and the other failed functional entities within its management scope, and then sends the fault information together with the local fault association information to the second function management entity. The second function management entity is the upper-layer function management entity of the first function management entity and can perform fault correlation on all faulty functional entities under the NS and on the NFVI.
  • The second function management entity performs fault correlation processing on the functional entity indicated by the fault information, the other faulty functional entities under the NS, and the NFVI. If the root fault still cannot be identified, it further sends the fault information, the local fault association information, and the NS fault association information to a third function management entity. The third function management entity performs correlation processing on the functional entity and the hardware to obtain the external fault association information, and sends the external fault association information to the second function management entity.
  • When the fault processing time has arrived and the fault has been processed using the preset fault handling policy, the final fault handling policy is also formulated with reference to that preset policy; that is, the final fault handling policy is formulated based on the fault information, the local fault association information, the NS fault association information, and the hardware fault information.
  • the first function management entity formulates a fault handling policy.
  • the second function management entity may formulate a fault handling policy, and then send the fault handling policy to the first function management entity.
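The level-by-level escalation (first, second, then third function management entity) can be sketched as a chain that stops as soon as one level identifies the root fault. The `escalate` function, the `Level` type and the dictionary keys are assumptions made for this sketch; each level's correlation logic is a stand-in callable.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A level is (name, correlate). correlate receives the information gathered so
# far and returns (additional association information, root cause or None).
Level = Tuple[str, Callable[[Dict], Tuple[Dict, Optional[str]]]]


def escalate(fault_info: Dict, levels: List[Level]) -> Tuple[Dict, Optional[str], List[str]]:
    """Pass accumulated information upward until a root cause is found."""
    gathered: Dict = {"fault": fault_info}
    visited: List[str] = []
    for name, correlate in levels:
        visited.append(name)
        extra, root = correlate(gathered)
        gathered.update(extra)          # forwarded to the next level up
        if root is not None:
            return gathered, root, visited
    return gathered, None, visited
```

For instance, the levels could be named after the first, second and third function management entities, with the information returned by each level standing for the local, NS and external fault association information respectively.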
  • Triggering formulation of the fault handling policy may include: the first function management entity acquires fault information of other functional entities within its management scope, and performs local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault association information; the first function management entity sends the fault information and the local fault association information to the second function management entity, so that the second function management entity acquires fault information of the NFVI and fault information of other functional entities under the NS, performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault association information, and formulates the fault handling policy based on the external fault association information; the first function management entity receives the fault handling policy sent by the second function management entity.
  • The second function management entity performs fault correlation processing on the functional entity indicated by the fault information, the other faulty functional entities under the NS, and the NFVI. If the root fault still cannot be identified, it further sends the fault information, the local fault association information, and the NS fault association information to the third function management entity. The third function management entity performs correlation processing on the functional entity or the hardware to obtain the external fault association information, formulates the fault handling policy according to the external fault association information, and sends the fault handling policy to the second function management entity.
  • The first function management entity further sends a timeout notification message to the second function management entity. The timeout notification message carries the preset fault handling policy used to process the fault, so that the second function management entity formulates the fault handling policy according to the preset fault handling policy and the external fault association information.
  • When the fault processing time has arrived and the fault has been processed using the preset fault handling policy, the final fault handling policy is also formulated with reference to that preset policy; that is, the final fault handling policy is formulated based on the fault information, the local fault association information, the NS fault association information, and the hardware fault information.
  • The method further includes: after the fault has been processed according to the preset fault handling policy, if the fault handling policy is subsequently formulated, the first function management entity determines whether the fault handling policy is the same as the preset fault handling policy; if not, the first function management entity further processes the fault according to the fault handling policy.
  • When the fault processing time arrives, if the fault handling policy has not yet been formulated, the fault is first processed according to the preset fault handling policy to meet the minimum latency requirement of service continuity, which prevents a service interruption caused by the fault remaining unprocessed beyond the allowed delay. Meanwhile, formulation of the fault handling policy continues. Once the fault handling policy has been formulated, if it differs from the preset fault handling policy that was applied, the root cause of the fault has not yet been resolved, and the fault needs to be further processed according to the fault handling policy so that it is resolved at the root cause and the service runs normally.
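The comparison step can be written as a short decision routine. This is a sketch under stated assumptions: `handle_fault` is a hypothetical name, policies are plain strings, and `wait_for_final_policy` stands in for whatever mechanism delivers the late-formulated policy.

```python
from typing import Callable, List, Optional


def handle_fault(preset_policy: str,
                 policy_at_deadline: Optional[str],
                 wait_for_final_policy: Callable[[], str]) -> List[str]:
    """Return the policies that are executed, in order."""
    if policy_at_deadline is not None:
        # Formulation finished in time: only the formulated policy runs.
        return [policy_at_deadline]
    executed = [preset_policy]                # keep the service running first
    final_policy = wait_for_final_policy()    # formulation continues meanwhile
    if final_policy != preset_policy:
        executed.append(final_policy)         # root cause not yet addressed
    return executed
```

When the late policy turns out to be identical to the preset one, nothing further is executed, since the fault has already been handled in the same way.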
  • FIG. 3 is a schematic flowchart diagram of a fault processing method based on network function virtualization according to another embodiment of the present invention.
  • In the fault processing method shown in FIG. 3, a timer is set in the VNFM and the fault handling policy is formulated by the VNFO.
  • The fault processing method based on network function virtualization may include:
  • the VNF sends the fault information of the VNF to the VNFM, where the fault information includes at least a fault ID, a fault type, and fault data.
  • the VNF interacts with the VNFM through the above-mentioned interface Ve-Vnfm.
  • the VNFM receives the fault information of the VNF, and starts a timer.
  • A timer is set in the VNFM. When the VNFM acquires the fault information, it starts the timer and begins the following steps 303 and 304.
  • Steps 303 and 304 may be started together, but they do not have to be executed strictly simultaneously.
  • the VNFM searches for a corresponding preset fault processing policy according to the foregoing fault information.
  • Specifically, Table 1 may take the form shown in Table 3 below:
  • For example, based on ID2 in the fault information, the matching preset fault handling policy found in the list is: create a new VNF; migrate the service to the new VNF. That is, a new VNF is created to replace the failed VNF.
  • the VNFM obtains fault information of other VNFs, and performs local fault correlation between the VNF and other VNFs to obtain local fault association information.
  • Steps 304 to 309 and step 311 constitute the process of formulating the fault handling policy.
  • the VNFM sends the VNF fault information and the local fault association information to the VNFO.
  • the VNFM interacts with the VNFO through the Or-Vnfm interface described above.
  • The VNFO obtains fault information of at least one NFVI and fault information of other VNFs and, according to the fault information, the local fault association information, the fault information of the other VNFs, and the NFVI fault information, performs fault correlation processing on the VNF, the other faulty VNFs under the NS, and the NFVI to obtain NS association information.
  • The VNFO can monitor the running status of the VNFs and the NFVI in real time; therefore, the VNFO can perform fault correlation at the NS level.
  • the VNFO sends the fault information, the local fault association information, and the NS association information to the OSS/BSS.
  • the OSS/BSS obtains hardware fault information, performs hardware association processing, and obtains external fault association information.
  • The OSS/BSS sends the external fault association information to the VNFO.
  • The VNFM sends a timeout notification message carrying the preset fault handling policy to the VNFO, and processes the fault according to the preset fault handling policy.
  • The timer is started in step 302 and, at the same time, formulation of the fault handling policy begins from step 304. By the time step 309 has been executed, the timer value is detected to have reached the fault processing time, but the fault handling policy has not yet been finalized. Therefore, the fault needs to be processed according to the preset fault handling policy.
  • The timer value reaching the fault processing time means, for example, that if the fault processing time is set to 3 milliseconds and the timer starts from 0, the timer value has reached the fault processing time when it counts exactly 3 milliseconds.
  • the VNFO formulates a fault handling strategy according to the external fault association information and the preset fault handling strategy.
  • In step 310, the timer value has already reached the fault processing time, and the fault is first processed according to the preset fault handling policy. Meanwhile, formulation of the fault handling policy continues on the basis of step 309 and is completed in step 311.
  • the VNFO sends the foregoing fault handling strategy to the VNFM.
  • the VNFM processes the fault according to the foregoing fault handling strategy.
  • In this embodiment, the timer value reaches the fault processing time before the fault handling policy has been formulated. Therefore, the fault is first processed according to the preset fault handling policy to ensure that the service is not interrupted by an excessively long fault processing time. After the fault handling policy has been formulated, the fault is further processed according to it, so that the fault is fully handled and resolved at the root cause.
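The interplay between the running timer and the still-running policy formulation can be imitated with two concurrent activities. The sketch below uses Python's `asyncio` purely as an illustration; `formulate_policy`, `vnfm_handle`, the delay values and the policy strings are assumptions of this sketch, and the sleep stands in for the correlation work done across the VNFM, the VNFO and the OSS/BSS.

```python
import asyncio
from typing import List


async def formulate_policy(delay_s: float) -> str:
    """Stand-in for fault correlation and policy formulation."""
    await asyncio.sleep(delay_s)
    return "policy from external fault association"


async def vnfm_handle(preset: str, processing_time_s: float,
                      formulation_delay_s: float) -> List[str]:
    """Return the policies applied, in order."""
    task = asyncio.create_task(formulate_policy(formulation_delay_s))
    try:
        # shield() keeps formulation running if the timer fires first.
        final = await asyncio.wait_for(asyncio.shield(task), processing_time_s)
        return [final]
    except asyncio.TimeoutError:
        applied = [preset]        # timer reached the fault processing time
        final = await task        # formulation completes afterwards
        if final != preset:
            applied.append(final)
        return applied
```

Calling `asyncio.run(vnfm_handle("create a new VNF", 0.02, 0.2))` exercises the timeout path: the preset policy is applied first and the formulated policy follows once it is ready.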
  • FIG. 4 is a schematic flowchart diagram of a fault processing method based on network function virtualization according to another embodiment of the present invention.
  • In the fault processing method shown in FIG. 4, a timer is set in the VNFM and the fault handling policy is also formulated by the VNFM. The fault processing method based on network function virtualization may include:
  • Steps 401 to 409 are the same as steps 301-309 described above, and are not described herein again.
  • the VNFO sends external fault association information to the VNFM.
  • the VNFM processes the fault according to the preset fault handling strategy.
  • The timer is started in step 402 and, at the same time, formulation of the fault handling policy begins from step 404. By the time step 409 has been executed, the timer value is detected to have reached the fault processing time, but the fault handling policy has not yet been finalized. Therefore, the fault needs to be processed according to the preset fault handling policy.
  • The timer value reaching the fault processing time means, for example, that if the fault processing time is set to 3 milliseconds and the timer starts from 0, the timer value has reached the fault processing time when it counts exactly 3 milliseconds.
  • the VNFM formulates a fault handling policy according to the external fault association information and the preset fault handling strategy.
  • Because the fault handling policy had not been formulated when the fault processing time arrived, the fault has already been processed according to the preset fault handling policy. The fault handling policy is therefore formulated with reference to the preset fault handling policy already applied, in order to further resolve the fault.
  • the VNFM processes the fault according to the fault handling policy.
  • the fault handling strategy is formulated by the VNFM.
  • FIG. 5 is a schematic flowchart diagram of a fault processing method based on network function virtualization according to another embodiment of the present invention.
  • In the fault processing method shown in FIG. 5, a timer is set in the EM and the fault handling policy is formulated by the VNFO.
  • The fault processing method based on network function virtualization may include:
  • The VNF sends its fault information to the VNFM, where the fault information includes at least a fault ID, a fault type, and fault data.
  • The EM detects that the VNF is faulty, or receives the fault information of the VNF sent by the VNF, and starts a timer.
  • Steps 503 to 509 are the same as steps 303 to 309 in the embodiment shown in FIG. 3 and are not described again here.
  • The EM detects that the timer value has reached the fault processing time, and sends a timeout notification message to the VNFM.
  • the VNFM receives the timeout notification message, and processes the fault according to the preset fault handling policy.
  • the VNFM sends a timeout notification message to the VNFO, where the timeout notification message carries a preset fault handling policy.
  • Steps 511 and 512 above can be performed simultaneously.
  • the VNFO formulates a fault handling strategy according to the preset fault handling policy and the external fault association information.
  • The VNFO sends the fault handling policy to the VNFM.
  • The VNFM processes the fault according to the fault handling policy.
  • In this embodiment, a timer is set in the EM and the fault handling policy is formulated by the VNFO. When the EM detects that the timer value has reached the fault processing time, it sends a timeout notification message to the VNFM. The VNFM first processes the fault according to the preset fault handling policy to ensure that the service is not interrupted by an excessively long processing time. After the fault handling policy has been formulated, the VNFM further processes the fault according to it, so that the fault is fully handled and resolved at the root cause.
  • FIG. 6 is a schematic flowchart diagram of a fault processing method based on network function virtualization according to another embodiment of the present invention.
  • In the fault processing method shown in FIG. 6, a timer is set in the EM and the fault handling policy is formulated by the VNFM. The fault processing method based on network function virtualization may include:
  • The VNF sends its fault information to the VNFM, where the fault information includes at least a fault ID, a fault type, and fault data.
  • After detecting that the VNF is faulty or receiving the fault information sent by the VNF, the EM starts a timer.
  • Steps 603 to 609 are the same as steps 503 to 509 in the embodiment shown in FIG. 5 above and are not described again here.
  • The VNFO sends the external fault association information to the VNFM.
  • The EM detects that the timer value has reached the fault processing time, and sends a timeout notification message to the VNFM.
  • the VNFM receives the timeout notification message, and executes a preset fault handling policy.
  • the VNFM formulates a fault handling policy according to the preset fault handling policy and the external fault association information.
  • the VNFM further processes the fault according to the fault handling strategy.
  • the fault handling strategy is formulated by the VNFM.
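The four embodiments of FIG. 3 to FIG. 6 differ in where the timer sits (VNFM or EM) and in which entity formulates the fault handling policy (VNFO or VNFM). The sketch below lists only the messages that distinguish the variants in the timeout case; `timeout_trace` and the message wording are illustrative assumptions, not a normative message format.

```python
from typing import List, Tuple

Message = Tuple[str, str, str]  # (sender, receiver, content)


def timeout_trace(timer_owner: str, formulator: str) -> List[Message]:
    """Messages that distinguish the four variants (timeout case only)."""
    if timer_owner not in ("VNFM", "EM") or formulator not in ("VNFO", "VNFM"):
        raise ValueError("unsupported combination")
    trace: List[Message] = []
    if formulator == "VNFM":
        # The VNFM formulates the policy, so it needs the correlation result.
        trace.append(("VNFO", "VNFM", "external fault association information"))
    if timer_owner == "EM":
        # The timer is external to the VNFM, so expiry must be signalled.
        trace.append(("EM", "VNFM", "timeout notification"))
    if formulator == "VNFO":
        trace.append(("VNFM", "VNFO", "timeout notification with preset policy"))
        trace.append(("VNFO", "VNFM", "fault handling policy"))
    return trace
```

In every variant the VNFM applies the preset fault handling policy on timeout and the formulated policy afterwards; only the signalling around those two actions changes.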
  • FIG. 7 is a schematic structural diagram of a function manager according to an embodiment of the present invention. As shown in FIG. 7, a function manager 700 may include:
  • The obtaining unit 710 is configured to acquire fault information of the functional entity.
  • The processing unit 720 is configured to trigger fault correlation processing according to the fault information acquired by the obtaining unit 710, and to trigger formulation of a fault handling policy according to the fault correlation processing result; and, when the fault processing time arrives, to process the fault according to the fault handling policy if it has been formulated, or according to the preset fault handling policy if it has not. The preset fault handling policy is a policy formulated for faults of the functional entity itself.
  • the obtaining unit 710 acquires fault information of the functional entity, and the processing unit 720 triggers the fault association processing according to the fault information and formulates the fault processing strategy according to the fault association processing result.
  • If the fault handling policy has not been formulated when the fault processing time arrives, the processing unit 720 processes the fault according to the preset fault handling policy, which is only a policy for faults caused by the functional entity itself. Setting the fault processing time in the present invention thus ensures that the service is not interrupted during fault handling, thereby improving the user experience.
  • The obtaining unit 710 is specifically configured to acquire fault information of other functional entities within the management scope.
  • The processing unit 720 is specifically configured to: perform, according to the fault information and the fault information of the other functional entities within the management scope, local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault association information; and formulate the fault handling policy according to the local fault association information.
  • The obtaining unit 710 is specifically configured to acquire fault information of other functional entities within the management scope.
  • The processing unit 720 is specifically configured to: perform, according to the fault information and the fault information of the other functional entities within the management scope, local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault association information; send the fault information and the local fault association information to the second function management entity, so that the second function management entity acquires fault information of the network function virtualization infrastructure (NFVI) and fault information of other functional entities under the network service (NS), and performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault association information; and receive the external fault association information sent by the second function management entity, and formulate the fault handling policy according to the external fault association information.
  • The processing unit 720 is specifically configured to formulate the fault handling policy according to the preset fault handling policy used to process the fault and the external fault association information.
  • The obtaining unit 710 is specifically configured to acquire fault information of other functional entities within the management scope.
  • The processing unit 720 is specifically configured to: perform local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault association information; send the fault information and the local fault association information to the second function management entity, so that the second function management entity acquires fault information of the NFVI and fault information of other functional entities under the network service (NS), performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault association information, and formulates the fault handling policy based on the external fault association information; and receive the fault handling policy sent by the second function management entity.
  • The processing unit 720 is specifically configured to send a timeout notification message to the second function management entity when the fault processing time arrives and the fault handling policy has not been formulated, where the timeout notification message carries the preset fault handling policy used to process the fault, so that the second function management entity formulates the fault handling policy according to the preset fault handling policy and the external fault association information.
  • the foregoing function manager 700 further includes:
  • A determining unit, configured to determine, if the fault handling policy is formulated after the fault has been processed according to the preset fault handling policy, whether the fault handling policy is the same as the preset fault handling policy.
  • The processing unit 720 is further configured to further process the fault according to the fault handling policy if the determining unit determines that the fault handling policy is different from the preset fault handling policy.
  • The function manager of this embodiment triggers fault correlation processing upon receiving the fault information and triggers formulation of the fault handling policy according to the fault correlation processing result. Because the fault processing time is set according to the minimum latency requirement of service continuity, if the fault handling policy has been formulated when the fault processing time arrives, that policy can be used directly to process the fault, resolving the problem at the root cause and recovering from the fault. If the fault handling policy has not been formulated when the fault processing time arrives, the fault is first processed using the preset fault handling policy to ensure that the service is not interrupted; then, after the fault handling policy has been formulated, the fault is further processed using it. This both ensures that the service is not interrupted and resolves the fault at the root cause, so that the service runs normally.
  • FIG. 8 is a schematic diagram of a function manager 800 according to another embodiment of the present invention.
  • The function manager 800 can include at least one bus 801, at least one processor 802 connected to the bus 801, and at least one memory 803 connected to the bus 801.
  • The processor 802 calls, through the bus 801, the code stored in the memory 803 to: obtain fault information of the functional entity; trigger fault correlation processing according to the fault information and trigger formulation of a fault handling policy according to the fault correlation processing result; and, when the fault processing time arrives, process the fault according to the fault handling policy if it has been formulated, or according to a preset fault handling policy if it has not. The preset fault handling policy is a policy formulated for faults caused by the functional entity itself.
  • the processor 802 may be configured to obtain fault information of other functional entities in the management scope, perform local fault association processing on the functional entity and other functional entities in the management scope, and obtain local fault association information. According to the above local fault association information, a fault handling strategy is formulated.
  • The processor 802 may be configured to: obtain fault information of other functional entities within the management scope, and perform local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault association information; send the fault information and the local fault association information to the second function management entity, so that the second function management entity acquires fault information of the NFVI and fault information of other functional entities under the NS, and performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault association information; and receive the external fault association information sent by the second function management entity, and formulate the fault handling policy according to the external fault association information.
  • When the fault processing time arrives and the fault handling policy has not been formulated, the processor 802 may formulate the fault handling policy according to the preset fault handling policy used to process the fault and the external fault association information.
  • the processor 802 may be configured to obtain fault information of other functional entities in the management scope, perform local fault association processing on the functional entity and other functional entities in the management scope, and obtain local fault association information. And sending the foregoing fault information and the local fault association information to the second function management entity, so that the second function management entity acquires the fault information of the NFVI and the fault information of other functional entities under the network service NS, and the function entity, the NS is The other faulty functional entity and the NFVI perform external fault correlation processing to obtain external fault association information, and formulate the fault processing strategy based on the external fault association information; and receive the fault processing strategy sent by the second function management entity.
  • the processor 802 may send a timeout notification message to the second function management entity when the fault processing time is reached, and the fault processing policy is not completed, and the timeout notification message carries the processing
  • the preset fault processing strategy of the fault is configured to enable the second function management entity to formulate a fault according to the preset fault processing policy and the external fault association information. Strategy.
  • the processor 802 processes the fault according to the preset fault processing policy
  • the fault processing policy is completed, it is determined whether the fault processing policy is the same as the preset fault processing policy. If the fault handling policy is different from the preset fault handling strategy, the fault is also processed according to the fault handling policy.
  • the foregoing memory 803 can be used to store the foregoing fault handling policy and preset fault handling strategy.
  • To better implement the foregoing solutions, the embodiments of the present invention further provide related devices for implementing them.
  • In the embodiments provided in this application, it should be understood that the disclosed apparatus may be implemented in other ways.
  • For example, the device embodiments described above are merely illustrative.
  • The division into units is only a division by logical function; in an actual implementation there may be other divisions. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical or take other forms.
  • The units described as separate components may or may not be physically separate.
  • Components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • The functional units in the embodiments of the present invention may be integrated in one processing unit, each unit may exist physically on its own, or two or more units may be integrated in one unit.
  • The integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • If the integrated unit is implemented as a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium.
  • The software product includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods in the embodiments of the present invention.
  • The storage medium includes media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.

Abstract

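A fault handling method based on network function virtualization may include: a first function management entity obtains fault information of a functional entity; the first function management entity triggers fault correlation processing according to the fault information and formulates a fault handling policy according to the result of the fault correlation processing; when the fault processing time arrives, if the fault handling policy has been formulated, the first function management entity handles the fault according to the fault handling policy; if the fault handling policy has not been formulated, the first function management entity handles the fault according to a preset fault handling policy, where the preset fault handling policy is a policy formulated for faults caused by the functional entity itself. This ensures that the service is not interrupted during fault handling and improves user experience.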

Description

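Fault Handling Method and Device Based on Network Function Virtualization
Technical Field
The present invention relates to the field of communications technologies, and in particular to a fault handling method and device based on network function virtualization (NFV).
Background
Network function virtualization (NFV) was initiated by 13 major telecommunications operators worldwide. The corresponding organization, joined by many equipment vendors and information technology (IT) vendors, aims to define the requirements for virtualizing operator network functions and to produce related technical reports. It seeks to borrow IT virtualization technology and to use general-purpose, high-performance, high-capacity servers, switches, and storage to implement some network functions in software. For example, various types of network equipment, such as servers, routers, storage devices, content delivery networks (CDN), and switches, can have their software separated from their hardware through NFV, and can be deployed in data centers, at network nodes, or in users' homes.
The existing NFV network architecture includes functional nodes such as the network function virtualization infrastructure (NFVI), virtualized network functions (VNF), element management (EM), the VNF manager (VNFM), the virtualized infrastructure manager (VIM), the NFV orchestrator (NFVO), and the operations support system (OSS) or business support system (BSS). A VNF runs on the NFVI, and one EM corresponds to one or more VNFs.
When handling a VNF fault, this NFV network architecture usually uses an active/standby redundancy mode: a VNF usually comprises an active VNF and a standby VNF. The active VNF is used when service transmission starts; when the active VNF fails, the services on it are switched to the standby VNF so that the service keeps running normally.
However, a fault of the active VNF may be caused not only by the VNF itself but also by other factors, such as other VNFs associated with it. When the fault is not caused by the active VNF itself, starting the standby VNF in its place does not eliminate the root cause; it wastes valuable resources and adds to the service processing load of the management entities.
To identify the root-cause fault of a VNF effectively, the prior art provides a VNF fault correlation method. In this method, the VNF entity sends fault information to the VNFM, and the VNFM performs fault correlation between that VNF and the other VNFs within its management scope. If the root cause is found, the VNFM is triggered to perform fault recovery. If it is not found, the VNFM sends the fault information to the NFVO, which performs root-cause analysis at the level of the whole network service (NS); if necessary, the fault information can be sent on to the OSS for further fault correlation. Finally, the fault correlation result is returned to the VNFM, which performs fault recovery.
This fault correlation method may require several functional nodes to take part in the correlation and therefore takes a long time. If the carried service has strict latency requirements, fault correlation may seriously affect service continuity and even interrupt the service, degrading user experience.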
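Summary
In view of the above defects, the present invention provides a fault handling method and device based on network function virtualization, to ensure that services are not interrupted during fault handling and to improve user experience.
A first aspect of the present invention provides a fault handling method based on network function virtualization, which may include:
a first function management entity obtains fault information of a functional entity;
the first function management entity triggers fault correlation processing according to the fault information, and formulates a fault handling policy according to the result of the fault correlation processing;
when the fault processing time arrives, if the fault handling policy has been formulated, the first function management entity handles the fault according to the fault handling policy; if the fault handling policy has not been formulated, the first function management entity handles the fault according to a preset fault handling policy, where the preset fault handling policy is a policy formulated for faults caused by the functional entity itself.
With reference to the first aspect, in a first possible implementation, triggering fault correlation processing and formulating the fault handling policy according to the result of the fault correlation processing includes: the first function management entity obtains fault information of other functional entities within its management scope, and performs local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information; the first function management entity formulates the fault handling policy according to the local fault correlation information.
With reference to the first aspect, in a second possible implementation, triggering fault correlation processing and formulating the fault handling policy according to the result of the fault correlation processing includes: the first function management entity obtains fault information of other functional entities within its management scope, and performs local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information; the first function management entity sends the fault information and the local fault correlation information to a second function management entity, so that the second function management entity obtains fault information of the network function virtualization infrastructure (NFVI) and fault information of other functional entities under the network service (NS), and performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault correlation information; the first function management entity receives the external fault correlation information sent by the second function management entity and formulates the fault handling policy according to it.
With reference to the second possible implementation of the first aspect, in a third possible implementation, when the fault processing time arrives and the fault handling policy has not been formulated, formulating the fault handling policy according to the external fault correlation information includes: the first function management entity formulates the fault handling policy according to the preset fault handling policy used to handle the fault and the external fault correlation information.
With reference to the first aspect, in a fourth possible implementation, triggering fault correlation processing and formulating the fault handling policy according to the result of the fault correlation processing includes: the first function management entity obtains fault information of other functional entities within its management scope, and performs local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information; the first function management entity sends the fault information and the local fault correlation information to a second function management entity, so that the second function management entity obtains fault information of the NFVI and fault information of other functional entities under the NS, performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault correlation information, and formulates the fault handling policy based on the external fault correlation information; the first function management entity receives the fault handling policy sent by the second function management entity.
With reference to the fourth possible implementation of the first aspect, in a fifth possible implementation, when the fault processing time arrives and the fault handling policy has not been formulated, formulating the fault handling policy based on the external fault correlation information includes: the first function management entity sends a timeout notification message to the second function management entity, where the timeout notification message carries the preset fault handling policy used to handle the fault, so that the second function management entity formulates the fault handling policy according to the preset fault handling policy and the external fault correlation information.
With reference to the first aspect or any one of the first to fifth possible implementations of the first aspect, in a sixth possible implementation, after the fault is handled according to the preset fault handling policy, the method further includes: if the fault handling policy has been formulated, the first function management entity determines whether the fault handling policy is the same as the preset fault handling policy; if they differ, the first function management entity additionally handles the fault according to the fault handling policy.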
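A second aspect of the present invention provides a function manager, which may include:
an obtaining unit, configured to obtain fault information of a functional entity;
a processing unit, configured to trigger fault correlation processing according to the fault information obtained by the obtaining unit and formulate a fault handling policy according to the result of the fault correlation processing, and, when the fault processing time arrives, to handle the fault according to the fault handling policy if that policy has been formulated, or to handle the fault according to a preset fault handling policy if it has not, where the preset fault handling policy is a policy formulated for faults caused by the functional entity itself.
With reference to the second aspect, in a first possible implementation, the obtaining unit is specifically configured to obtain fault information of other functional entities within the management scope; the processing unit is specifically configured to perform, according to the fault information and the fault information of the other functional entities within the management scope, local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information, and to formulate the fault handling policy according to the local fault correlation information.
With reference to the second aspect, in a second possible implementation, the obtaining unit is specifically configured to obtain fault information of other functional entities within the management scope; the processing unit is specifically configured to perform local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information; send the fault information and the local fault correlation information to a second function management entity, so that the second function management entity obtains fault information of the NFVI and fault information of other functional entities under the NS, and performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault correlation information; and receive the external fault correlation information sent by the second function management entity and formulate the fault handling policy according to it.
With reference to the second possible implementation of the second aspect, in a third possible implementation, the processing unit is specifically configured to, when the fault processing time arrives and the fault handling policy has not been formulated, formulate the fault handling policy according to the preset fault handling policy used to handle the fault and the external fault correlation information.
With reference to the second aspect, in a fourth possible implementation, the obtaining unit is specifically configured to obtain fault information of other functional entities within the management scope; the processing unit is specifically configured to perform local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information; send the fault information and the local fault correlation information to a second function management entity, so that the second function management entity obtains fault information of the NFVI and fault information of other functional entities under the NS, performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault correlation information, and formulates the fault handling policy based on the external fault correlation information; and receive the fault handling policy sent by the second function management entity.
With reference to the fourth possible implementation of the second aspect, in a fifth possible implementation, the processing unit is specifically configured to, when the fault processing time arrives and the fault handling policy has not been formulated, send a timeout notification message to the second function management entity, where the timeout notification message carries the preset fault handling policy used to handle the fault, so that the second function management entity formulates the fault handling policy according to the preset fault handling policy and the external fault correlation information.
With reference to the second aspect or any one of the first to fifth possible implementations of the second aspect, in a sixth possible implementation, the function manager further includes a determining unit, configured to determine, after the fault has been handled according to the preset fault handling policy and once the fault handling policy has been formulated, whether the fault handling policy is the same as the preset fault handling policy; the processing unit is further configured to additionally handle the fault according to the fault handling policy if the determining unit determines that the two policies differ.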
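A third aspect of the present invention provides a function manager, which may include at least one processor and at least one memory, where the processor and the memory are connected through at least one bus;
the processor is configured to obtain fault information of a functional entity, trigger fault correlation processing according to the fault information, and formulate a fault handling policy according to the result of the fault correlation processing; when the fault processing time arrives, if the fault handling policy has been formulated, handle the fault according to the fault handling policy; if the fault handling policy has not been formulated, handle the fault according to a preset fault handling policy, where the preset fault handling policy is a policy formulated for faults caused by the functional entity itself;
the memory is configured to store the preset fault handling policy and the formulated fault handling policy.
With reference to the third aspect, in a first possible implementation, the processor is specifically configured to obtain fault information of other functional entities within the management scope, perform local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information, and formulate the fault handling policy according to the local fault correlation information.
With reference to the third aspect, in a second possible implementation, the function manager is specifically configured to obtain fault information of other functional entities within the management scope, and perform local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information; send the fault information and the local fault correlation information to a second function management entity, so that the second function management entity obtains fault information of the NFVI and fault information of other functional entities under the NS, and performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault correlation information; and receive the external fault correlation information sent by the second function management entity and formulate the fault handling policy according to it.
With reference to the second possible implementation of the third aspect, in a third possible implementation, the function manager is specifically configured to formulate the fault handling policy according to the preset fault handling policy used to handle the fault and the external fault correlation information.
With reference to the third aspect, in a fourth possible implementation, the function manager is specifically configured to obtain fault information of other functional entities within the management scope, and perform local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information; send the fault information and the local fault correlation information to a second function management entity, so that the second function management entity obtains fault information of the NFVI and fault information of other functional entities under the NS, performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault correlation information, and formulates the fault handling policy based on the external fault correlation information; and receive the fault handling policy sent by the second function management entity.
With reference to the fourth possible implementation of the third aspect, in a fifth possible implementation, the function manager is specifically configured to send a timeout notification message to the second function management entity, where the timeout notification message carries the preset fault handling policy used to handle the fault, so that the second function management entity formulates the fault handling policy according to the preset fault handling policy and the external fault correlation information.
With reference to the third aspect or any one of the first to fifth possible implementations of the third aspect, in a sixth possible implementation, the processor is further configured to determine, once the fault handling policy has been formulated, whether the fault handling policy is the same as the preset fault handling policy, and, if they differ, to additionally handle the fault according to the fault handling policy.
It can be seen that, in some technical solutions of the embodiments of the present invention, after receiving fault information of a functional entity, the first function management entity triggers fault correlation processing and formulates a fault handling policy according to the result. When the fault processing time arrives, the fault is handled according to the fault handling policy if that policy has been formulated, and according to the preset fault handling policy if it has not. The preset fault handling policy covers only faults caused by the functional entity itself. By setting a fault processing time, the present invention therefore ensures that the service is not interrupted during fault handling and improves user experience.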
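Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of the NFV network architecture according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of a fault handling method based on network function virtualization according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of a fault handling method based on network function virtualization according to another embodiment of the present invention;
FIG. 4 is a schematic flowchart of a fault handling method based on network function virtualization according to another embodiment of the present invention;
FIG. 5 is a schematic flowchart of a fault handling method based on network function virtualization according to another embodiment of the present invention;
FIG. 6 is a schematic flowchart of a fault handling method based on network function virtualization according to another embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a function manager according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a function manager according to another embodiment of the present invention.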
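Detailed Description
The embodiments of the present invention provide a fault handling method based on network function virtualization that ensures the service is not interrupted during fault handling and improves user experience. The embodiments of the present invention also provide a corresponding function manager.
To make the objectives, features, and advantages of the present invention clearer and easier to understand, the technical solutions in the embodiments of the present invention are described below with reference to the drawings in those embodiments. The embodiments described below are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
The terms "first", "second", "third", and so on in the specification, the claims, and the drawings of the present invention are used to distinguish different objects, not to describe a particular order. The terms "include" and "have", and any variants of them, are intended to cover a non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not limited to the listed steps or units; it optionally also includes steps or units that are not listed, or other steps or units inherent to the process, method, product, or device.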
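The NFV network architecture on which the technical solutions of the embodiments of the present invention are based is first described in detail with reference to FIG. 1.
The functional entities and function management entities of the NFV network architecture in FIG. 1, and their functions, are as follows:
Business support system (BSS) / operations support system (OSS): initiates service requests to the NFVO, together with the resources required by the service, and is responsible for fault handling.
NFV orchestrator (NFVO): receives service requests sent by the BSS/OSS and allocates and manages resources; monitors VNF and NFVI resources and running-status information, such as fault information, in real time.
VNF manager (VNFM): responsible for VNF lifecycle management, such as startup, time to live, and VNF running-status information, for example VNF fault information.
Virtualized infrastructure manager (VIM): responsible for managing and allocating NFVI resources, and for monitoring and collecting NFVI running-status information, such as fault information.
Element management (EM): responsible for managing VNFs, including VNF performance monitoring and service configuration.
Virtualized network function (VNF): the functional entity in which services are forwarded or processed. It may be a virtual mobility management entity (MME), a virtual packet data network gateway (PGW), a vSwitch, a virtual firewall, and so on.
Network service catalog (NS catalog): stores all uploaded NSs and supports the creation and management of deployment templates such as the network service descriptor (NSD), VLD, and VNFFGD.
VNF catalog: stores all uploaded VNF packages and supports the creation and management of the VNF descriptor (VNFD), software images, and other manifests.
NFV instances: stores information about all VNF instances and NS instances.
NFVI resources: stores information about available, reserved, and allocated NFVI resources.
The NFV network architecture in FIG. 1 also provides the following interfaces:
Ve-Vnfm: VNF lifecycle management and exchange of configuration information;
Or-Vnfm: resource requests for VNF lifecycle management, sending of configuration information, and collection of status information;
Vi-Vnfm: resource allocation requests, and exchange of virtualized resource configuration and status information;
Or-Vi: resource reservation and allocation requests, and exchange of virtualized resource configuration and status information;
Nf-Vi: actual allocation of resources, exchange of virtual resource status information, and hardware resource configuration;
Os-Ma: VNF lifecycle management, service graph lifecycle management, policy management, and so on;
Vn-Nf: used by the NFVI to provide the actual execution environment to the VNF.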
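Based on the above introduction, in one embodiment of the fault handling method based on network function virtualization of the present invention, the method may include: a first function management entity (for example, a VNFM) obtains fault information of a functional entity (for example, a VNF); the first function management entity triggers fault correlation processing according to the fault information, and formulates a fault handling policy according to the result of the fault correlation processing; when the fault processing time arrives, if the fault handling policy has been formulated, the first function management entity handles the fault according to the fault handling policy; if the fault handling policy has not been formulated, the first function management entity handles the fault according to a preset fault handling policy, where the preset fault handling policy is a policy formulated for faults caused by the functional entity itself.
Refer to FIG. 2, which is a schematic flowchart of a fault handling method based on network function virtualization according to an embodiment of the present invention. As shown in FIG. 2, the method may include:
201. The first function management entity obtains fault information of a functional entity.
The first function management entity can obtain the fault information of the functional entity in several ways. For example, it can receive the fault information directly from the functional entity, or it can receive fault information about the functional entity from another function management entity that is capable of monitoring the functional entity for faults.
Preferably, the fault information includes at least a fault identity (ID), a fault type, and fault data.
It should be noted that the first function management entity may be a VNFM and the functional entity may be a VNF.
202. The first function management entity triggers fault correlation processing according to the fault information, and formulates a fault handling policy according to the result of the fault correlation processing.
It should be noted that the first function management entity triggers the fault correlation processing and the formulation of the fault handling policy. The fault correlation processing and the policy formulation themselves may be performed by the first function management entity, by other function management entities, or by the first function management entity in cooperation with other function management entities.
203. When the fault processing time arrives, if the fault handling policy has been formulated, the first function management entity handles the fault according to the fault handling policy; if the fault handling policy has not been formulated, the first function management entity handles the fault according to a preset fault handling policy, where the preset fault handling policy is a policy formulated for faults caused by the functional entity itself.
The fault processing time in the present invention is determined from the minimum latency requirement for service continuity, so that the service is not interrupted within the fault processing time.
Whether the fault processing time has arrived can be established with a timer. For example, a timer can be set in the first function management entity: after the fault information is received, formulation of the fault handling policy is triggered and the timer is started at the same time, and the timer value is then compared against the fault processing time to determine whether that time has arrived. Alternatively, the timer can be set in another function management entity that is capable of monitoring the functional entity for faults: when it detects that the functional entity has failed, it starts the timer, and when the timer value reaches the fault processing time it sends a timeout notification to the first function management entity. After receiving the timeout notification, the first function management entity executes the appropriate policy depending on whether the fault handling policy has been formulated.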
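The second timer variant described above, in which another entity owns the timer and notifies the first function management entity on expiry, can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `start_fault_timer` and the callback `on_timeout` are hypothetical, and the timer is assumed to run in the same process as the receiver rather than across an NFV interface.

```python
import threading
from typing import Callable


def start_fault_timer(deadline_s: float, on_timeout: Callable[[], None]) -> threading.Timer:
    """Start a one-shot timer when a fault is detected.

    `deadline_s` plays the role of the fault processing time; `on_timeout`
    stands in for sending the timeout notification (both names are assumptions).
    """
    timer = threading.Timer(deadline_s, on_timeout)
    timer.daemon = True  # do not keep the process alive just for this timer
    timer.start()
    return timer
```

Cancelling the returned timer with `timer.cancel()` models the case in which the fault handling policy is ready before the fault processing time arrives and no notification is needed.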
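It should be noted that the preset fault handling policy is mainly a policy formulated for faults caused by the functional entity itself. The preset fault handling policy is stored in advance in the first function management entity and may be stored as a list that includes at least the fault ID, the fault type, and the preset fault handling policy, as shown in Table 1:
Table 1
Fault ID    Fault type    Preset fault handling policy
The matching preset fault handling policy is therefore found in the list by the fault ID and/or fault type contained in the fault information.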
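The lookup by fault ID and/or fault type can be sketched as below. The `ID2` row follows the example given later in the description (VNF out of service: create a new VNF, then migrate the service); the `ID1` row, the field names, and the rule "match by ID first, then by type" are assumptions made only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FaultInfo:
    fault_id: str
    fault_type: str
    data: Dict[str, str] = field(default_factory=dict)


# Preset policy list in the spirit of Table 1. Only the ID2 row comes from the
# description; the ID1 row is a hypothetical placeholder.
PRESET_POLICIES: List[dict] = [
    {"fault_id": "ID1", "fault_type": "VNF overload",
     "policy": ["scale out VNF"]},
    {"fault_id": "ID2", "fault_type": "VNF out of service",
     "policy": ["create new VNF", "migrate service to new VNF"]},
]


def find_preset_policy(fault: FaultInfo,
                       table: List[dict] = PRESET_POLICIES) -> Optional[List[str]]:
    """Return the preset policy matching the fault ID, else the fault type."""
    for row in table:
        if row["fault_id"] == fault.fault_id:
            return row["policy"]
    for row in table:
        if row["fault_type"] == fault.fault_type:
            return row["policy"]
    return None  # no preset policy is stored for this fault
```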
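Correspondingly, the fault handling policy can also be stored as a list. In addition to the fault ID and fault type, the list includes the fault handling policy, which is formulated from the result of fault correlation processing and mainly addresses correlated faults caused by other functional entities, the NFVI, hardware, and so on, as shown in Table 2:
Table 2
Fault ID    Fault type    Fault handling policy
It can be seen that, in this embodiment of the present invention, after obtaining the fault information of a functional entity, the first function management entity triggers fault correlation processing and formulates a fault handling policy according to the result. If the fault handling policy has been formulated within the fault processing time, it is used to handle the fault, which achieves comprehensive fault handling and resolves the fault at its root. If the fault handling policy has not been formulated, the preset fault handling policy for faults caused by the functional entity itself is used first, which satisfies the minimum latency requirement for service continuity. This ensures that the service is not interrupted during fault handling and improves user experience.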
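The decision made when the fault processing time arrives can be sketched with a background task and a bounded wait. This is only one possible realization under stated assumptions: the correlation step is modelled as a plain callable, `handle_at_deadline` and `slow_correlation` are hypothetical names, and the policies are simple lists of action strings.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, List, Tuple

Policy = List[str]


def handle_at_deadline(correlate: Callable[[], Policy],
                       preset_policy: Policy,
                       deadline_s: float) -> Tuple[Policy, str, Future]:
    """Pick the policy to apply when the fault processing time arrives.

    Returns the chosen policy, its source ("correlated" or "preset"), and the
    future of the correlation task, which may still be running.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(correlate)            # step 202: start correlation
    try:
        chosen, source = future.result(timeout=deadline_s), "correlated"
    except FutureTimeout:                      # step 203: deadline reached first
        chosen, source = preset_policy, "preset"
    pool.shutdown(wait=False)                  # let a late correlation finish
    return chosen, source, future


def slow_correlation() -> Policy:
    """Hypothetical correlation that outlasts a short deadline."""
    time.sleep(0.3)
    return ["repair upstream VNF"]
```

Returning the future lets the caller pick up the late policy once correlation completes, which matches the follow-up handling described later in this embodiment.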
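In some possible embodiments of the present invention, the first function management entity triggering the formulation of the fault handling policy may include: the first function management entity obtains fault information of other functional entities within its management scope, and performs local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information; the first function management entity formulates the fault handling policy according to the local fault correlation information.
The first function management entity manages several functional entities and may receive fault information from more than one of them at the same time. When performing fault correlation for a functional entity, it determines whether other functional entities within its management scope have also failed. If so, it obtains their fault information and then, based on all the fault information, performs local fault correlation processing on the functional entity and the other failed functional entities. If the local fault correlation processing shows that the fault was caused by a fault in another functional entity, the fault handling policy can be formulated from the resulting local fault correlation information.
It should be noted that local fault correlation correlates one functional entity with the other functional entities within the management scope. If it is determined that another functional entity caused the fault, this indicates that the fault originates in that other functional entity itself. In that case, once the fault of the other functional entity has been handled with the preset fault handling policy, the functional entity returns to normal; that is, the functional entity itself may need no fault handling.
For example, when handling a fault of functional entity A, functional entity A is first correlated with functional entity B. If the correlation shows that the fault of functional entity B caused the fault of functional entity A, then after functional entity B is handled with the preset fault handling policy, functional entity A returns to normal.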
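The A/B example can be made concrete with a small sketch. The description does not specify how correlation is computed, so the dependency map used here is purely an assumption: a faulty entity is treated as a root cause if none of the entities it depends on is itself faulty.

```python
from typing import Dict, Set


def local_root_causes(faulty: Set[str], depends_on: Dict[str, Set[str]]) -> Set[str]:
    """Return the faulty entities whose faults are not explained by a faulty dependency.

    `depends_on[x]` is the (hypothetical) set of entities that x relies on.
    """
    roots = set()
    for entity in faulty:
        faulty_upstream = depends_on.get(entity, set()) & faulty
        if not faulty_upstream:
            roots.add(entity)
    return roots
```

With `faulty = {"A", "B"}` and `depends_on = {"A": {"B"}}`, the function returns `{"B"}`, so only B needs its preset policy applied. A cyclic dependency between faulty entities would yield no root cause under this simple rule and would need escalation to the next management level.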
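In other possible embodiments of the present invention, triggering the formulation of the fault handling policy may include: the first function management entity obtains fault information of other functional entities within its management scope, and performs local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information; the first function management entity sends the fault information and the local fault correlation information to a second function management entity, so that the second function management entity obtains fault information of the NFVI and fault information of other functional entities under the NS, and performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault correlation information; the first function management entity receives the external fault correlation information sent by the second function management entity and formulates the fault handling policy according to it.
After performing local fault correlation processing on the functional entity and the other failed functional entities within its management scope, the first function management entity sends the fault information together with the local fault correlation information to the second function management entity. The second function management entity is the function management entity one level above the first and can perform fault correlation across all faulty functional entities under the NS and the NFVI.
Further, the second function management entity performs fault correlation processing on the functional entity indicated by the fault information, the other faulty functional entities under the NS, and the NFVI. If the root-cause fault still cannot be identified, it sends the fault information, the local fault correlation information, and the NS fault correlation information to a third function management entity, which correlates the functional entity with the hardware to obtain the external fault correlation information and sends that information to the second function management entity.
It can be understood that, when the fault processing time arrives and the fault handling policy has not been formulated, the final fault handling policy must also take into account the preset fault handling policy that was used to handle the fault when the fault processing time arrived. That is, the fault handling policy is finally formulated from the fault information, the local fault correlation information, the NS fault correlation information, the hardware fault information, and so on.
In the above, the first function management entity formulates the fault handling policy. In some embodiments, the second function management entity may formulate the fault handling policy and then send it to the first function management entity.
In some possible embodiments of the present invention, triggering the formulation of the fault handling policy may include: the first function management entity obtains fault information of other functional entities within its management scope, and performs local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information; the first function management entity sends the fault information and the local fault correlation information to the second function management entity, so that the second function management entity obtains fault information of the NFVI and fault information of other functional entities under the NS, performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault correlation information, and formulates the fault handling policy based on the external fault correlation information; the first function management entity receives the fault handling policy sent by the second function management entity.
Further, the second function management entity performs fault correlation processing on the functional entity indicated by the fault information, the other faulty functional entities under the NS, and the NFVI. If the root-cause fault still cannot be identified, it sends the fault information, the local fault correlation information, and the NS fault correlation information to the third function management entity, which correlates the functional entity with the hardware to obtain the external fault correlation information, formulates the fault handling policy according to the external fault correlation information, and sends the fault handling policy to the second function management entity.
Further, when the fault processing time arrives and the fault handling policy has not been formulated, the first function management entity also sends a timeout notification message to the second function management entity. The timeout notification message carries the preset fault handling policy used to handle the fault, so that the second function management entity formulates the fault handling policy according to the preset fault handling policy and the external fault correlation information.
It can be understood that, when the fault processing time arrives and the fault handling policy has not been formulated, the final fault handling policy must also take into account the preset fault handling policy that was used to handle the fault when the fault processing time arrived. That is, the fault handling policy is finally formulated from the fault information, the local fault correlation information, the NS fault correlation information, the hardware fault information, and so on.
In some possible embodiments of the present invention, after the first function management entity handles the fault according to the preset fault handling policy, the method further includes: if the fault handling policy has been formulated, the first function management entity determines whether the fault handling policy is the same as the preset fault handling policy; if they differ, the first function management entity additionally handles the fault according to the fault handling policy.
It can be understood that, if no fault handling policy has been formulated when the fault processing time arrives, the fault is first handled according to the preset fault handling policy. This satisfies the minimum latency requirement for service continuity and prevents a service interruption caused by the fault remaining unhandled beyond the fault processing time. Meanwhile, formulation of the fault handling policy continues. Once it is complete, if the fault handling policy differs from the preset fault handling policy that was applied, the root cause of the fault has not yet been resolved, so the fault must additionally be handled according to the fault handling policy. This ensures that the fault is resolved at its root and handled comprehensively, and that the service runs normally.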
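The follow-up check described above reduces to a comparison between the late-arriving policy and the preset policy already applied. The sketch below assumes policies are comparable values (lists of action strings here) and that `apply` is a caller-supplied, hypothetical function that executes a policy.

```python
from typing import Callable, List

Policy = List[str]


def follow_up(preset_policy: Policy, final_policy: Policy,
              apply: Callable[[Policy], None]) -> bool:
    """Apply the final policy only if it differs from the preset one already used.

    Returns True when a second round of fault handling was performed.
    """
    if final_policy == preset_policy:
        return False        # the preset policy already addressed the root cause
    apply(final_policy)     # root cause lies elsewhere: handle the fault again
    return True
```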
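To make the above solutions of the embodiments of the present invention easier to understand and implement, some specific application scenarios are described below as examples.
Refer to FIG. 3, which is a schematic flowchart of a fault handling method based on network function virtualization according to another embodiment of the present invention. In the method corresponding to FIG. 3, the timer is set in the VNFM and the VNFO formulates the fault handling policy. The method may include:
301. The VNF sends its fault information to the VNFM, where the fault information includes at least a fault ID, a fault type, and fault data.
The VNF and the VNFM interact through the Ve-Vnfm interface described above.
302. The VNFM receives the fault information of the VNF and starts the timer.
A timer is set in the VNFM. When the VNFM obtains the fault information, it starts the timer and at the same time begins steps 303 and 304 below.
It should be noted that steps 303 and 304 start at the same time but do not necessarily finish at the same time.
303. The VNFM looks up the corresponding preset fault handling policy according to the fault information.
For example, Table 1 above may specifically take the form shown in Table 3:
Table 3
[Table 3 appears as an image in the original publication: Figure PCTCN2015074580-appb-000001]
According to ID2 in the fault information, the content of the preset fault handling policy found in the list is: create a new VNF; migrate the service to the new VNF. That is, a new VNF is created to replace the faulty VNF.
Of course, once the "VNF out of service" entry has been found, the first step of its preset policy, creating the new VNF, can be executed immediately. When the fault processing time arrives, if the fault handling policy has still not been formulated, the service is then switched from the faulty VNF to the new VNF, which speeds up fault handling.
304. The VNFM obtains fault information of other VNFs and performs local fault correlation between this VNF and the other VNFs to obtain local fault correlation information.
It should be noted that steps 304 to 309, together with step 311, constitute the process of formulating the fault handling policy.
305. The VNFM sends the fault information of the VNF and the local fault correlation information to the VNFO.
The VNFM and the VNFO interact through the Or-Vnfm interface described above.
306. The VNFO obtains fault information of at least one NFVI and fault information of other VNFs under the NS, and, based on the fault information, the local fault correlation information, the fault information of the other VNFs, and the fault information of the NFVI, performs fault correlation processing on the VNF, the other faulty VNFs under the NS, and the NFVI to obtain NS correlation information.
The VNFO can monitor the running status of VNFs and the NFVI in real time and can therefore perform correlation at the NS level.
307. The VNFO sends the fault information, the local fault correlation information, and the NS correlation information to the OSS/BSS.
308. The OSS/BSS obtains hardware fault information and performs hardware correlation processing to obtain external fault correlation information.
309. The OSS/BSS sends the external fault correlation information to the VNFO.
310. If the timer value has now reached the fault processing time, the VNFM sends a timeout notification message containing the preset fault handling policy to the VNFO, and handles the fault according to the preset fault handling policy.
If a new VNF was already instantiated after the preset fault handling policy was found, this step only needs to send a service switching notification to the VNF, so that the services running on the faulty VNF are switched to the new VNF.
It should be noted that the timer is started in step 302 and formulation of the fault handling policy begins at the same time from step 304. When step 309 has been executed, the timer value is found to have reached the fault processing time, but the fault handling policy has not yet been formulated, so the fault must first be handled according to the preset fault handling policy.
It should be noted that the timer value reaching the fault processing time means the following: for example, if the fault processing time is set to 3 milliseconds and the timer starts from 0, the timer value has reached the fault processing time when it is exactly 3 milliseconds.
311. The VNFO formulates the fault handling policy according to the external fault correlation information and the preset fault handling policy.
It should be noted that in step 310 the timer value has already reached the fault processing time, so the fault is first handled according to the preset fault handling policy. Meanwhile, formulation of the fault handling policy continues from step 309 and is completed in step 311.
312. The VNFO sends the fault handling policy to the VNFM.
313. The VNFM handles the fault according to the fault handling policy.
In this embodiment of the present invention, the timer value reaches the fault processing time before the fault handling policy has been formulated, so the fault is first handled according to the preset fault handling policy, which prevents the service from being interrupted by an overly long fault handling time. After the fault handling policy has been formulated, the fault is additionally handled according to it, which ensures comprehensive fault handling and resolves the fault at its root.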
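Refer to FIG. 4, which is a schematic flowchart of a fault handling method based on network function virtualization according to another embodiment of the present invention. In the method corresponding to FIG. 4, the timer is set in the VNFM and the VNFM formulates the fault handling policy. The method may include:
Steps 401 to 409 are the same as steps 301 to 309 above and are not described again here.
410. The VNFO sends the external fault correlation information to the VNFM.
411. If the timer value has now reached the fault processing time, the VNFM handles the fault according to the preset fault handling policy.
If a new VNF was already instantiated after the preset fault handling policy was found, this step only needs to send a service switching notification to the VNF, so that the services running on the faulty VNF are switched to the new VNF.
It should be noted that the timer is started in step 402 and formulation of the fault handling policy begins at the same time from step 404. When step 409 has been executed, the timer value is found to have reached the fault processing time, but the fault handling policy has not yet been formulated, so the fault must first be handled according to the preset fault handling policy.
It should be noted that the timer value reaching the fault processing time means the following: for example, if the fault processing time is set to 3 milliseconds and the timer starts from 0, the timer value has reached the fault processing time when it is exactly 3 milliseconds.
412. The VNFM formulates the fault handling policy according to the external fault correlation information and the preset fault handling policy.
It should be noted that, when the fault processing time arrives, the fault handling policy has not been formulated and the fault has already been handled according to the preset fault handling policy, so formulation of the fault handling policy must additionally take into account the preset fault handling policy that has already been applied.
413. The VNFM handles the fault according to the fault handling policy.
This embodiment differs from the embodiment shown in FIG. 3 in that the VNFM formulates the fault handling policy.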
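Refer to FIG. 5, which is a schematic flowchart of a fault handling method based on network function virtualization according to another embodiment of the present invention. In the method corresponding to FIG. 5, the timer is set in the EM and the VNFO formulates the fault handling policy. The method may include:
501. The VNF sends its fault information to the VNFM, where the fault information includes at least a fault ID, a fault type, and fault data.
502. The EM detects that the VNF has failed, or receives fault information sent by the VNF, and starts the timer.
Steps 503 to 509 are the same as steps 303 to 309 in the embodiment shown in FIG. 3 and are not described again here.
510. The EM detects that the timer value has reached the fault processing time and sends a timeout notification message to the VNFM.
511. The VNFM receives the timeout notification message and handles the fault according to the preset fault handling policy.
512. The VNFM sends a timeout notification message to the VNFO, where the timeout notification message carries the preset fault handling policy.
Steps 511 and 512 may be executed at the same time.
513. The VNFO formulates the fault handling policy according to the preset fault handling policy and the external fault correlation information.
514. The VNFO sends the fault handling policy to the VNFM.
515. The VNFM handles the fault according to the fault handling policy.
In this embodiment of the present invention, the timer is set in the EM and the VNFO formulates the fault handling policy. When the EM detects that the timer value has reached the fault processing time, it sends a timeout notification message to the VNFM. The fault handling policy has not been formulated at that point, so the VNFM first handles the fault according to the preset fault handling policy, which prevents the service from being interrupted by an overly long fault handling time. After the fault handling policy has been formulated, the VNFM additionally handles the fault according to it, which ensures comprehensive fault handling and resolves the fault at its root.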
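Refer to FIG. 6, which is a schematic flowchart of a fault handling method based on network function virtualization according to another embodiment of the present invention. In the method corresponding to FIG. 6, the timer is set in the EM and the VNFM formulates the fault handling policy. The method may include:
601. The VNF sends its fault information to the VNFM, where the fault information includes at least a fault ID, a fault type, and fault data.
602. The EM starts the timer after detecting that the VNF has failed or after receiving fault information sent by the VNF.
Steps 603 to 609 are the same as steps 503 to 509 in the embodiment shown in FIG. 5 and are not described again here.
610. The VNFO sends the external fault correlation information to the VNFM.
611. The EM detects that the timer value has reached the fault processing time and sends a timeout notification message to the VNFM.
612. The VNFM receives the timeout notification message and executes the preset fault handling policy.
613. The VNFM formulates the fault handling policy according to the preset fault handling policy and the external fault correlation information.
614. The VNFM additionally handles the fault according to the fault handling policy.
This embodiment differs from the embodiment shown in FIG. 5 in that the VNFM formulates the fault handling policy.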
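Refer to FIG. 7, which is a schematic structural diagram of a function manager according to an embodiment of the present invention. As shown in FIG. 7, a function manager 700 may include:
an obtaining unit 710, configured to obtain fault information of a functional entity;
a processing unit 720, configured to trigger fault correlation processing according to the fault information obtained by the obtaining unit and formulate a fault handling policy according to the result of the fault correlation processing, and, when the fault processing time arrives, to handle the fault according to the fault handling policy if that policy has been formulated, or to handle the fault according to a preset fault handling policy if it has not, where the preset fault handling policy is a policy formulated for faults caused by the functional entity itself.
It can be seen that the obtaining unit 710 obtains the fault information of the functional entity, and the processing unit 720 triggers fault correlation processing according to the fault information and formulates a fault handling policy according to the result. When the fault processing time arrives, the processing unit 720 handles the fault according to the fault handling policy if that policy has been formulated, and according to the preset fault handling policy if it has not. The preset fault handling policy covers only faults caused by the functional entity itself. By setting a fault processing time, the present invention therefore ensures that the service is not interrupted during fault handling and improves user experience.
In some possible embodiments of the present invention, the obtaining unit 710 is specifically configured to obtain fault information of other functional entities within the management scope; the processing unit 720 is specifically configured to perform, according to the fault information and the fault information of the other functional entities within the management scope, local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information, and to formulate the fault handling policy according to the local fault correlation information.
In other possible embodiments of the present invention, the obtaining unit 710 is specifically configured to obtain fault information of other functional entities within the management scope; the processing unit 720 is specifically configured to perform, according to the fault information and the fault information of the other functional entities within the management scope, local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information; send the fault information and the local fault correlation information to a second function management entity, so that the second function management entity obtains fault information of the NFVI and fault information of other functional entities under the NS, and performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault correlation information; and receive the external fault correlation information sent by the second function management entity and formulate the fault handling policy according to it.
Further, when the fault processing time arrives and the fault handling policy has not been formulated, the processing unit 720 is specifically configured to formulate the fault handling policy according to the preset fault handling policy used to handle the fault and the external fault correlation information.
In other possible embodiments of the present invention, the obtaining unit 710 is specifically configured to obtain fault information of other functional entities within the management scope; the processing unit 720 is specifically configured to perform local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information; send the fault information and the local fault correlation information to a second function management entity, so that the second function management entity obtains fault information of the NFVI and fault information of other functional entities under the NS, performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault correlation information, and formulates the fault handling policy based on the external fault correlation information; and receive the fault handling policy sent by the second function management entity.
Further, when the fault processing time arrives and the fault handling policy has not been formulated, the processing unit 720 is specifically configured to send a timeout notification message to the second function management entity, where the timeout notification message carries the preset fault handling policy used to handle the fault, so that the second function management entity formulates the fault handling policy according to the preset fault handling policy and the external fault correlation information.
In some possible embodiments of the present invention, the function manager 700 further includes:
a determining unit, configured to determine, after the fault has been handled according to the preset fault handling policy and once the fault handling policy has been formulated, whether the fault handling policy is the same as the preset fault handling policy;
the processing unit 720 is further configured to additionally handle the fault according to the fault handling policy if the determining unit determines that the fault handling policy differs from the preset fault handling policy.
It can be seen that, on receiving fault information, the function manager of this embodiment triggers fault correlation processing and formulates a fault handling policy according to the result. Because a fault processing time is set according to the minimum latency requirement for service continuity, once fault correlation processing has been triggered, if the fault handling policy has been formulated by the time the fault processing time arrives, that policy can be used directly to handle the fault, which resolves the problem at its root and recovers from the fault. If the fault handling policy has not been formulated when the fault processing time arrives, the fault is first handled with the preset fault handling policy so that the service is not interrupted; once the fault handling policy has been formulated, it is then applied as well. This keeps the service uninterrupted and still resolves the fault at its root, so that the service runs normally.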
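Refer to FIG. 8, which is a schematic diagram of a function manager 800 according to another embodiment of the present invention. The function manager 800 may include at least one bus 801, at least one processor 802 connected to the bus 801, and at least one memory 803 connected to the bus 801.
The processor 802 calls, through the bus 801, the code stored in the memory 803 to obtain fault information of a functional entity, trigger fault correlation processing according to the fault information, and formulate a fault handling policy according to the result of the fault correlation processing; when the fault processing time arrives, if the fault handling policy has been formulated, the fault is handled according to that policy; if it has not been formulated, the fault is handled according to a preset fault handling policy, where the preset fault handling policy is a policy formulated for faults caused by the functional entity itself.
In some possible embodiments of the present invention, the processor 802 may be configured to obtain fault information of other functional entities within the management scope, perform local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information, and formulate the fault handling policy according to the local fault correlation information.
In some possible embodiments of the present invention, the processor 802 may be configured to obtain fault information of other functional entities within the management scope, and perform local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information; send the fault information and the local fault correlation information to a second function management entity, so that the second function management entity obtains fault information of the NFVI and fault information of other functional entities under the NS, and performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault correlation information; and receive the external fault correlation information sent by the second function management entity and formulate the fault handling policy according to it.
In some possible embodiments of the present invention, when the fault processing time arrives and the fault handling policy has not been formulated, the processor 802 may formulate the fault handling policy according to the preset fault handling policy used to handle the fault and the external fault correlation information.
In some possible embodiments of the present invention, the processor 802 may be configured to obtain fault information of other functional entities within the management scope, and perform local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information; send the fault information and the local fault correlation information to a second function management entity, so that the second function management entity obtains fault information of the NFVI and fault information of other functional entities under the NS, performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault correlation information, and formulates the fault handling policy based on the external fault correlation information; and receive the fault handling policy sent by the second function management entity.
In some possible embodiments of the present invention, when the fault processing time arrives and the fault handling policy has not been formulated, the processor 802 may send a timeout notification message to the second function management entity, where the timeout notification message carries the preset fault handling policy used to handle the fault, so that the second function management entity formulates the fault handling policy according to the preset fault handling policy and the external fault correlation information.
In some possible embodiments of the present invention, after handling the fault according to the preset fault handling policy, the processor 802 may, once the fault handling policy has been formulated, determine whether the fault handling policy is the same as the preset fault handling policy; if they differ, the fault is additionally handled according to the fault handling policy.
In some possible embodiments of the present invention, the memory 803 may be used to store the fault handling policy and the preset fault handling policy.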
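It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of actions. A person skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or at the same time. A person skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.
To better implement the foregoing solutions of the embodiments of the present invention, the embodiments of the present invention further provide related devices for implementing them.
In the foregoing embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of other embodiments.
In the embodiments provided in this application, it should be understood that the disclosed apparatus may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function; in an actual implementation there may be other divisions. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated in one processing unit, each unit may exist physically on its own, or two or more units may be integrated in one unit. The integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented as a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods in the embodiments of the present invention. The storage medium includes media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
In summary, the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of their technical features can be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.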

Claims (21)

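  1. A fault handling method based on network function virtualization, comprising:
    obtaining, by a first function management entity, fault information of a functional entity;
    triggering, by the first function management entity, fault correlation processing according to the fault information, and formulating a fault handling policy according to the result of the fault correlation processing;
    when the fault processing time arrives, if the fault handling policy has been formulated, handling, by the first function management entity, the fault according to the fault handling policy; if the fault handling policy has not been formulated, handling, by the first function management entity, the fault according to a preset fault handling policy, wherein the preset fault handling policy is a policy formulated for faults caused by the functional entity itself.
  2. The method according to claim 1, wherein
    the triggering of fault correlation processing and the formulating of the fault handling policy according to the result of the fault correlation processing comprise:
    obtaining, by the first function management entity, fault information of other functional entities within a management scope, and performing local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information;
    formulating, by the first function management entity, the fault handling policy according to the local fault correlation information.
  3. The method according to claim 1, wherein
    the triggering of fault correlation processing and the formulating of the fault handling policy according to the result of the fault correlation processing comprise:
    obtaining, by the first function management entity, fault information of other functional entities within a management scope, and performing local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information;
    sending, by the first function management entity, the fault information and the local fault correlation information to a second function management entity, so that the second function management entity obtains fault information of a network function virtualization infrastructure NFVI and fault information of other functional entities under a network service NS, and performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault correlation information;
    receiving, by the first function management entity, the external fault correlation information sent by the second function management entity, and formulating the fault handling policy according to the external fault correlation information.
  4. The method according to claim 3, wherein
    when the fault processing time arrives and the fault handling policy has not been formulated, the formulating of the fault handling policy according to the external fault correlation information comprises:
    formulating, by the first function management entity, the fault handling policy according to the preset fault handling policy used to handle the fault and the external fault correlation information.
  5. The method according to claim 1, wherein
    the triggering of fault correlation processing and the formulating of the fault handling policy according to the result of the fault correlation processing comprise:
    obtaining, by the first function management entity, fault information of other functional entities within a management scope, and performing local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information;
    sending, by the first function management entity, the fault information and the local fault correlation information to a second function management entity, so that the second function management entity obtains fault information of a network function virtualization infrastructure NFVI and fault information of other functional entities under a network service NS, performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault correlation information, and formulates the fault handling policy based on the external fault correlation information;
    receiving, by the first function management entity, the fault handling policy sent by the second function management entity.
  6. The method according to claim 5, wherein
    when the fault processing time arrives and the fault handling policy has not been formulated, the formulating of the fault handling policy based on the external fault correlation information comprises:
    sending, by the first function management entity, a timeout notification message to the second function management entity, wherein the timeout notification message carries the preset fault handling policy used to handle the fault, so that the second function management entity formulates the fault handling policy according to the preset fault handling policy and the external fault correlation information.
  7. The method according to any one of claims 1 to 6, wherein after the fault is handled according to the preset fault handling policy, the method further comprises:
    if the fault handling policy has been formulated, determining, by the first function management entity, whether the fault handling policy is the same as the preset fault handling policy;
    if they differ, additionally handling, by the first function management entity, the fault according to the fault handling policy.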
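  8. A function manager, comprising:
    an obtaining unit, configured to obtain fault information of a functional entity;
    a processing unit, configured to trigger fault correlation processing according to the fault information obtained by the obtaining unit and formulate a fault handling policy according to the result of the fault correlation processing, and, when the fault processing time arrives, to handle the fault according to the fault handling policy if the fault handling policy has been formulated, or to handle the fault according to a preset fault handling policy if the fault handling policy has not been formulated, wherein the preset fault handling policy is a policy formulated for faults caused by the functional entity itself.
  9. The function manager according to claim 8, wherein
    the obtaining unit is specifically configured to obtain fault information of other functional entities within a management scope;
    the processing unit is specifically configured to perform, according to the fault information and the fault information of the other functional entities within the management scope, local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information, and to formulate the fault handling policy according to the local fault correlation information.
  10. The function manager according to claim 8, wherein
    the obtaining unit is specifically configured to obtain fault information of other functional entities within a management scope;
    the processing unit is specifically configured to perform local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information; send the fault information and the local fault correlation information to a second function management entity, so that the second function management entity obtains fault information of a network function virtualization infrastructure NFVI and fault information of other functional entities under a network service NS, and performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault correlation information; and receive the external fault correlation information sent by the second function management entity and formulate the fault handling policy according to the external fault correlation information.
  11. The function manager according to claim 10, wherein
    the processing unit is specifically configured to, when the fault processing time arrives and the fault handling policy has not been formulated, formulate the fault handling policy according to the preset fault handling policy used to handle the fault and the external fault correlation information.
  12. The function manager according to claim 8, wherein
    the obtaining unit is specifically configured to obtain fault information of other functional entities within a management scope;
    the processing unit is specifically configured to perform local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information; send the fault information and the local fault correlation information to a second function management entity, so that the second function management entity obtains fault information of the NFVI and fault information of other functional entities under a network service NS, performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault correlation information, and formulates the fault handling policy based on the external fault correlation information; and receive the fault handling policy sent by the second function management entity.
  13. The function manager according to claim 12, wherein
    the processing unit is specifically configured to, when the fault processing time arrives and the fault handling policy has not been formulated, send a timeout notification message to the second function management entity, wherein the timeout notification message carries the preset fault handling policy used to handle the fault, so that the second function management entity formulates the fault handling policy according to the preset fault handling policy and the external fault correlation information.
  14. The function manager according to any one of claims 8 to 13, wherein
    the function manager further comprises:
    a determining unit, configured to determine, after the fault has been handled according to the preset fault handling policy and once the fault handling policy has been formulated, whether the fault handling policy is the same as the preset fault handling policy;
    the processing unit is further configured to additionally handle the fault according to the fault handling policy if the determining unit determines that the fault handling policy differs from the preset fault handling policy.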
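  15. A function manager, comprising at least one processor and at least one memory, wherein the processor and the memory are connected through at least one bus;
    the processor is configured to obtain fault information of a functional entity, trigger fault correlation processing according to the fault information, and formulate a fault handling policy according to the result of the fault correlation processing; when the fault processing time arrives, if the fault handling policy has been formulated, handle the fault according to the fault handling policy; if the fault handling policy has not been formulated, handle the fault according to a preset fault handling policy, wherein the preset fault handling policy is a policy formulated for faults caused by the functional entity itself;
    the memory is configured to store the preset fault handling policy and the formulated fault handling policy.
  16. The function manager according to claim 15, wherein
    the processor is specifically configured to obtain fault information of other functional entities within a management scope, perform local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information, and formulate the fault handling policy according to the local fault correlation information.
  17. The function manager according to claim 15, wherein
    the function manager is specifically configured to obtain fault information of other functional entities within a management scope, and perform local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information; send the fault information and the local fault correlation information to a second function management entity, so that the second function management entity obtains fault information of a network function virtualization infrastructure NFVI and fault information of other functional entities under a network service NS, and performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault correlation information; and receive the external fault correlation information sent by the second function management entity and formulate the fault handling policy according to the external fault correlation information.
  18. The function manager according to claim 17, wherein
    the function manager is specifically configured to formulate the fault handling policy according to the preset fault handling policy used to handle the fault and the external fault correlation information.
  19. The function manager according to claim 15, wherein
    the function manager is specifically configured to obtain fault information of other functional entities within a management scope, and perform local fault correlation processing on the functional entity and the other functional entities within the management scope to obtain local fault correlation information; send the fault information and the local fault correlation information to a second function management entity, so that the second function management entity obtains fault information of a network function virtualization infrastructure NFVI and fault information of other functional entities under a network service NS, performs external fault correlation processing on the functional entity, the other faulty functional entities under the NS, and the NFVI to obtain external fault correlation information, and formulates the fault handling policy based on the external fault correlation information; and receive the fault handling policy sent by the second function management entity.
  20. The function manager according to claim 19, wherein
    the function manager is specifically configured to send a timeout notification message to the second function management entity, wherein the timeout notification message carries the preset fault handling policy used to handle the fault, so that the second function management entity formulates the fault handling policy according to the preset fault handling policy and the external fault correlation information.
  21. The function manager according to any one of claims 15 to 20, wherein
    the processor is further configured to determine, once the fault handling policy has been formulated, whether the fault handling policy is the same as the preset fault handling policy, and, if they differ, to additionally handle the fault according to the fault handling policy.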
PCT/CN2015/074580 2015-03-19 2015-03-19 Fault handling method and device based on network function virtualization WO2016145653A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
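PCT/CN2015/074580 WO2016145653A1 (zh) 2015-03-19 2015-03-19 Fault handling method and device based on network function virtualization
CN201580035071.8A CN106464541B (zh) 2015-03-19 2015-03-19 Fault handling method and device based on network function virtualization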
US15/708,388 US10565047B2 (en) 2015-03-19 2017-09-19 Troubleshooting method based on network function virtualization, and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/074580 WO2016145653A1 (zh) 2015-03-19 2015-03-19 Fault handling method and device based on network function virtualization

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/708,388 Continuation US10565047B2 (en) 2015-03-19 2017-09-19 Troubleshooting method based on network function virtualization, and device

Publications (1)

Publication Number Publication Date
WO2016145653A1 true WO2016145653A1 (zh) 2016-09-22

Family

ID=56918287

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/074580 WO2016145653A1 (zh) 2015-03-19 2015-03-19 Fault handling method and device based on network function virtualization

Country Status (3)

Country Link
US (1) US10565047B2 (zh)
CN (1) CN106464541B (zh)
WO (1) WO2016145653A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11003516B2 (en) 2017-07-24 2021-05-11 At&T Intellectual Property I, L.P. Geographical redundancy and dynamic scaling for virtual network functions

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10116514B1 (en) * 2015-03-30 2018-10-30 Amdocs Development Limited System, method and computer program for deploying an orchestration layer for a network based on network function virtualization (NFV)
WO2017097356A1 (en) * 2015-12-09 2017-06-15 Telefonaktiebolaget Lm Ericsson (Publ) Technique for reporting and processing alarm conditions occurring in a communication network
CN109995568B (zh) * 2018-01-02 2022-03-29 中国移动通信有限公司研究院 故障联动处理方法、网元及存储介质

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
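CN101867958A (zh) * 2010-06-18 2010-10-20 中兴通讯股份有限公司 Method and system for managing wireless sensor network terminals
CN104170323A (zh) * 2014-04-09 2014-11-26 华为技术有限公司 Fault handling method, apparatus, and system based on network function virtualization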

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7454761B1 (en) * 2002-12-20 2008-11-18 Cisco Technology, Inc. Method and apparatus for correlating output of distributed processes
US7284155B2 (en) * 2004-03-29 2007-10-16 Hewlett-Packard Development Company, L.P. Remote software support agent system with electronic documents for troubleshooting
US7490073B1 (en) * 2004-12-21 2009-02-10 Zenprise, Inc. Systems and methods for encoding knowledge for automated management of software application deployments
CN101420326B (zh) * 2008-12-02 2011-02-16 华为技术有限公司 Method, system, and apparatus for implementing fault recovery and data backup
US20100229022A1 (en) * 2009-03-03 2010-09-09 Microsoft Corporation Common troubleshooting framework
US20110170134A1 (en) * 2010-01-12 2011-07-14 Kabushiki Kaisha Toshiba Image forming apparatus and maintenance method of image forming apparatus
CN101799776A (zh) * 2010-02-25 2010-08-11 上海华为技术有限公司 Multi-core processor fault handling method, multi-core processor, and communication device
US20120117227A1 (en) * 2010-11-10 2012-05-10 Sony Corporation Method and apparatus for obtaining feedback from a device
US9223632B2 (en) * 2011-05-20 2015-12-29 Microsoft Technology Licensing, Llc Cross-cloud management and troubleshooting
GB2492328A (en) * 2011-06-24 2013-01-02 Ge Aviat Systems Ltd Updating troubleshooting procedures for aircraft maintenance
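JP5684946B2 (ja) * 2012-03-23 2015-03-18 株式会社日立製作所 Method and system for supporting analysis of the root cause of an event
CN102904610B (zh) * 2012-08-29 2014-12-10 华为技术有限公司 Node recovery method and device in power line communication
JP2014064252A (ja) * 2012-09-24 2014-04-10 Hitachi Ltd Network system, transmission device, and fault information notification method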
US9069737B1 (en) * 2013-07-15 2015-06-30 Amazon Technologies, Inc. Machine learning based instance remediation
US9697067B2 (en) * 2013-11-08 2017-07-04 Hitachi, Ltd. Monitoring system and monitoring method
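CN103607349B (zh) * 2013-11-14 2017-02-22 华为技术有限公司 Method for determining a route in a virtual network, and provider edge device
CN104199753B (zh) * 2014-09-04 2018-05-29 中标软件有限公司 Virtual machine application service fault recovery system and fault recovery method thereof
CN104410672B (zh) * 2014-11-12 2017-11-24 华为技术有限公司 Method for upgrading a network function virtualization application, and method and apparatus for forwarding services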

Also Published As

Publication number Publication date
CN106464541B (zh) 2019-09-20
CN106464541A (zh) 2017-02-22
US20180004589A1 (en) 2018-01-04
US10565047B2 (en) 2020-02-18

Similar Documents

Publication Publication Date Title
JP6026705B2 (ja) Update management system and update management method
EP3291499B1 (en) Method and apparatus for network service capacity expansion
Taleb et al. On service resilience in cloud-native 5G mobile systems
US10680874B2 (en) Network service fault handling method, service management system, and system management module
CA2970824C (en) System and method for elastic scaling using a container-based platform
JP6466003B2 (ja) Method and apparatus for VNF failover
Bailis et al. The network is reliable: An informal survey of real-world communications failures
KR102439559B1 (ko) Alarm method and device
WO2016155394A1 (zh) Method and apparatus for establishing a link between virtual network functions
US10698741B2 (en) Resource allocation method for VNF and apparatus
US10565047B2 (en) Troubleshooting method based on network function virtualization, and device
US10764939B2 (en) Network function processing method and related device
US11886904B2 (en) Virtual network function VNF deployment method and apparatus
JP6546340B2 (ja) Automatic symptom data collection in cloud deployment
WO2018137520A1 (zh) Service recovery method and apparatus
US10120779B1 (en) Debugging of hosted computer programs
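WO2019011142A1 (zh) Method and system for network link switching
WO2020108443A1 (zh) Virtualization management method and apparatus
JP7260820B2 (ja) Processing device, processing system, processing method, and processing program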
WO2023200891A1 (en) Microservices for centralized unit user plane (cu-up) and centralized unit control plane (cu-cp) standby pods in a cloud-native fifth generation (5g) wireless telecommunication network
JP2023527929A (ja) Virtualized network service deployment method and apparatus
CN116016229A (zh) Method and apparatus for deploying a container service
NFV ETSI GS NFV-REL 001 V1.1.1 (2015-01)

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 15885036

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 15885036

Country of ref document: EP

Kind code of ref document: A1