WO2018137520A1 - Service recovery method and apparatus - Google Patents

Service recovery method and apparatus

Info

Publication number
WO2018137520A1
Authority
WO
WIPO (PCT)
Prior art keywords
functional entity
entity
target
service recovery
service
Prior art date
Application number
PCT/CN2018/072909
Other languages
English (en)
French (fr)
Inventor
黄胜森
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2018137520A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0631Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0709Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0766Error or fault reporting or storing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/079Root cause analysis, i.e. error or fault diagnosis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793Remedial or corrective actions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2002Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where interconnections or communication control functionality are redundant
    • G06F11/2005Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where interconnections or communication control functionality are redundant using redundant communication controllers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2023Failover techniques
    • G06F11/2028Failover techniques eliminating a faulty processor or activating a spare
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2048Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant where the redundant components share neither address space nor persistent storage
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0654Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0668Management of faults, events, alarms or notifications using network fault recovery by dynamic selection of recovery network elements, e.g. replacement by the most appropriate element after failure
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0686Additional information in the notification, e.g. enhancement of specific meta-data
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08Configuration management of networks or network elements
    • H04L41/0895Configuration of virtualised networks or elements, e.g. virtualised network function or OpenFlow elements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/40Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using virtualisation of network functions or resources, e.g. SDN or NFV entities
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00Routing or path finding of packets in data switching networks
    • H04L45/22Alternate routing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00Routing or path finding of packets in data switching networks
    • H04L45/28Routing or path finding of packets in data switching networks using route fault recovery
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/815Virtual
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08Configuration management of networks or network elements
    • H04L41/0894Policy-based network configuration management
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00Routing or path finding of packets in data switching networks
    • H04L45/24Multipath
    • H04L45/247Multipath using M:N active or standby paths

Definitions

  • the present application relates to the field of communications technologies, and in particular, to a service recovery method and apparatus.
  • a traditional telecommunication network consists of a variety of dedicated hardware devices, and different network services require different hardware devices.
  • As telecommunication networks have grown larger and more complex, many problems have arisen: new services are developed slowly, system operation and maintenance is complicated and costly, and resource utilization is low.
  • Network Function Virtualization (NFV) technology can solve the above problems well.
  • NFV technology can be understood as migrating the functions of each network element used in a telecommunication network from the current dedicated hardware platform to a general-purpose hardware platform.
  • each network element in the telecom network is transformed into a stand-alone application and deployed flexibly on a unified infrastructure platform built on standards-based servers, storage, and switches.
  • Virtualization technology turns general-purpose COTS computing, storage, and network hardware into the virtual resources required by upper-layer applications such as virtualized network functions (VNFs), decoupling applications from hardware.
  • When a service fails, the upper-layer functional entity of the failed functional entity (for example, a VNF or the NFV Infrastructure (NFVI)) captures the fault information, or the failed functional entity reports the fault information to its upper-layer functional entity. If the upper-layer functional entity cannot identify the fault information, the information is reported further up, level by level, until operation and maintenance personnel analyze it to determine the specific cause of the fault. They may also need to consult logs, operation history, and other information before the cause is clear, and only then can corresponding measures be taken to recover the faulty service.
  • the embodiment of the present application provides a service recovery method and device, which can quickly and accurately implement service recovery, and effectively shorten the duration of service failure.
  • An embodiment of the present application provides a service recovery method applied to a network that includes multiple functional entities, the multiple functional entities including a first functional entity. The method includes: the first functional entity acquires a first event notification message, which may be sent by another functional entity or detected locally by the first functional entity, and which carries the identifier of a first target functional entity to be repaired; the first functional entity queries a pre-created first correspondence set for the first replaceable minimum functional entity and the first service recovery policy corresponding to the first target functional entity, and executes the first service recovery policy on the first replaceable minimum functional entity. Service can thus be recovered quickly and accurately, effectively shortening the duration of the service failure.
  • the first replaceable minimum functional entity is the first target functional entity itself.
  • the first replaceable minimum functional entity is a second functional entity associated with the first target functional entity among the pre-created associations.
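The lookup described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the identifiers (`vnfc-1`, `cpu-7`, `vm-3`), the policy names, and the `Correspondence` record are all hypothetical, and the correspondence set is assumed to be a simple in-memory dictionary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Correspondence:
    # None means the target entity itself is the replaceable minimum entity;
    # otherwise this names an associated entity (hypothetical identifiers).
    replaceable_entity: Optional[str]
    recovery_policy: str


# Hypothetical first correspondence set, keyed by the identifier of the
# functional entity to be repaired.
FIRST_CORRESPONDENCE_SET = {
    "vnfc-1": Correspondence(None, "restart_component"),
    "cpu-7": Correspondence("vm-3", "rebuild_vm"),
}


def resolve(target_id, correspondence_set=FIRST_CORRESPONDENCE_SET):
    """Return (replaceable minimum entity, service recovery policy)."""
    entry = correspondence_set[target_id]
    entity = entry.replaceable_entity or target_id
    return entity, entry.recovery_policy


print(resolve("vnfc-1"))
print(resolve("cpu-7"))
```

The two entries correspond to the two cases in the text: the target entity is its own replaceable minimum entity, or an associated entity takes that role.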
  • The first functional entity may query the first correspondence set for the resource redundancy mode corresponding to the first target functional entity, and execute the first service recovery policy on the first replaceable minimum functional entity according to that resource redundancy mode.
  • the resource redundancy mode may be any one of active standby redundancy, load sharing redundancy, thread pool redundancy, and network path redundancy.
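One way to picture executing a policy "according to the resource redundancy mode" is a dispatch table keyed by mode. The four modes come from the text; the action attached to each mode is an illustrative assumption only, since the text here does not say what each mode entails.

```python
from enum import Enum


class RedundancyMode(Enum):
    ACTIVE_STANDBY = "active/standby redundancy"
    LOAD_SHARING = "load sharing redundancy"
    THREAD_POOL = "thread pool redundancy"
    NETWORK_PATH = "network path redundancy"


# Hypothetical actions, one per mode (assumptions, not taken from the source).
ACTIONS = {
    RedundancyMode.ACTIVE_STANDBY: lambda e: f"switch {e} to its standby instance",
    RedundancyMode.LOAD_SHARING: lambda e: f"remove {e} from the load-sharing group",
    RedundancyMode.THREAD_POOL: lambda e: f"replace {e} with a spare pool thread",
    RedundancyMode.NETWORK_PATH: lambda e: f"reroute traffic around {e}",
}


def execute_policy(entity, mode):
    """Pick the recovery action that matches the entity's redundancy mode."""
    return ACTIONS[mode](entity)


print(execute_policy("vm-3", RedundancyMode.ACTIVE_STANDBY))
```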
  • The first functional entity determines whether it can execute the first service recovery policy on the first replaceable minimum functional entity, for example according to the remaining amount of resources. If the remaining resources meet the requirement for executing the first service recovery policy, the first functional entity determines that it can execute the policy and executes it on the first replaceable minimum functional entity.
  • The multiple functional entities further include a third functional entity. The first functional entity determines whether it can execute the first service recovery policy on the first replaceable minimum functional entity, for example according to the remaining amount of resources. If the remaining resources cannot meet the requirement for executing the first service recovery policy, the first functional entity determines that it cannot execute the policy and may send a second event notification message to the third functional entity. The second event notification message carries the identifier of the first target functional entity, so that the third functional entity can determine the second target functional entity to be repaired according to that identifier and then query a pre-created second correspondence set.
  • The multiple functional entities further include a fourth functional entity. The first functional entity may further send a third event notification message, carrying the identifier of the first target functional entity, to the fourth functional entity. The fourth functional entity can then determine the third target functional entity to be repaired according to that identifier, query a pre-created third correspondence set for the third replaceable minimum functional entity and the third service recovery policy corresponding to the third target functional entity, and execute the third service recovery policy on the third replaceable minimum functional entity. In this way, after the first functional entity performs service recovery locally, other functional entities can additionally be notified to recover the first target functional entity.
  • The first functional entity may determine whether the service of the first replaceable minimum functional entity has been successfully recovered. If it has not, the first functional entity performs the step of sending the third event notification message to the fourth functional entity. Other functional entities are thereby notified promptly to perform service recovery when the first functional entity's recovery is unsuccessful, which can effectively shorten the duration of the service failure.
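The decision flow in the preceding paragraphs (check remaining resources, execute locally or hand over to a third entity, and notify a fourth entity if local recovery fails) can be sketched as below. The class, the message names, and the integer resource model are assumptions made for illustration; the `execute` callable stands in for whatever actually runs the recovery policy.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionalEntity:
    name: str
    free_resources: int  # hypothetical measure of remaining resources
    sent: list = field(default_factory=list)  # (receiver, message, target_id)

    def notify(self, receiver, message, target_id):
        # Record the outgoing event notification instead of sending it.
        self.sent.append((receiver, message, target_id))


def recover(first, target_id, required_resources, execute,
            third="third", fourth="fourth"):
    """Run the local recovery attempt; `execute` returns True on success."""
    if first.free_resources < required_resources:
        # Cannot execute the policy locally: hand over to the third entity.
        first.notify(third, "second_event_notification", target_id)
        return "escalated"
    if execute(target_id):
        return "recovered"
    # Local recovery was attempted but failed: notify the fourth entity.
    first.notify(fourth, "third_event_notification", target_id)
    return "notified_after_failure"


entity = FunctionalEntity("first", free_resources=1)
print(recover(entity, "vm-3", 2, lambda t: True), entity.sent)
```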
  • Another aspect of the embodiments of the present application provides a service recovery apparatus applied to a network including multiple functional entities, the multiple functional entities including the service recovery apparatus. The apparatus includes an acquisition module, a query module, a processing module, and a sending module, each of which is used to perform the methods described in the above aspects.
  • A still further aspect of the embodiments of the present application provides a service recovery apparatus including a processor, a transceiver, and a memory connected by a bus. The memory stores executable program code, the transceiver is controlled by the processor to send and receive messages, and the processor is configured to invoke the executable program code to perform the methods described in the above aspects.
  • Yet another aspect of an embodiment of the present application provides a computer readable storage medium having instructions stored therein that, when executed on a computer, cause the computer to perform the methods described in the above aspects.
  • Yet another aspect of an embodiment of the present application provides a computer program product comprising instructions that, when run on a computer, cause the computer to perform the methods described in the various aspects above.
  • In the embodiments of the present application, the first functional entity obtains a first event notification message that carries the identifier of the first target functional entity to be repaired. The first functional entity queries the pre-created first correspondence set for the first replaceable minimum functional entity and the first service recovery policy corresponding to the first target functional entity, and executes the first service recovery policy on the first replaceable minimum functional entity, so that service is recovered quickly and accurately and the duration of the service failure is effectively shortened.
  • FIG. 1 is a schematic structural diagram of an NFV network according to an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a service recovery method according to an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of another service recovery method according to an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of still another service recovery method according to an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a service recovery apparatus according to an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a service recovery apparatus according to an embodiment of the present application.
  • FIG. 1 is a schematic structural diagram of an NFV network according to an embodiment of the present application.
  • The NFV network described in this embodiment may be a data center network, a service provider network, or a local area network (LAN), and may include the following functional entities: NFV Management and Orchestration (NFV-MANO) 128, NFV Infrastructure (NFVI) 130, multiple virtual network functions (VNFs) 108, and multiple Element Management Systems (EMSs) 122.
  • The NFV network described in this embodiment may further include the following functional entities: Service, VNF and Infrastructure Description 126, and one or more Operational Support Systems/Business Support Systems (OSS/BSS) 124.
  • The architecture of the NFV network shown in FIG. 1 is only an example; the NFV network may also take other architectural forms, which is not specifically limited in this embodiment. Specifically:
  • the NFV-MANO 128 may include an NFV Orchestrator (NFVO) 102, one or more VNF Managers (VNFMs) 104, and one or more Virtualized Infrastructure Managers (VIMs) 106.
  • The NFVI 130 may include a hardware resource layer consisting of computing hardware 112, storage hardware 114, and network hardware 116; a virtualization layer; and a virtual resource layer consisting of virtual computing 110 (e.g., virtual machines (VMs)), virtual storage 118, and virtual network 120.
  • Computing hardware 112 can be a dedicated processor or a general purpose processor for providing processing and computing functions.
  • the storage hardware 114 is configured to provide storage capabilities, which may be provided by the storage hardware 114 itself (eg, a server's local memory), or may be provided over a network (eg, the server connects to a network storage device over a network).
  • Network hardware 116 may be a switch, router, and/or other network device, and network hardware 116 is used to enable communication between multiple devices, with wireless or wired connections between multiple devices.
  • The virtualization layer in NFVI 130 is used to abstract the hardware resources of the hardware resource layer, decouple the VNF 108 from the physical layer to which the hardware resources belong, and provide virtual resources to the VNF 108.
  • Virtual resources may include virtual computing 110, virtual storage 118, and virtual network 120.
  • Virtual computing 110 and virtual storage 118 may be provided to VNF 108 in the form of a virtual machine or other virtual container; for example, VNF 108 may be deployed on a virtual machine or other virtual container.
  • The virtualization layer forms virtual network 120 by abstracting network hardware 116.
  • Virtual network 120, such as a virtual switch (e.g., a vSwitch), is used to enable communication between multiple virtual machines, or between other types of virtual containers hosting VNFs.
  • Network hardware can be virtualized using technologies such as virtual LAN (VLAN), Virtual Private LAN Service (VPLS), Virtual eXtensible Local Area Network (VxLAN), or Network Virtualization using Generic Routing Encapsulation (NVGRE).
  • OSS/BSS 124 mainly serves telecom service operators, providing comprehensive network management and service operation functions, including network management (such as fault monitoring and network information collection), billing management, and customer service management.
  • The Service, VNF and Infrastructure Description 126 is described in detail in the ETSI GS NFV 002 v1.1.1 standard and is not described further in this application.
  • The NFV-MANO 128 can be used to monitor and manage VNF 108 and NFVI 130.
  • the NFVO 102 can communicate with one or more VNFMs 104 to implement resource related requests, send configuration information to the VNFM 104, and collect status information for the VNF 108.
  • the NFVO 102 can also communicate with the VIM 106 to enable resource allocation, and/or to implement configuration and status information reservation and exchange of virtualized hardware resources.
  • the VNFM 104 can be used to manage one or more VNFs 108, performing various management functions such as initializing, updating, querying, and/or terminating the VNF 108.
  • the VIM 106 can be used to control and manage the interaction of the VNF 108 and computing hardware 112, storage hardware 114, network hardware 116, virtual computing 110, virtual storage 118, virtual network 120.
  • VIM 106 can be used to perform resource allocation operations to VNF 108.
  • the VNFM 104 and VIM 106 can communicate with one another to exchange virtualized hardware resource configuration and status information.
  • the NFV-MANO may be deployed on a general physical network device or a physical server, or may be deployed on a VM, which is not limited in this embodiment.
  • NFVI 130 includes both hardware and software that together create a virtualized environment to deploy, manage, and execute VNF 108.
  • the hardware resource layer and the virtual resource layer are used to provide virtual resources, such as virtual machines and/or other forms of virtual containers, to the VNF 108.
  • the VNFM 104 can communicate with the VNF 108 and the EMS 122 to perform VNF lifecycle management and implement exchange of configuration information/status information.
  • the VNF 108 is a virtualization of at least one network function that was previously provided by a physical network device.
  • the EMS 122 can be used to manage one or more VNFs 108.
  • the multiple VNFs 108 together form a Network Service (NS) for use by the user.
  • Each VNF 108 can run one or more VNF components (VNFCs). A VNF 108 is equivalent to a network node entity, that is, a network element (NE).
  • The embodiment of the present application configures a correspondence set for determining the replaceable minimum functional entity. The correspondence set records, for each functional entity to be repaired, the corresponding replaceable minimum functional entity and service recovery policy, so that both can be obtained quickly by querying the set. The correspondence set can be configured when the NFV network is constructed, or at any other feasible time, which is not specifically limited in the embodiment of the present application.
  • the minimum functional entity that can be replaced has the characteristics of being automatically replaced by the NFV network and being minimally affected.
  • the service can be quickly restored by replacing the smallest functional entity that can be replaced.
  • The replaceable minimum functional entity may be the functional entity to be repaired itself, or may be a functional entity associated with the functional entity to be repaired, including an upper-layer functional entity of the functional entity to be repaired.
  • For example, if a central processing unit (CPU) fails, the CPU is the functional entity to be repaired. Because neither the CPU nor the host/server where the CPU is located can be automatically replaced by the NFV network, the replaceable minimum functional entity is a VM or other form of virtual container running on that host/server.
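The CPU example can be made concrete with a small sketch. It assumes, purely for illustration, that each kind of entity carries a flag saying whether the NFV network can replace it automatically, and that identifiers encode the kind as a prefix (`cpu-7`, `host-1`, `vm-3`); none of these names or tables come from the source.

```python
# Hypothetical flags: which kinds of entity the NFV network can replace
# automatically.
AUTO_REPLACEABLE = {"cpu": False, "host": False, "vm": True}

# Hypothetical topology: hardware component -> host, and host -> VMs.
LOCATED_ON = {"cpu-7": "host-1"}
RUNS_ON = {"host-1": ["vm-3", "vm-4"]}


def kind(entity_id):
    """Extract the entity kind from an identifier such as 'cpu-7'."""
    return entity_id.split("-")[0]


def replaceable_minimum(entity_id):
    """Return the replaceable minimum functional entities for a fault."""
    if AUTO_REPLACEABLE[kind(entity_id)]:
        return [entity_id]  # the faulty entity itself can be replaced
    # Otherwise fall back to the VMs running on the affected host.
    host = LOCATED_ON.get(entity_id, entity_id)
    return [vm for vm in RUNS_ON.get(host, []) if AUTO_REPLACEABLE[kind(vm)]]


print(replaceable_minimum("cpu-7"))
```

In the patent the result is read directly from the preconfigured correspondence set; deriving it from flags as above is only one conceivable way such a set might be populated.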
  • A service recovery policy is the specific implementation scheme used when recovering the service of a replaceable minimum functional entity.
  • The correspondence set for determining the replaceable minimum functional entity may further include a correspondence between the functional entity to be repaired and a resource redundancy mode.
  • The resource redundancy mode is the form in which redundant resources are set up for each functional entity, and may be active/standby redundancy, load sharing redundancy, thread pool redundancy, or network path redundancy. The correspondence set is queried to obtain the resource redundancy mode corresponding to the functional entity to be repaired, and the corresponding service recovery policy is executed on the replaceable minimum functional entity according to that resource redundancy mode.
  • The embodiment of the present application also creates an association relationship set for determining associated functional entities. The set records the association relationships between functional entities, and may include the association between hosts and VMs, between VMs and VNFCs, and between VNFs and VNFCs.
  • The association between a host and VMs identifies the VMs running on that host; one or more VMs may run on one host.
  • The association between a VM and a VNFC identifies the VNFC running on that VM; one VNFC may run on one VM.
  • The association between a VNF and VNFCs identifies the VNFCs included in that VNF; one VNF may include one or more VNFCs.
  • the association relationship may be generated when the NFV network is constructed, and updated and maintained during the operation of the NFV network to ensure that the association relationship may reflect the latest association relationship between the functional entities.
  • the association relationship set may be generated at other feasible times, which is not specifically limited in this embodiment.
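A minimal sketch of such an association relationship set, assuming it is held in memory as three mappings (host to VMs, VM to VNFCs, VNF to VNFCs). The class and method names are hypothetical; the `remove_vm` method stands in for the "updated and maintained during operation" requirement.

```python
from collections import defaultdict


class AssociationSet:
    """Three association tables: host-VM, VM-VNFC, and VNF-VNFC."""

    def __init__(self):
        self.host_to_vms = defaultdict(set)
        self.vm_to_vnfcs = defaultdict(set)
        self.vnf_to_vnfcs = defaultdict(set)

    def add(self, host, vm, vnfc, vnf):
        # Record that `vnfc` (part of `vnf`) runs on `vm`, hosted by `host`.
        self.host_to_vms[host].add(vm)
        self.vm_to_vnfcs[vm].add(vnfc)
        self.vnf_to_vnfcs[vnf].add(vnfc)

    def remove_vm(self, vm):
        # Keep the tables current when a VM disappears.
        for vms in self.host_to_vms.values():
            vms.discard(vm)
        removed = self.vm_to_vnfcs.pop(vm, set())
        for members in self.vnf_to_vnfcs.values():
            members.difference_update(removed)

    def vnfs_affected_by_host(self, host):
        # Follow host -> VMs -> VNFCs, then find the VNFs containing them.
        vnfcs = {c for vm in self.host_to_vms.get(host, ())
                 for c in self.vm_to_vnfcs.get(vm, ())}
        return sorted(vnf for vnf, members in self.vnf_to_vnfcs.items()
                      if members & vnfcs)


assoc = AssociationSet()
assoc.add("host-1", "vm-1", "vnfc-a", "vnf-x")
assoc.add("host-1", "vm-2", "vnfc-b", "vnf-y")
print(assoc.vnfs_affected_by_host("host-1"))
```

Chaining the three tables is what lets a fault on a low-level entity be traced to the higher-level entities it affects.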
  • The correspondence set for determining the replaceable minimum functional entity and the association relationship set for determining associated functional entities may be stored in a distributed manner on each functional entity, or may be stored centrally on a specific functional entity that the other functional entities can access, which is not limited in this embodiment.
  • An event notification message may be generated in the NFV network. A functional entity in the NFV network may detect the event notification message locally, or may receive an event notification message sent by another functional entity. Sending by another functional entity includes: a lower-layer functional entity reporting the event notification message, an upper-layer functional entity delivering the event notification message, or a same-layer functional entity forwarding the event notification message.
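The four ways an event notification message can arrive can be modelled as below. The numeric layer model (a larger number meaning a higher layer) and all names are assumptions introduced for this sketch; the text only distinguishes local detection from reports by lower-layer, upper-layer, and same-layer entities.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Origin(Enum):
    LOCAL = "detected locally"
    LOWER_LAYER = "reported by a lower-layer functional entity"
    UPPER_LAYER = "delivered by an upper-layer functional entity"
    SAME_LAYER = "forwarded by a same-layer functional entity"


@dataclass(frozen=True)
class EventNotification:
    target_id: str                # identifier of the entity to be repaired
    sender_layer: Optional[int]   # None when the event was detected locally


def classify_origin(event, receiver_layer):
    """Classify how the event reached the receiver (larger layer = higher)."""
    if event.sender_layer is None:
        return Origin.LOCAL
    if event.sender_layer < receiver_layer:
        return Origin.LOWER_LAYER
    if event.sender_layer > receiver_layer:
        return Origin.UPPER_LAYER
    return Origin.SAME_LAYER


event = EventNotification("vm-3", sender_layer=1)
print(classify_origin(event, receiver_layer=2).value)
```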
  • The functional entity determines the functional entity to be repaired from the plurality of functional entities according to the event notification message, queries the correspondence set for the replaceable minimum functional entity and the service recovery policy corresponding to the functional entity to be repaired, and executes that service recovery policy on the replaceable minimum functional entity. In this way, the service can be restored quickly when it is impaired, and a predictable impact on the service can be avoided in advance to keep the service running stably.
  • The correspondence set for determining the replaceable minimum functional entity may include, as shown in Table 1, the correspondence among the functional entity to be repaired, the replaceable minimum functional entity, and the service recovery policy, and may also include the correspondence among the functional entity to be repaired, the replaceable minimum functional entity, the resource redundancy mode, and the service recovery policy.
  • Table 1 is only an example. The correspondence may also include other items, and the content of each item is not limited to that shown in Table 1. This is not specifically limited in the embodiment of the present application.
  • the association relationship set for determining the associated functional entity may include, but is not limited to, the association relationship shown in Table 2, Table 3, and Table 4.
  • Table 2 is the association relationship between the host and the VM, that is, the VM running on a host.
  • Table 3 shows the relationship between the VM and the VNFC, that is, the VNFC running on one VM.
  • Table 4 shows the relationship between the VNF and the VNFC, that is, the VNFC included in a VNF.
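  • The association relationship sets described above can be pictured as simple lookup tables. The following Python sketch is illustrative only: every identifier (`host-1`, `vm-1`, `vnfc-a`, and so on) and both helper functions are hypothetical and are not taken from Tables 2 to 4.

```python
# Hypothetical stand-ins for Tables 2-4; all identifiers are invented for illustration.
HOST_TO_VMS = {"host-1": ["vm-1", "vm-2"], "host-2": ["vm-3"]}       # Table 2: host -> VMs
VM_TO_VNFC = {"vm-1": "vnfc-a", "vm-2": "vnfc-b", "vm-3": "vnfc-c"}  # Table 3: VM -> VNFC
VNF_TO_VNFCS = {"vnf-x": ["vnfc-a", "vnfc-c"], "vnf-y": ["vnfc-b"]}  # Table 4: VNF -> VNFCs


def vnfcs_on_host(host_id):
    """Chain the host->VM and VM->VNFC associations to list the VNFCs on a host."""
    return [VM_TO_VNFC[vm] for vm in HOST_TO_VMS.get(host_id, []) if vm in VM_TO_VNFC]


def vnf_of(vnfc_id):
    """Reverse lookup in the VNF->VNFC association; returns None if no VNF matches."""
    for vnf, vnfcs in VNF_TO_VNFCS.items():
        if vnfc_id in vnfcs:
            return vnf
    return None
```

  • Keeping each association as its own mapping mirrors the three separate tables; an entity that needs only one association (for example, a VNFM that needs only the VM-to-VNFC mapping) could then store just that part.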
  • FIG. 2 is a schematic flowchart of a service recovery method according to an embodiment of the present disclosure.
  • the method may be used in the NFV network shown in FIG. 1 or in other networks.
  • the service recovery method described in this embodiment is applied to a network including a plurality of functional entities, where the plurality of functional entities include a first functional entity and a fourth functional entity.
  • The following takes the NFV network as an example, in which the first functional entity is the virtual machine monitor (hypervisor) in the NFV network and the fourth functional entity is the VNFM in the NFV network. The method includes:
  • the hypervisor acquires a first event notification message, where the first event notification message carries an identifier of the first target functional entity to be repaired.
  • The hypervisor monitors the running status of the network interface cards (NICs). When a NIC fails, the hypervisor can obtain the first event notification message (i.e., the NIC failure message). The first event notification message carries an identifier of the first target functional entity to be repaired (i.e., the faulty NIC), and the hypervisor can determine from the identifier that the first target functional entity to be repaired is the faulty NIC.
  • the hypervisor queries, from a pre-created first correspondence set, a first replaceable minimum functional entity and a first service recovery policy corresponding to the first target functional entity represented by the identifier.
  • the first correspondence set includes a correspondence between the first target function entity, the first replaceable minimum function entity, and the first service recovery policy.
  • By querying the first correspondence set (as shown in Table 1), the hypervisor determines that the first replaceable minimum functional entity corresponding to the faulty NIC is the faulty NIC itself, and that the corresponding first service recovery policy is for the hypervisor to remove the faulty NIC from the binding.
  • the hypervisor performs the first service recovery policy on the first replaceable minimum functional entity.
  • the first correspondence relationship set further includes a correspondence between the first target function entity and a target resource redundancy mode.
  • By querying the first correspondence set (as shown in Table 1), the hypervisor finds that the target resource redundancy mode corresponding to the faulty NIC is binding, and accordingly removes the faulty NIC from the binding to execute the first service recovery policy.
  • For example, NIC1 and NIC2 are in a binding relationship, that is, the resource redundancy mode of both NIC1 and NIC2 is binding. When both NIC1 and NIC2 are normal, service traffic is shared by NIC1 and NIC2. When NIC1 fails, the hypervisor removes NIC1 from the binding to isolate the fault, and the service traffic previously shared by NIC1 and NIC2 is carried by NIC2 alone.
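  • The lookup-then-unbind behaviour of the hypervisor can be sketched as follows. This is a minimal illustration under stated assumptions: the `Correspondence` record, the key `"NIC"`, and the policy name `remove_from_binding` are hypothetical placeholders for one row of Table 1, and no real hypervisor or bonding API is being modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correspondence:
    minimum_entity: str   # replaceable minimum functional entity
    redundancy_mode: str  # resource redundancy mode
    recovery_policy: str  # service recovery policy


# Hypothetical excerpt of the first correspondence set, keyed by entity type.
FIRST_CORRESPONDENCE_SET = {
    "NIC": Correspondence("NIC", "binding", "remove_from_binding"),
}


def recover_nic(bond, faulty_nic):
    """Return the NICs that keep carrying traffic once the faulty NIC is isolated."""
    entry = FIRST_CORRESPONDENCE_SET["NIC"]
    if entry.redundancy_mode != "binding" or entry.recovery_policy != "remove_from_binding":
        raise ValueError("this sketch only handles the binding redundancy mode")
    # Removing the faulty member leaves the remaining members to carry all traffic.
    return [nic for nic in bond if nic != faulty_nic]
```

  • With the binding from the example, `recover_nic(["NIC1", "NIC2"], "NIC1")` leaves only `NIC2` carrying the traffic.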
  • the hypervisor sends a third event notification message to the VNFM, where the VNFM receives the third event notification message, where the third event notification message carries an identifier of the first target functional entity.
  • After removing the faulty NIC from the binding, the hypervisor can quickly recover the service affected by the faulty NIC. At this point, however, the NFV network has not yet returned to the normal state of resource redundancy, that is, the state in which the NICs form a binding, so the hypervisor can notify an upper-layer functional entity (such as the VNFM) for further processing.
  • the hypervisor may send a third event notification message to the VNFM through the VIM, that is, the hypervisor first sends the third event notification message to the VIM, and then the VIM forwards the message to the VNFM, where the third event notification message carries the identifier of the faulty NIC.
  • the VNFM receives the third event notification message.
  • the hypervisor may also send the third event notification message directly to the VNFM.
  • the VNFM determines, according to the identifier of the first target functional entity, a third target functional entity to be repaired.
  • The VNFM determines, according to the identifier of the faulty NIC, the identifier of the VM affected by the faulty NIC. Alternatively, after receiving the third event notification message, the VIM determines the identifier of the VM affected by the faulty NIC according to the identifier of the faulty NIC, and sends the VM identifier to the VNFM. The VNFM then determines the third target functional entity to be repaired (i.e., the VNFC associated with the VM corresponding to the VM identifier) according to the association relationship set for determining associated functional entities (as shown in Table 3).
  • The VNFM queries, from a pre-created third correspondence set, a third replaceable minimum functional entity and a third service recovery policy corresponding to the third target functional entity, and performs the third service recovery policy on the third replaceable minimum functional entity.
  • By querying the third correspondence set (as shown in Table 1), the VNFM determines that the third replaceable minimum functional entity corresponding to the VNFC is the VNFC itself, and that the corresponding third service recovery policy is for the VNFM to perform service recovery processing according to the resource redundancy mode of the VNFC.
  • If the resource redundancy mode of the VNFC is active/standby, the VNFM performs an active/standby switchover. That is, the VNFM delivers configuration to the original standby VNFC (whose VM is not affected by the faulty NIC) so that the original standby VNFC is promoted to the new active VNFC, and delivers configuration to the forward (i.e., upstream) VNFC of the original active VNFC so that the forward VNFC switches the service traffic from the original active VNFC to the new active VNFC, thereby isolating the original active VNFC.
  • If the resource redundancy mode of the VNFC is load balancing, the VNFM isolates the VNFC. That is, the VNFM delivers configuration to the forward VNFC of the VNFC so that the forward VNFC stops distributing traffic to the VNFC, thereby isolating it.
  • The VNFC then needs to be repaired. Because the fault is a NIC failure, the VNFC cannot be repaired in place, so it can be repaired directly by migration: the VIM migrates the VNFC to a VM that is not affected by the faulty NIC. After the VNFC is migrated, the NFV network returns to the normal state of resource redundancy, that is, the binding state between the NICs.
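  • The way the VNFM chooses between a switchover and an isolation, depending on the resource redundancy mode of the VNFC, can be sketched as a small dispatch function. The mode names, the action strings, and the function `vnfm_recovery_actions` are hypothetical; the sketch only lists the configuration steps in order and does not model how configuration is actually delivered.

```python
def vnfm_recovery_actions(vnfc, mode, forward, standby=None):
    """List (recipient, action) pairs the VNFM would deliver, in order.

    vnfc:    the VNFC to be isolated
    mode:    "active_standby" or "load_balancing" (assumed names)
    forward: the forward (upstream) VNFC that steers traffic to `vnfc`
    standby: the standby VNFC, required only in active/standby mode
    """
    if mode == "active_standby":
        if standby is None:
            raise ValueError("active/standby mode needs a standby VNFC")
        return [
            (standby, "promote_to_active"),
            (forward, f"redirect_traffic:{vnfc}->{standby}"),
        ]
    if mode == "load_balancing":
        return [(forward, f"stop_distributing_to:{vnfc}")]
    raise ValueError(f"unknown redundancy mode: {mode}")
```

  • In both branches the faulty VNFC ends up receiving no traffic; the repair itself (for example, migration by the VIM) is a separate step that follows the isolation.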
  • The hypervisor may determine whether the number of redundant resources (i.e., working NICs) on the current host is sufficient for executing the first service recovery policy. If it is, the hypervisor locally executes the first service recovery policy on the faulty NIC. If it is not, the hypervisor can notify an upper-layer functional entity (for example, the VNFM) to perform service recovery processing. For example, there are four NICs (NIC1, NIC2, NIC3, and NIC4) on the current host, bound in pairs: NIC1 is bound to NIC2, and NIC3 is bound to NIC4. If the first event notification message obtained by the hypervisor indicates that NIC1, NIC2, and NIC3 are faulty, the hypervisor cannot execute the corresponding service recovery policy for NIC1, NIC2, and NIC3 because only one working NIC (NIC4) remains. In that case, for the entry in Table 1 whose functional entity to be repaired is the NIC of the host, the replaceable minimum functional entity needs to be upgraded to the VMs on the current host: all VNFCs running on those VMs need to be migrated to other hosts, and the hypervisor can notify the VIM to perform the migration of the VNFCs to other hosts.
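  • The check for sufficient redundant resources can be sketched as below. The exact criterion is not spelled out in the text, so the rule used here (every binding hit by a fault must still contain at least one working NIC) is an assumption made for illustration, as are the function names and return values.

```python
def can_recover_locally(bonds, faulty):
    """Assumed criterion: each binding hit by a fault keeps at least one working NIC."""
    faulty = set(faulty)
    for bond in bonds:
        members = set(bond)
        hit = members & faulty
        working = members - faulty
        if hit and not working:
            return False
    return True


def handle_nic_faults(bonds, faulty):
    """Decide between local recovery and notifying an upper-layer functional entity."""
    return "local_unbind" if can_recover_locally(bonds, faulty) else "notify_upper_layer"
```

  • With the four-NIC example, a single fault on `NIC1` is handled locally, while simultaneous faults on `NIC1`, `NIC2`, and `NIC3` leave the first binding with no working member and trigger escalation.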
  • The correspondence set for determining the replaceable minimum functional entity may be stored in a distributed manner, that is, each functional entity stores a correspondence set locally. The correspondence sets stored by different functional entities may differ from one another, each including only the part of the correspondences in Table 1 that relates to the functional entity itself.
  • For example, the correspondence set stored in the hypervisor may include only the correspondences in Table 1 that relate to the hypervisor. The correspondence set stored in the VNFM may include only the correspondences in Table 1 whose functional entity to be repaired is the VNFC, the VM, or the VNF.
  • The correspondence set stored in the VNFC may include only the correspondences in Table 1 whose functional entity to be repaired is the VNFC thread, and the correspondences of the entities included in the link. The correspondence set stored in the VIM may include only the correspondences in Table 1 whose functional entity to be repaired is the vNIC, the vSwitch, and the like.
  • The first correspondence set, the third correspondence set, and the like in the embodiment of the present application may be the same correspondence set.
  • Alternatively, the correspondence set is stored centrally on a designated functional entity that every functional entity can access globally, and each functional entity queries the replaceable minimum functional entity and the corresponding service recovery policy by accessing the designated functional entity.
  • In the embodiment of the present application, when a NIC fails, the hypervisor queries the replaceable minimum functional entity corresponding to the faulty NIC (i.e., the faulty NIC itself) and the corresponding service recovery policy, and removes the faulty NIC from the binding according to that policy, so that the service affected by the faulty NIC is recovered quickly. Further, the hypervisor can notify the VNFM to perform service recovery processing on the VNFC affected by the faulty NIC. This achieves layered, fast, and accurate service recovery, effectively shortens the duration of the service fault, and quickly restores the normal state of NIC resource redundancy.
  • FIG. 3 is a schematic flowchart diagram of another service recovery method according to an embodiment of the present application.
  • the method can be used in the NFV network shown in FIG. 1, and can also be used in other networks.
  • the service recovery method described in this embodiment is applied to a network including a plurality of functional entities, where the plurality of functional entities include a first functional entity and a fourth functional entity.
  • The following takes the NFV network as an example, in which the first functional entity is a VNFC in the NFV network and the fourth functional entity is the VNFM in the NFV network. The method includes:
  • the VNFC acquires a first event notification message, where the first event notification message carries an identifier of the first target functional entity to be repaired.
  • The VNFC can monitor the running status of each VNFC thread through a reliability monitoring program. When a VNFC thread fails, the VNFC can obtain a first event notification message (i.e., a VNFC thread failure message). The first event notification message carries an identifier of the first target functional entity to be repaired (i.e., the faulty VNFC thread), and the VNFC can determine from the identifier that the first target functional entity to be repaired is the faulty VNFC thread.
  • the VNFC queries, from a pre-created first correspondence set, a first replaceable minimum functional entity and a first service recovery policy corresponding to the first target functional entity represented by the identifier.
  • the first correspondence set includes a correspondence between the first target function entity, the first replaceable minimum function entity, and the first service recovery policy.
  • By querying the first correspondence set (as shown in Table 1), the reliability module in the VNFC may determine that the first replaceable minimum functional entity corresponding to the faulty VNFC thread is the faulty VNFC thread itself, and that the corresponding first service recovery policy is to replace the faulty VNFC thread with a VNFC thread in the Active state from the thread pool of the VNFC.
  • the VNFC performs the first service recovery policy on the first replaceable minimum functional entity.
  • the first correspondence relationship set further includes a correspondence between the first target function entity and a target resource redundancy mode.
  • By querying the first correspondence set (as shown in Table 1), the VNFC finds that the target resource redundancy mode corresponding to the faulty VNFC thread is a thread pool, and accordingly replaces the faulty VNFC thread with a VNFC thread in the Active state from the thread pool.
  • The VNFC determines whether the Active VNFC thread normally processes the service of the faulty VNFC thread, that is, whether the service is successfully restored. If so, the process ends. If not, that is, the service is still not restored after the faulty VNFC thread is replaced with the Active VNFC thread from the thread pool, the VNFC can notify an upper-layer functional entity (for example, the EM or the VNFM) for further processing, which may be implemented by performing the following steps 304-306 after the above steps 301-303.
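  • The replace-then-verify behaviour of the VNFC can be sketched as follows. The function `replace_faulty_thread`, its parameters, and the `service_ok` callback are hypothetical; thread names stand in for real threads, and the health check is supplied by the caller because the text does not say how the VNFC verifies that the service is restored.

```python
def replace_faulty_thread(active, pool, faulty, service_ok):
    """Swap the faulty thread for a spare from the pool and report the outcome.

    active:     list of thread names currently serving traffic (updated in place)
    pool:       list of spare Active-state thread names (one spare is consumed)
    faulty:     name of the faulty thread, which must be present in `active`
    service_ok: callable(replacement_name) -> bool, the caller's health check
    """
    if not pool:
        # No spare is available, so local recovery is impossible.
        return "escalate"
    replacement = pool.pop(0)
    active[active.index(faulty)] = replacement
    # If the service is still down after the swap, hand over to the upper layer.
    return "recovered" if service_ok(replacement) else "escalate"
```

  • An `escalate` result corresponds to notifying the upper-layer functional entity; after a successful swap, a fresh thread could be added back to `pool` to restore the redundancy of the thread pool.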
  • The VNFC sends a third event notification message to the VNFM, and the VNFM receives the third event notification message, where the third event notification message carries an identifier of the first target functional entity.
  • The VNFC may send the third event notification message to the VNFM through the EM, that is, the VNFC first sends the third event notification message to the EM, and the EM forwards it to the VNFM. The third event notification message carries the identifier of the VNFC corresponding to the faulty VNFC thread, and the VNFM receives the third event notification message.
  • the VNFM determines, according to the identifier of the first target functional entity, a third target functional entity to be repaired.
  • The VNFM determines, according to the identifier of the VNFC, the third target functional entity to be repaired, that is, the VNFC corresponding to the faulty VNFC thread. Alternatively, after receiving the third event notification message, the EM determines the third target functional entity to be repaired (i.e., the VNFC corresponding to the faulty VNFC thread) and sends the identifier of the third target functional entity to the VNFM, so that the VNFM learns that the third target functional entity to be repaired is the VNFC corresponding to the faulty VNFC thread.
  • The VNFM queries, from a pre-created third correspondence set, a third replaceable minimum functional entity and a third service recovery policy corresponding to the third target functional entity, and performs the third service recovery policy on the third replaceable minimum functional entity.
  • By querying the third correspondence set (as shown in Table 1), the VNFM determines that the third replaceable minimum functional entity corresponding to the VNFC is the VNFC itself, and that the corresponding third service recovery policy is for the VNFM to perform service recovery processing according to the resource redundancy mode of the VNFC.
  • The resource redundancy mode of the VNFC may be active/standby or load balancing.
  • For the specific manner in which the VNFM performs service recovery processing on the VNFC according to the resource redundancy mode of the VNFC, refer to the description of step 206 in the foregoing method embodiment; details are not repeated here.
  • The VNFC then needs to be repaired: the VNFC may be restarted for repair, or the VIM may be notified to migrate the VNFC to repair it, which can reduce resource waste.
  • The EM may also locally perform the service recovery processing on the VNFC according to the resource redundancy mode of the VNFC and, after the service recovery processing is completed, notify the VNFM to repair the VNFC. This implements finer-grained layered service recovery and further reduces the impact on the service.
  • a VNFC thread with a corresponding function can be added to the thread pool, thereby ensuring the normal state of the resource redundancy of the NFV network, that is, the thread pool redundancy of the VNFC.
  • In the embodiment of the present application, when the VNFC detects that an internal VNFC thread has failed, the VNFC queries the replaceable minimum functional entity corresponding to the faulty VNFC thread (i.e., the faulty VNFC thread itself) and the corresponding service recovery policy, and replaces the faulty VNFC thread with an Active VNFC thread from the thread pool according to that policy, so that the service affected by the faulty VNFC thread is recovered quickly.
  • Further, the VNFC may notify the VNFM to perform service recovery processing on the VNFC corresponding to the faulty VNFC thread. This achieves layered, fast, and accurate service recovery, avoids both the long service fault caused by directly restarting the VNFC thread and the wide service impact caused by directly restarting the VNFC, and restores the normal state of resource redundancy of the VNFC threads.
  • FIG. 4 is a schematic flowchart of still another service recovery method according to an embodiment of the present application.
  • the method may be used in the NFV network shown in FIG. 1 or in other networks.
  • the service recovery method described in this embodiment is applied to a network including a plurality of functional entities, where the plurality of functional entities include a first functional entity.
  • the following takes the NFV network as an example to illustrate that the first functional entity is specifically a VNFM in the NFV network, and the method includes:
  • The NFVI sends a fan failure warning, and the OSS/BSS receives the fan failure warning.
  • When the NFVI detects that the CPU temperature keeps rising, it can send a fan failure warning to the OSS/BSS through the VIM and the NFVO, and the OSS/BSS receives the fan failure warning.
  • Specifically, the NFVI sends the fan failure warning to the VIM. The VIM determines, according to the correspondence set shown in Table 1, that the replaceable minimum functional entity corresponding to the CPU is the VMs, and determines, according to the association relationship set shown in Table 2, the list of VMs running on the host. The VIM sends the host ID and the VM list to the NFVO, and the NFVO forwards them to the OSS/BSS.
  • the OSS/BSS determines that the host corresponding to the host ID needs to be maintained.
  • the OSS/BSS sends a first event notification message, where the VNFM receives the first event notification message, where the first event notification message carries an identifier of a first target functional entity to be repaired.
  • Because the host needs to be maintained, the services on the host need to be migrated. The OSS/BSS sends a first event notification message to the VNFM, and the VNFM receives the first event notification message, which carries the identifier of the first target functional entity to be repaired (i.e., the VM).
  • An upper-layer functional entity (OSS/BSS, NFVO) may send an event notification message to the VNFM for service recovery processing. Alternatively, when operation and maintenance personnel need to perform an operation and maintenance operation, they may send an event notification message to the VNFM through the OSS/BSS for service recovery processing.
  • the VNFM queries, from a pre-created first correspondence set, a first replaceable minimum functional entity and a first service recovery policy corresponding to the first target functional entity represented by the identifier.
  • the first correspondence set includes a correspondence between the first target function entity, the first replaceable minimum function entity, and the first service recovery policy.
  • The VNFM may first determine, according to the association relationship set for determining associated functional entities (as shown in Table 3), the functional entity associated with the first target functional entity to be repaired, that is, the VNFC associated with the VM corresponding to the VM identifier. By querying the first correspondence set (as shown in Table 1), the VNFM then determines that the first replaceable minimum functional entity corresponding to the VNFC is the VNFC itself, and that the corresponding first service recovery policy is for the VNFM to perform service recovery processing according to the resource redundancy mode of the VNFC.
  • the VNFM performs the first service recovery policy on the first replaceable minimum functional entity.
  • The resource redundancy mode of the VNFC may be active/standby or load balancing.
  • For the specific manner in which the VNFM performs service recovery processing on the VNFC according to the resource redundancy mode of the VNFC, refer to the description of step 206 in the foregoing method embodiment; details are not repeated here.
  • The VNFC then needs to be repaired. Because the fan of the host is faulty, the services on the host need to be migrated, so the VNFC is repaired directly by migration: the VIM migrates the VNFC to a VM on another host.
  • the VNFM sends a notification that the VNFC repair is successful to the NFVO.
  • If the NFVO receives, from every VNFM, a notification that all of its VNFCs were repaired successfully, the NFVO sends a notification of successful service recovery to the OSS/BSS. Otherwise, the NFVO sends a notification of service recovery failure to the OSS/BSS.
  • When the NFVO sends a notification of successful service recovery, the OSS/BSS can dispatch a work order to maintain the fan, including repairing or replacing the fan. When the NFVO sends a notification of service recovery failure, the OSS/BSS can generate an alarm message for manual intervention.
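  • The all-or-nothing aggregation performed by the NFVO, and the follow-up action of the OSS/BSS, can be sketched as below. The report structure (a mapping from VNFM identifier to per-VNFC repair results) and all names and strings are hypothetical; the sketch also assumes that an empty report counts as a failure, which the text does not specify.

```python
def nfvo_overall_result(vnfm_reports):
    """Aggregate repair results; vnfm_reports maps VNFM id -> {VNFC id: repaired?}."""
    all_repaired = bool(vnfm_reports) and all(
        bool(vnfcs) and all(vnfcs.values()) for vnfcs in vnfm_reports.values()
    )
    return "service_recovery_success" if all_repaired else "service_recovery_failure"


def oss_bss_action(result):
    """Map the NFVO notification to the follow-up action taken by the OSS/BSS."""
    if result == "service_recovery_success":
        return "dispatch_fan_work_order"
    return "raise_alarm_for_manual_intervention"
```

  • A single unrepaired VNFC under any VNFM is enough to turn the overall result into a failure, which matches the "otherwise" branch in the description.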
  • In the embodiment of the present application, an early warning may be issued in advance, and the upper-layer functional entity notifies the VNFM to perform pre-processing before hardware maintenance (i.e., a service migration operation). Recovery processing such as service migration can thus be carried out before the fault affects the service, effectively avoiding the impact of the hardware fault on the service.
  • FIG. 5 is a schematic structural diagram of a service recovery apparatus according to an embodiment of the present application.
  • the service recovery device described in this embodiment includes:
  • the obtaining module 501 is configured to obtain a first event notification message, where the first event notification message carries an identifier of the first target functional entity to be repaired.
  • The querying module 502 is configured to query, from a pre-created first correspondence set, a first replaceable minimum functional entity and a first service recovery policy corresponding to the first target functional entity represented by the identifier, where the first correspondence set includes a correspondence among the first target functional entity, the first replaceable minimum functional entity, and the first service recovery policy.
  • the processing module 503 is configured to perform the first service recovery policy on the first replaceable minimum function entity.
  • The first replaceable minimum functional entity is the first target functional entity, where the first target functional entity can be replaced by the network.
  • Alternatively, the first replaceable minimum functional entity is a second functional entity that is associated with the first target functional entity in a pre-created association relationship set, where the first target functional entity cannot be replaced by the network, and the association relationship set includes an association relationship between the first target functional entity and the second functional entity.
  • the processing module 503 includes:
  • the query unit 5030 is configured to query, from the first correspondence relationship set, a target resource redundancy manner corresponding to the first target function entity.
  • the executing unit 5031 is configured to perform the first service recovery policy on the first replaceable minimum functional entity according to the target resource redundancy manner.
  • the first correspondence relationship set further includes a correspondence between the first target function entity and the target resource redundancy mode.
  • processing module 503 is specifically configured to:
  • The processing module 503 is specifically configured to: if the first service recovery policy cannot be executed, send a second event notification message to the third functional entity, where the second event notification message carries the identifier of the first target functional entity, so that the third functional entity determines, according to the identifier of the first target functional entity, a second target functional entity to be repaired, queries, from a pre-created second correspondence set, a second replaceable minimum functional entity and a second service recovery policy corresponding to the second target functional entity, and performs the second service recovery policy on the second replaceable minimum functional entity.
  • the apparatus further includes:
  • The sending module 504 is configured to send a third event notification message to the fourth functional entity, where the third event notification message carries an identifier of the first target functional entity, so that the fourth functional entity determines, according to the identifier of the first target functional entity, a third target functional entity to be repaired, queries, from a pre-created third correspondence set, a third replaceable minimum functional entity and a third service recovery policy corresponding to the third target functional entity, and performs the third service recovery policy on the third replaceable minimum functional entity.
  • the sending module 504 is specifically configured to send a third event notification message to the fourth functional entity if the service of the first replaceable minimum functional entity is not successfully restored.
  • The apparatus includes one or more of the NFV infrastructure (NFVI), the virtual network function manager (VNFM), and the VNF component (VNFC)/VNF in a network function virtualization (NFV) network.
  • In the embodiment of the present application, the obtaining module 501 obtains a first event notification message that carries the identifier of the first target functional entity to be repaired. The querying module 502 queries, from the pre-created first correspondence set, the first replaceable minimum functional entity and the first service recovery policy corresponding to the first target functional entity represented by the identifier, where the first correspondence set includes the correspondence among the first target functional entity, the first replaceable minimum functional entity, and the first service recovery policy. The processing module 503 then performs the first service recovery policy on the first replaceable minimum functional entity, so that service recovery is achieved quickly and accurately and the duration of the service fault is effectively shortened.
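  • The division of work among the obtaining, querying, and processing modules can be sketched as a small class. This is a rough illustration only: the class name, the message field `target_id`, the policy name `unbind`, and the idea of passing policy executors as callables are all assumptions, not details given in the description.

```python
class ServiceRecoveryApparatus:
    """Toy model of modules 501-503; all field and policy names are hypothetical."""

    def __init__(self, correspondence_set, executors):
        # correspondence_set: target id -> (replaceable minimum entity, policy name)
        # executors: policy name -> callable(entity) that carries out the policy
        self.correspondence_set = correspondence_set
        self.executors = executors

    def obtain(self, message):
        """Obtaining module 501: read the identifier of the entity to be repaired."""
        return message["target_id"]

    def query(self, target_id):
        """Querying module 502: look up the minimum entity and the recovery policy."""
        return self.correspondence_set[target_id]

    def process(self, message):
        """Processing module 503: execute the policy on the minimum entity."""
        entity, policy = self.query(self.obtain(message))
        return self.executors[policy](entity)
```

  • For instance, an apparatus built with `{"NIC1": ("NIC1", "unbind")}` and an `unbind` executor would, on a message `{"target_id": "NIC1"}`, invoke that executor on `NIC1`.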
  • FIG. 6 is a schematic structural diagram of a service recovery apparatus according to an embodiment of the present application.
  • the service recovery apparatus described in this embodiment includes: a processor 601, a transceiver 602, and a memory 603.
  • the processor 601, the transceiver 602, and the memory 603 can be connected by using a bus or other means.
  • the embodiment of the present application is exemplified by a bus connection.
  • the processor 601 (or Central Processing Unit (CPU)) is a computing core and a control core of the service recovery device.
  • The transceiver 602 may optionally include a standard wired interface and a wireless interface (such as Wi-Fi or a mobile communication interface), and is controlled by the processor 601 to transmit and receive data.
  • The memory 603 is the storage device of the service recovery apparatus and is used to store programs and data. It can be understood that the memory 603 here may be a high-speed RAM, or may be a non-volatile memory, such as at least one disk memory; optionally, it may also be at least one storage device located away from the foregoing processor 601.
  • The memory 603 provides a storage space that stores the operating system and the executable program code of the service recovery apparatus. The operating system may include, but is not limited to, a Windows system or a Linux system, which is not limited in this application.
  • the processor 601 performs the following operations by running the executable program code in the memory 603:
  • the processor 601 is configured to obtain a first event notification message, where the first event notification message carries an identifier of the first target functional entity to be repaired.
  • the processor 601 is further configured to query, from a pre-created first correspondence set, a first replaceable minimum functional entity and a first service recovery policy that correspond to the first target functional entity represented by the identifier.
  • the first correspondence set includes a correspondence among the first target functional entity, the first replaceable minimum functional entity, and the first service recovery policy.
  • the processor 601 is further configured to perform the first service recovery policy on the first replaceable minimum functional entity.
  • the first replaceable minimum functional entity is the first target functional entity, and the first target functional entity is replaceable by the network.
  • alternatively, the first replaceable minimum functional entity is a second functional entity associated with the first target functional entity in a pre-created association relationship set, and the first target functional entity is not replaceable by the network.
  • the association relationship set includes an association relationship between the first target functional entity and the second functional entity.
  • the processor 601 is specifically configured to query, from the first correspondence set, a target resource redundancy mode corresponding to the first target functional entity, and to perform the first service recovery policy on the first replaceable minimum functional entity according to the target resource redundancy mode.
  • the first correspondence set further includes a correspondence between the first target functional entity and the target resource redundancy mode.
  • the processor 601 is specifically configured to perform the first service recovery policy on the first replaceable minimum functional entity if the first service recovery policy can be performed.
  • the transceiver 602 is configured to send a second event notification message to a third functional entity if the processor 601 is unable to perform the first service recovery policy, where the second event notification message carries the identifier of the first target functional entity, so that the third functional entity determines, according to the identifier of the first target functional entity, a second target functional entity to be repaired, queries, from a pre-created second correspondence set, a second replaceable minimum functional entity and a second service recovery policy corresponding to the second target functional entity, and performs the second service recovery policy on the second replaceable minimum functional entity.
  • the transceiver 602 is further configured to send a third event notification message to a fourth functional entity, where the third event notification message carries the identifier of the first target functional entity, so that the fourth functional entity determines, according to the identifier of the first target functional entity, a third target functional entity to be repaired, queries, from a pre-created third correspondence set, a third replaceable minimum functional entity and a third service recovery policy corresponding to the third target functional entity, and performs the third service recovery policy on the third replaceable minimum functional entity.
  • the transceiver 602 is specifically configured to send a third event notification message to the fourth functional entity if the service of the first replaceable minimum functional entity is not successfully restored.
  • the processor 601, the transceiver 602, and the memory 603 described in the embodiment of the present application may carry out the implementations described in the service recovery method flows of FIG. 2, FIG. 3, or FIG. 4, and may also carry out the implementations described for the service recovery apparatus of FIG. 5; details are not repeated here.
  • the processor 601 obtains a first event notification message that carries an identifier of the first target functional entity to be repaired, queries, from the pre-created first correspondence set, the first replaceable minimum functional entity and the first service recovery policy that correspond to the first target functional entity represented by the identifier, where the first correspondence set includes the correspondence among the first target functional entity, the first replaceable minimum functional entity, and the first service recovery policy, and performs the first service recovery policy on the first replaceable minimum functional entity, so that the service can be recovered quickly and accurately and the duration of the service fault is effectively shortened.
  • the computer program product includes one or more computer instructions.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wirelessly (for example, infrared or microwave).
  • the computer readable storage medium can be any available media that can be accessed by a computer or a data storage device such as a server, data center, or the like that includes one or more available media.
  • the usable medium may be a magnetic medium (such as a floppy disk, a hard disk, a magnetic tape), an optical medium (such as a DVD), or a semiconductor medium (such as a Solid State Disk (SSD)) or the like.

Abstract

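Embodiments of the present application disclose a service recovery method and apparatus. The method includes: a first functional entity obtains a first event notification message, where the first event notification message carries an identifier of a first target functional entity to be repaired; the first functional entity queries, from a pre-created first correspondence set, a first replaceable minimum functional entity and a first service recovery policy that correspond to the first target functional entity represented by the identifier, where the first correspondence set includes a correspondence among the first target functional entity, the first replaceable minimum functional entity, and the first service recovery policy; and the first functional entity performs the first service recovery policy on the first replaceable minimum functional entity. With the embodiments of the present application, services can be recovered quickly and accurately, and the duration of a service fault is effectively shortened.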

Description

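Service recovery method and apparatus
This application claims priority to Chinese Patent Application No. 201710057297.9, filed with the Chinese Patent Office on January 24, 2017 and entitled "Service recovery method and apparatus", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of communications technologies, and in particular, to a service recovery method and apparatus.
Background
A traditional telecommunications network consists of a wide variety of dedicated hardware devices, and different network services require different hardware. As networks grow rapidly, telecommunications networks become ever larger and more complex, which causes many problems: new services are slow to develop and launch, operation and maintenance are complex and costly, and resource utilization is low. Network Function Virtualization (NFV) technology addresses these problems well. NFV can be understood as migrating the functions of the network elements used in a telecommunications network system from today's dedicated hardware platforms to general-purpose Commercial Off The Shelf (COTS) servers, turning each network element into an independent application that can be flexibly deployed on a unified infrastructure platform built from standards-based servers, storage devices, switches, and other equipment. Virtualization technology converts general-purpose COTS computing, storage, and network hardware into the virtual resources needed by upper-layer applications such as Virtualised Network Functions (VNFs), thereby decoupling applications from hardware.
At present, when a service fault occurs in an NFV network, the fault information is generally captured by the functional entity in the layer above the faulty functional entity (for example, a VNF or the NFV Infrastructure (NFVI)), or the faulty functional entity reports the fault information to the upper-layer functional entity. If the upper-layer functional entity cannot identify the fault information, it must report the information further up, and in the end operation and maintenance personnel determine the specific cause by analyzing the fault information, possibly also consulting logs, operation history, and other information, before taking measures to recover the faulty service. However, manual handling usually makes the time from fault to recovery long, and manual diagnosis may misjudge the cause, which further prolongs the service fault. The foregoing service recovery solution is therefore slow and has low accuracy.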
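Summary
Embodiments of the present application provide a service recovery method and apparatus that can recover services quickly and accurately and effectively shorten the duration of a service fault.
In one aspect, an embodiment of the present application provides a service recovery method applied to a network that includes multiple functional entities, the multiple functional entities including a first functional entity. The method includes: the first functional entity obtains a first event notification message, which may be sent by another functional entity or detected locally by the first functional entity, and which carries an identifier of a first target functional entity to be repaired; the first functional entity determines, by querying a first correspondence set, a first replaceable minimum functional entity and a first service recovery policy corresponding to the first target functional entity, and performs the first service recovery policy on the first replaceable minimum functional entity, so that the service can be recovered quickly and accurately and the duration of the service fault is effectively shortened.
Optionally, if the first target functional entity can be automatically replaced by the network, the first replaceable minimum functional entity is the first target functional entity itself.
Alternatively,
if the first target functional entity cannot be automatically replaced by the network, the first replaceable minimum functional entity is a second functional entity associated with the first target functional entity in a pre-created association relationship set.
Optionally, the first functional entity may query, from the first correspondence set, the resource redundancy mode corresponding to the first target functional entity, and perform the first service recovery policy on the first replaceable minimum functional entity according to that resource redundancy mode.
Optionally, the resource redundancy mode may be any one of active/standby redundancy, load-sharing redundancy, thread-pool redundancy, and network-path redundancy.
Optionally, the first functional entity determines whether it can itself perform the first service recovery policy on the first replaceable minimum functional entity, for example according to the remaining amount of resources. If the remaining resources meet the requirements of the first service recovery policy, the first functional entity determines that it can perform the policy and performs it on the first replaceable minimum functional entity.
Optionally, the multiple functional entities further include a third functional entity. The first functional entity determines whether it can itself perform the first service recovery policy on the first replaceable minimum functional entity, for example according to the remaining amount of resources. If the remaining resources do not meet the requirements of the first service recovery policy, the first functional entity determines that it cannot perform the policy and may send a second event notification message to the third functional entity. The second event notification message carries the identifier of the first target functional entity, so that the third functional entity can determine, according to that identifier, a second target functional entity to be repaired, query, from a pre-created second correspondence set, a second replaceable minimum functional entity and a second service recovery policy corresponding to the second target functional entity, and perform the second service recovery policy on the second replaceable minimum functional entity. In this way, when the first functional entity cannot recover the service, other functional entities are notified in time to do so, which effectively shortens the duration of the service fault.
Optionally, the multiple functional entities further include a fourth functional entity. The first functional entity may also send a third event notification message to the fourth functional entity. The third event notification message carries the identifier of the first target functional entity, so that the fourth functional entity can determine, according to that identifier, a third target functional entity to be repaired, query, from a pre-created third correspondence set, a third replaceable minimum functional entity and a third service recovery policy corresponding to the third target functional entity, and perform the third service recovery policy on the third replaceable minimum functional entity. Thus, after the local service of the first functional entity has been recovered, other functional entities can be notified to recover the services affected by the first target functional entity and to restore the normal state of resource redundancy.
Optionally, after performing the first service recovery policy, the first functional entity may determine whether the service of the first replaceable minimum functional entity has been recovered successfully. If not, the first functional entity performs the step of sending the third event notification message to the fourth functional entity, so that other functional entities are notified in time when the first functional entity fails to recover the service, which effectively shortens the duration of the service fault.
In another aspect, an embodiment of the present application provides a service recovery apparatus applied to a network that includes multiple functional entities, the multiple functional entities including the service recovery apparatus. The apparatus includes an obtaining module, a query module, a processing module, and a sending module, which are configured to perform the methods described in the foregoing aspects.
In yet another aspect, an embodiment of the present application provides a service recovery apparatus that includes a processor, a transceiver, and a memory connected through a bus. The memory stores executable program code, the transceiver is controlled by the processor to send and receive messages, and the processor is configured to invoke the executable program code to perform the methods described in the foregoing aspects.
In yet another aspect, an embodiment of the present application provides a computer-readable storage medium that stores instructions which, when run on a computer, cause the computer to perform the methods described in the foregoing aspects.
In yet another aspect, an embodiment of the present application provides a computer program product containing instructions which, when run on a computer, cause the computer to perform the methods described in the foregoing aspects.
In the embodiments of the present application, the first functional entity obtains a first event notification message that carries an identifier of a first target functional entity to be repaired, queries, from a pre-created first correspondence set, the first replaceable minimum functional entity and the first service recovery policy that correspond to the first target functional entity represented by the identifier, and performs the first service recovery policy on the first replaceable minimum functional entity, so that the service can be recovered quickly and accurately and the duration of the service fault is effectively shortened.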
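Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application or in the background more clearly, the accompanying drawings used in the embodiments or the background are described below.
FIG. 1 is a schematic architectural diagram of an NFV network according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of a service recovery method according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of another service recovery method according to an embodiment of the present application;
FIG. 4 is a schematic flowchart of yet another service recovery method according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a service recovery apparatus according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a service recovery apparatus according to an embodiment of the present application.
Detailed Description
The embodiments of the present application are described below with reference to the accompanying drawings.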
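Refer to FIG. 1, a schematic architectural diagram of an NFV network according to an embodiment of the present application. The NFV network described in this embodiment may be a data center network, a service provider network, a Local Area Network (LAN), or the like, and may include the following functional entities: NFV Management and Orchestration (NFV-MANO) 128, NFV Infrastructure (NFVI) 130, multiple virtualised network functions (VNFs) 108, and multiple Element Management Systems (EMSs) 122.
In some feasible implementations, the NFV network described in this embodiment may additionally include the following functional entities: Service, VNF and Infrastructure Description 126, and one or more Operational Support Systems/Business Support Systems (OSS/BSS) 124.
It should be noted that the NFV network architecture shown in FIG. 1 is only an example; NFV networks with other architectures are also possible, and the embodiments of the present application impose no specific limitation. Specifically: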
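NFV-MANO 128 may include an NFV Orchestrator (NFVO) 102, one or more VNF Managers (VNFMs) 104, and one or more Virtualized Infrastructure Managers (VIMs) 106. NFVI 130 may include a hardware resource layer made up of computing hardware 112, storage hardware 114, and network hardware 116; a virtualization layer; and a virtual resource layer made up of virtual computing 110 (for example, Virtual Machines (VMs)), virtual storage 118, and virtual network 120. The computing hardware 112 may be a dedicated processor or a general-purpose processor that provides processing and computing functions. The storage hardware 114 provides storage capacity, which may come from the storage hardware 114 itself (for example, the local memory of a server) or be provided over a network (for example, a server connected to a network storage device). The network hardware 116 may be switches, routers, and/or other network devices; it enables communication among multiple devices connected wirelessly or by wire. The virtualization layer in NFVI 130 abstracts the hardware resources of the hardware resource layer, decouples the VNFs 108 from the physical layer to which the hardware resources belong, and provides virtual resources to the VNFs 108. The virtual resources may include virtual computing 110, virtual storage 118, and virtual network 120. Virtual computing 110 and virtual storage 118 may be provided to a VNF 108 in the form of virtual machines or other virtual containers; for example, a VNF 108 may be deployed on a virtual machine or another virtual container. The virtualization layer forms the virtual network 120 by abstracting the network hardware 116. The virtual network 120, for example virtual switches (vSwitches), enables communication among multiple virtual machines or among other types of virtual containers that host VNFs. Network hardware can be virtualized using technologies such as Virtual LAN (VLAN), Virtual Private LAN Service (VPLS), Virtual eXtensible Local Area Network (VxLAN), or Network Virtualization using Generic Routing Encapsulation (NVGRE). The OSS/BSS 124 is mainly oriented to telecommunications service operators and provides integrated network management and service operation functions, including network management (for example, fault monitoring and network information collection), billing management, and customer service management. The Service, VNF and Infrastructure Description 126 is described in detail in the ETSI GS NFV 002 v1.1.1 standard and is not described again here.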
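NFV-MANO 128 may be used to monitor and manage the VNFs 108 and NFVI 130. The NFVO 102 may communicate with one or more VNFMs 104 to handle resource-related requests, send configuration information to the VNFMs 104, and collect status information of the VNFs 108. The NFVO 102 may also communicate with the VIM 106 to allocate resources and/or to reserve and exchange configuration information and status information of virtualized hardware resources. A VNFM 104 may manage one or more VNFs 108 and perform various management functions, such as initializing, updating, querying, and/or terminating a VNF 108. The VIM 106 may control and manage the interaction between the VNFs 108 and the computing hardware 112, storage hardware 114, network hardware 116, virtual computing 110, virtual storage 118, and virtual network 120. For example, the VIM 106 may allocate resources to a VNF 108. The VNFM 104 and the VIM 106 may communicate with each other to exchange virtualized hardware resource configuration and status information.
It should be noted that NFV-MANO may be deployed on a general-purpose physical network device or physical server, or on a VM; the embodiments of the present application impose no limitation.
NFVI 130 comprises hardware and software that together establish a virtualized environment in which VNFs 108 are deployed, managed, and executed. In other words, the hardware resource layer and the virtual resource layer provide virtual resources, such as virtual machines and/or other forms of virtual containers, to the VNFs 108.
The VNFM 104 may communicate with the VNFs 108 and the EMS 122 to perform VNF life cycle management and exchange configuration/status information. A VNF 108 is the virtualization of at least one network function that was previously provided by a physical network device. The EMS 122 may manage one or more VNFs 108. Multiple VNFs 108 together form a Network Service (NS) offered to users. One or more VNF Components (VNFCs) may run on each VNF 108, and a VNF 108 is equivalent to the entity of a network node, that is, a Network Element (NE).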
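In the embodiments of the present application, a correspondence set for determining the replaceable minimum functional entity is created. The correspondence set includes the correspondence among a functional entity to be repaired, its replaceable minimum functional entity, and a service recovery policy. By querying this set, the replaceable minimum functional entity and the service recovery policy for a functional entity to be repaired can be obtained quickly. The correspondence set may be configured when the NFV network is built, or at any other feasible time; the embodiments of the present application impose no specific limitation.
A replaceable minimum functional entity can be automatically replaced by the NFV network and has the smallest impact when replaced. When a fault occurs, replacing the replaceable minimum functional entity is enough to recover the service quickly. If the functional entity to be repaired can be automatically replaced by the NFV network, the replaceable minimum functional entity is the functional entity to be repaired itself. If it cannot, the replaceable minimum functional entity is a functional entity associated with the functional entity to be repaired, including its upper-layer functional entities. For example, suppose a Central Processing Unit (CPU) fails, so that the CPU is the functional entity to be repaired. Because neither the CPU nor the host/server that contains it can be automatically replaced by the NFV network, the replaceable minimum functional entity in this case is the VM or other virtual container running on that host/server. The service recovery policy is the concrete procedure used to recover the service of the replaceable minimum functional entity.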
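The resolution rule described above can be illustrated with a minimal Python sketch. The `Entity` class, the `AUTO_REPLACEABLE_KINDS` set, and the `ASSOCIATIONS` data are hypothetical names and values chosen for illustration; the publication does not prescribe a data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    kind: str
    name: str


# Hypothetical data: entity kinds the network can replace automatically.
AUTO_REPLACEABLE_KINDS = {"NIC", "VM", "VNFC"}

# Hypothetical association set: entity -> associated upper-layer entities.
ASSOCIATIONS = {
    Entity("CPU", "HOST1-CPU0"): [Entity("HOST", "HOST1")],
    Entity("HOST", "HOST1"): [Entity("VM", "VM-11"), Entity("VM", "VM-12")],
}


def replaceable_minimum(entity):
    """Return the replaceable minimum functional entities for `entity`.

    A replaceable entity resolves to itself; otherwise the search climbs to
    the associated upper-layer entities until replaceable ones are found.
    """
    if entity.kind in AUTO_REPLACEABLE_KINDS:
        return [entity]
    found = []
    for upper in ASSOCIATIONS.get(entity, []):
        found.extend(replaceable_minimum(upper))
    return found
```

With this data, a faulty CPU on HOST1 resolves to the VMs running on that host, which mirrors the CPU example in the text, while a faulty NIC resolves to itself.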
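Further, the correspondence set for determining the replaceable minimum functional entity may also include the correspondence between a functional entity to be repaired and a resource redundancy mode. The resource redundancy mode is the form in which redundant resources are set up for each functional entity, and may be active/standby redundancy, load-sharing redundancy, thread-pool redundancy, network-path redundancy, or the like. The correspondence set is queried to obtain the resource redundancy mode of the functional entity to be repaired, and the service recovery policy is performed on the replaceable minimum functional entity according to that mode.
In the embodiments of the present application, an association relationship set for determining associated functional entities is also created. It includes the associations among functional entities, such as the association between hosts and VMs, between VMs and VNFCs, and between VNFs and VNFCs. The host-VM association identifies the VMs running on a host; one host may run one or more VMs. The VM-VNFC association identifies the VNFC running on a VM; one VM may run one VNFC. The VNF-VNFC association identifies the VNFCs that a VNF contains; one VNF may contain one or more VNFCs. The association relationship set may be generated when the NFV network is built and updated while the network runs, so that it always reflects the latest associations among functional entities. It may also be generated at any other feasible time; the embodiments of the present application impose no specific limitation.
It should be noted that the correspondence set and the association relationship set may be stored in a distributed manner on the individual functional entities, or centrally on a designated functional entity that all functional entities can access globally; the embodiments of the present application impose no limitation.
In a specific implementation, an event notification message may be generated when a service in the NFV network has been affected, when a service is predicted to be affected, or when operation and maintenance personnel issue an instruction. A functional entity in the NFV network may detect the event notification message locally or receive it from another functional entity: a lower-layer functional entity may report it, an upper-layer functional entity may deliver it, or a same-layer functional entity may forward it. After obtaining the event notification message, the functional entity determines the functional entity to be repaired from among the multiple functional entities, queries the correspondence set for the corresponding replaceable minimum functional entity and service recovery policy, and performs that policy on the replaceable minimum functional entity. Services can thus be recovered quickly when they are impaired, and predicted impacts can be avoided in advance, which keeps services running stably.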
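The obtain, query, and execute flow described above might be sketched as follows. This is only an illustrative sketch: the message layout (`target_id`), the `Correspondence` tuple, and the `remove_from_bond` policy function are assumptions, not structures defined by the publication.

```python
from typing import Callable, NamedTuple, Optional


class Correspondence(NamedTuple):
    replaceable_entity: str          # replaceable minimum functional entity
    redundancy_mode: str             # resource redundancy mode
    policy: Callable[[str], str]     # service recovery policy


def remove_from_bond(entity: str) -> str:
    # Hypothetical policy: isolate a faulty NIC by removing it from its bond.
    return f"removed {entity} from bond"


# Hypothetical correspondence set keyed by the identifier of the target entity.
FIRST_CORRESPONDENCE_SET = {
    "NIC1": Correspondence("NIC1", "bonding", remove_from_bond),
}


def handle_event(message: dict, correspondence_set: dict) -> Optional[str]:
    """Look up the target entity carried in the message and run its policy."""
    entry = correspondence_set.get(message["target_id"])
    if entry is None:
        return None  # no local correspondence; a caller could escalate here
    return entry.policy(entry.replaceable_entity)
```

Keeping the policy as a callable stored in the set keeps the lookup and the execution in one place; a real system would more likely store a policy identifier and dispatch on it.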
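The correspondence set for determining the replaceable minimum functional entity may include the correspondence shown in Table 1 among the functional entity to be repaired, the replaceable minimum functional entity, and the service recovery policy; it may also include the correspondence among the functional entity to be repaired, the replaceable minimum functional entity, the resource redundancy mode, and the service recovery policy. Table 1 is an example; the correspondence may include other entries, and the content of each entry is not limited to what Table 1 shows. The embodiments of the present application impose no specific limitation.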
[Table 1 appears only as images in the original publication: Figure PCTCN2018072909-appb-000001 and Figure PCTCN2018072909-appb-000002]
Table 1
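The association relationship set for determining associated functional entities may include, but is not limited to, the associations shown in Tables 2, 3, and 4. Table 2 shows the association between hosts and VMs, that is, the VMs running on a host. Table 3 shows the association between VMs and VNFCs, that is, the VNFC running on a VM. Table 4 shows the association between VNFs and VNFCs, that is, the VNFCs that a VNF contains.
Host    VM
HOST1   VM-11
HOST1   VM-12
HOST1   VM-13
HOST2   VM-21
HOST2   VM-22
...     ...
Table 2
VM      VNFC
VM-11   VNFC-11
VM-12   VNFC-21
VM-13   VNFC-31
...     ...
Table 3
VNF     VNFC
VNF1    VNFC-11
VNF1    VNFC-12
VNF1    VNFC-13
VNF2    VNFC-21
VNF2    VNFC-22
...     ...
Table 4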
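As a rough illustration, the rows listed in Tables 2 to 4 can be held in plain dictionaries and traversed to find what a host fault affects. The dictionary and function names below are hypothetical, and only the rows actually shown in the tables are included (the elided rows are not guessed).

```python
# Rows copied from Tables 2, 3 and 4; elided rows ("...") are left out.
HOST_TO_VMS = {
    "HOST1": ["VM-11", "VM-12", "VM-13"],
    "HOST2": ["VM-21", "VM-22"],
}
VM_TO_VNFC = {"VM-11": "VNFC-11", "VM-12": "VNFC-21", "VM-13": "VNFC-31"}
VNF_TO_VNFCS = {
    "VNF1": ["VNFC-11", "VNFC-12", "VNFC-13"],
    "VNF2": ["VNFC-21", "VNFC-22"],
}


def vnfcs_on_host(host):
    """Return the VNFCs running on the VMs of `host` (Table 2, then Table 3)."""
    return [VM_TO_VNFC[vm] for vm in HOST_TO_VMS.get(host, []) if vm in VM_TO_VNFC]


def vnf_of(vnfc):
    """Return the VNF that contains `vnfc` according to Table 4, or None."""
    for vnf, members in VNF_TO_VNFCS.items():
        if vnfc in members:
            return vnf
    return None
```

For example, a fault on HOST1 would touch VNFC-11, VNFC-21, and VNFC-31, and `vnf_of` then shows which VNFs those components belong to where Table 4 lists them.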
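Refer to FIG. 2, a schematic flowchart of a service recovery method according to an embodiment of the present application. The method can be used in the NFV network shown in FIG. 1 or in other networks. The service recovery method described in this embodiment is applied to a network that includes multiple functional entities, including a first functional entity and a fourth functional entity. Taking an NFV network as an example, the first functional entity is the hypervisor (virtual machine monitor) in the NFV network and the fourth functional entity is the VNFM. The method includes:
201. The hypervisor obtains a first event notification message, where the first event notification message carries an identifier of a first target functional entity to be repaired.
In a specific implementation, the hypervisor monitors the running state of the Network Interface Cards (NICs). When a NIC fails, the hypervisor obtains the first event notification message (that is, a NIC fault message), which carries the identifier of the first target functional entity to be repaired (the faulty NIC). From the identifier, the hypervisor determines that the first target functional entity to be repaired is the faulty NIC.
202. The hypervisor queries, from a pre-created first correspondence set, a first replaceable minimum functional entity and a first service recovery policy that correspond to the first target functional entity represented by the identifier.
The first correspondence set includes the correspondence among the first target functional entity, the first replaceable minimum functional entity, and the first service recovery policy.
In a specific implementation, when the first target functional entity to be repaired is the faulty NIC, the hypervisor queries the first correspondence set (as shown in Table 1) and determines that the first replaceable minimum functional entity corresponding to the faulty NIC is the faulty NIC itself, and that the corresponding first service recovery policy is for the hypervisor to remove the faulty NIC from its bond.
203. The hypervisor performs the first service recovery policy on the first replaceable minimum functional entity.
In some feasible implementations, the first correspondence set further includes the correspondence between the first target functional entity and a target resource redundancy mode. The hypervisor queries the first correspondence set (as shown in Table 1) and finds that the target resource redundancy mode of the faulty NIC is bonding; accordingly, it removes the faulty NIC from the bond to perform the first service recovery policy. For example, NIC1 and NIC2 are bonded, so the resource redundancy mode of both is bonding. When both are healthy, they share the service traffic. When NIC1 fails, the hypervisor removes NIC1 from the bond, that is, isolates the faulty NIC1, and all the traffic previously shared by NIC1 and NIC2 is carried by NIC2.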
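204. The hypervisor sends a third event notification message to the VNFM, and the VNFM receives the third event notification message, where the third event notification message carries the identifier of the first target functional entity.
In some feasible implementations, removing the faulty NIC from the bond quickly recovers the services affected by the faulty NIC, but the NFV network has not yet restored the normal state of resource redundancy, that is, the state in which the NICs are bonded. The hypervisor may therefore notify an upper-layer functional entity (for example, the VNFM) for further processing. Specifically, the hypervisor may send the third event notification message to the VNFM through the VIM: the hypervisor first sends the message to the VIM, which forwards it to the VNFM. The third event notification message carries the identifier of the faulty NIC, and the VNFM receives it. In some feasible implementations, the hypervisor may instead send the third event notification message directly to the VNFM.
205. The VNFM determines, according to the identifier of the first target functional entity, a third target functional entity to be repaired.
In some feasible implementations, the VNFM determines, from the identifier of the faulty NIC, the identifier of the VM affected by the faulty NIC; alternatively, the VIM, after receiving the third event notification message, determines the identifier of the affected VM from the identifier of the faulty NIC and sends the VM identifier to the VNFM. The VNFM then uses the association relationship set for determining associated functional entities (as shown in Table 3) to determine the third target functional entity to be repaired, namely the VNFC associated with the VM that the VM identifier represents.
206. The VNFM queries, from a pre-created third correspondence set, a third replaceable minimum functional entity and a third service recovery policy that correspond to the third target functional entity, and performs the third service recovery policy on the third replaceable minimum functional entity.
In some feasible implementations, the VNFM queries the third correspondence set (as shown in Table 1) and determines that the third replaceable minimum functional entity corresponding to the VNFC is the VNFC itself, and that the corresponding third service recovery policy is for the VNFM to perform service recovery according to the resource redundancy mode of the VNFC.
If the resource redundancy mode of the VNFC is active/standby, the VNFM performs an active/standby switchover. The VNFM delivers configuration to the original standby VNFC, whose VM is not affected by the faulty NIC, so that the standby VNFC is promoted to the new active VNFC. The VNFM also delivers configuration to the forward (that is, upstream) VNFC of the original active VNFC, so that the forward VNFC switches service traffic from the original active VNFC to the new active VNFC, which isolates the original active VNFC. If the resource redundancy mode of the VNFC is load sharing, the VNFM isolates the VNFC: it delivers configuration to the forward VNFC of that VNFC, so that the forward VNFC cuts off service traffic to it.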
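The two recovery branches above can be sketched as a function that rewrites the traffic targets of the forward (upstream) VNFC. This is a simplified sketch under assumed names (`recover_vnfc`, the mode strings, the `upstream_targets` list); the publication describes configuration being delivered to VNFCs, not a specific interface.

```python
def recover_vnfc(faulty, mode, upstream_targets, standby=None):
    """Return the upstream VNFC's new traffic targets after isolating `faulty`.

    mode == "active_standby": traffic for the faulty active VNFC is switched
    to the standby VNFC, which becomes the new active one.
    mode == "load_sharing": the faulty VNFC is simply cut out of the targets.
    """
    if mode == "active_standby":
        if standby is None:
            raise ValueError("active/standby recovery needs a standby VNFC")
        return [standby if target == faulty else target for target in upstream_targets]
    if mode == "load_sharing":
        return [target for target in upstream_targets if target != faulty]
    raise ValueError(f"unsupported redundancy mode: {mode}")
```

In both branches the faulty VNFC ends up receiving no traffic; the difference is whether a replacement target is substituted or the remaining peers absorb the load.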
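Further, after performing service recovery on the VNFC, the VNFM also needs to repair the VNFC. Because the fault lies in the NIC, restarting the VNFC cannot repair it, so the VNFC is repaired directly by migration: the VIM migrates the VNFC to a VM that is not affected by the faulty NIC. After the migration, the NFV network returns to the normal state of resource redundancy, that is, the state in which the NICs are bonded.
In some feasible implementations, before performing the first service recovery policy on the faulty NIC, the hypervisor may first determine whether the number of redundant resources on the current host (that is, healthy NICs) is sufficient for the first service recovery policy. If it is, the hypervisor performs the first service recovery policy on the faulty NIC locally; if not, the hypervisor may notify an upper-layer functional entity (for example, the VNFM) to perform service recovery. For example, suppose the current host has four NICs (NIC1, NIC2, NIC3, and NIC4) bonded in pairs: NIC1 with NIC2, and NIC3 with NIC4. If the first event notification message obtained by the hypervisor indicates that three of the NICs (say NIC1, NIC2, and NIC3) have failed, only one NIC (NIC4) is healthy, and the hypervisor cannot perform the corresponding service recovery policy for NIC1, NIC2, and NIC3. This corresponds to the case in Table 1 in which the functional entity to be repaired is all the NICs of the host: the replaceable minimum functional entity is raised to the VMs on the current host, all VNFCs running on those VMs must be migrated to other hosts, and the hypervisor may notify the VIM to migrate the VNFCs to other hosts.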
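The local-or-escalate decision in the four-NIC example might be sketched as below. The sufficiency test used here (every bond must keep at least one healthy member) is an assumption made for illustration; the publication only says that the hypervisor checks whether the redundant resources are sufficient.

```python
def plan_recovery(bonds, faulty_nics):
    """Decide whether NIC faults can be handled locally or must be escalated.

    bonds: list of tuples of NIC names that are bonded together.
    Assumed rule: local recovery is possible only if each bond still has at
    least one healthy NIC after the faulty ones are removed.
    """
    faulty = set(faulty_nics)
    for members in bonds:
        if all(nic in faulty for nic in members):
            # A whole bond is down: escalate so upper layers can migrate VNFCs.
            return ("escalate", sorted(faulty))
    return ("local", sorted(faulty))
```

With bonds `(NIC1, NIC2)` and `(NIC3, NIC4)`, a single fault on NIC1 is handled locally, while faults on NIC1, NIC2, and NIC3 leave the first bond without a healthy member and are escalated, which matches the outcome of the example in the text.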
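It should be noted that the correspondence set for determining the replaceable minimum functional entity may be stored in a distributed manner, that is, each functional entity stores a correspondence set locally. The locally stored sets may differ from one another and may contain only part of the correspondences in Table 1, typically those related to the entity itself. For example, the set stored by the hypervisor may contain only the Table 1 entries in which the functional entity to be repaired is a NIC; the set stored by the VNFM may contain only the entries for VNFCs, VMs, and VNFs; the set stored by a VNFC may contain only the entries for VNFC threads and for the entities that a link comprises; the set stored by the VIM may contain only the entries for vNICs and vSwitches; and so on. The correspondence set may also be stored centrally, in which case the first correspondence set, the third correspondence set, and so on in the embodiments of the present application may be one and the same set containing all the correspondences in Table 1. That set is stored on a designated functional entity that all functional entities can access globally, and each functional entity queries the replaceable minimum functional entity and the service recovery policy by accessing the designated functional entity.
In this embodiment of the present application, when the hypervisor detects a NIC fault, it queries the replaceable minimum functional entity corresponding to the faulty NIC (the faulty NIC itself) and the corresponding service recovery policy, and removes the faulty NIC from the bond according to that policy, which quickly recovers the services affected by the faulty NIC. Further, the hypervisor may notify the VNFM to perform service recovery on the replaceable minimum functional entity affected by the faulty NIC, namely the VNFC. Services are thus recovered quickly and accurately layer by layer, the duration of the service fault is effectively shortened, and the normal state of NIC resource redundancy is quickly restored.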
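The NIC bonding behavior in this embodiment (traffic shared by NIC1 and NIC2, then carried entirely by NIC2 once NIC1 is removed) can be illustrated with a small model. The `Bond` class and the even traffic split are assumptions for illustration and do not represent any real hypervisor or bonding driver API.

```python
class Bond:
    """Toy model of a NIC bond that spreads traffic evenly over its members."""

    def __init__(self, members):
        self.members = list(members)

    def remove_faulty(self, nic):
        """Isolate a faulty NIC by removing it from the bond."""
        if nic not in self.members:
            raise ValueError(f"{nic} is not a member of this bond")
        self.members.remove(nic)

    def share(self, total_traffic):
        """Return the traffic carried by each remaining member."""
        if not self.members:
            raise RuntimeError("no healthy NIC left in the bond")
        per_nic = total_traffic / len(self.members)
        return {nic: per_nic for nic in self.members}
```

A short usage example: `bond = Bond(["NIC1", "NIC2"])` splits 100 units of traffic as 50 each; after `bond.remove_faulty("NIC1")`, NIC2 carries all 100 units.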
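Refer to FIG. 3, a schematic flowchart of another service recovery method according to an embodiment of the present application. The method can be used in the NFV network shown in FIG. 1 or in other networks. The service recovery method described in this embodiment is applied to a network that includes multiple functional entities, including a first functional entity and a fourth functional entity. Taking an NFV network as an example, the first functional entity is a VNFC in the NFV network and the fourth functional entity is the VNFM. The method includes:
301. The VNFC obtains a first event notification message, where the first event notification message carries an identifier of a first target functional entity to be repaired.
In a specific implementation, the VNFC may monitor the running state of each VNFC thread through a reliability monitoring program. When a VNFC thread is found to be faulty, the VNFC obtains the first event notification message (that is, a VNFC thread fault message), which carries the identifier of the first target functional entity to be repaired (the faulty VNFC thread). From the identifier, the first target functional entity to be repaired is determined to be the faulty VNFC thread.
302. The VNFC queries, from a pre-created first correspondence set, a first replaceable minimum functional entity and a first service recovery policy that correspond to the first target functional entity represented by the identifier.
The first correspondence set includes the correspondence among the first target functional entity, the first replaceable minimum functional entity, and the first service recovery policy.
In a specific implementation, when the first target functional entity to be repaired is the faulty VNFC thread, the reliability module inside the VNFC may query the first correspondence set (as shown in Table 1) and determine that the first replaceable minimum functional entity corresponding to the faulty VNFC thread is the faulty VNFC thread itself, and that the corresponding first service recovery policy is for the VNFC to replace the faulty VNFC thread with an Active-state VNFC thread from the thread pool.
303. The VNFC performs the first service recovery policy on the first replaceable minimum functional entity.
In some feasible implementations, the first correspondence set further includes the correspondence between the first target functional entity and a target resource redundancy mode. The VNFC queries the first correspondence set (as shown in Table 1) and finds that the target resource redundancy mode of the faulty VNFC thread is thread pool; accordingly, it replaces the faulty VNFC thread with an Active-state VNFC thread from the thread pool.
In some feasible implementations, after performing the first service recovery policy on the first replaceable minimum functional entity, the VNFC may determine whether the policy succeeded, that is, whether the Active-state VNFC thread can normally handle the service of the faulty VNFC thread so that the service is recovered. If so, the procedure ends. If not, the service has not recovered even though the faulty VNFC thread was replaced with an Active-state thread from the thread pool, and the VNFC may notify an upper-layer functional entity (for example, the EM or the VNFM) for further processing. This can be done by performing steps 304 to 306 below after steps 301 to 303.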
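304. If the service of the first replaceable minimum functional entity is not recovered successfully, the VNFC sends a third event notification message to the VNFM, and the VNFM receives the third event notification message, where the third event notification message carries the identifier of the first target functional entity.
In some feasible implementations, if the service of the first replaceable minimum functional entity is not recovered successfully, the VNFC may send the third event notification message to the VNFM through the EM: the VNFC first sends the message to the EM, which forwards it to the VNFM. The third event notification message carries the identifier of the VNFC to which the faulty VNFC thread belongs, and the VNFM receives it.
305. The VNFM determines, according to the identifier of the first target functional entity, a third target functional entity to be repaired.
In some feasible implementations, the VNFM determines, from the identifier of the VNFC, the third target functional entity to be repaired (the VNFC to which the faulty VNFC thread belongs); alternatively, the EM, after receiving the third event notification message, determines the third target functional entity to be repaired from the identifier of the VNFC and sends its identifier to the VNFM. Either way, the VNFM learns that the third target functional entity to be repaired is the VNFC to which the faulty VNFC thread belongs.
306. The VNFM queries, from a pre-created third correspondence set, a third replaceable minimum functional entity and a third service recovery policy that correspond to the third target functional entity, and performs the third service recovery policy on the third replaceable minimum functional entity.
In some feasible implementations, the VNFM queries the third correspondence set (as shown in Table 1) and determines that the third replaceable minimum functional entity corresponding to the VNFC is the VNFC itself, and that the corresponding third service recovery policy is for the VNFM to perform service recovery according to the resource redundancy mode of the VNFC.
The resource redundancy mode of the VNFC may be active/standby or load sharing. For how the VNFM performs service recovery on the VNFC according to its resource redundancy mode, refer to the description of step 206 in the method embodiment shown in FIG. 2; details are not repeated here.
Further, after performing service recovery on the VNFC, the VNFM also needs to repair the VNFC. It may first restart the VNFC, and only when the restart fails to repair it, notify the VIM to migrate the VNFC, which reduces wasted resources.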
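The restart-first, migrate-second repair order can be sketched as follows. The `restart` and `migrate` callables are hypothetical hooks standing in for whatever the VNFM and VIM actually invoke, and the `"failed"` outcome is an added assumption, since the text does not say what happens if migration also fails.

```python
def repair_vnfc(vnfc, restart, migrate):
    """Try the cheaper repair (restart) before the costlier one (migration).

    restart, migrate: callables taking the VNFC name and returning True on
    success. Both are placeholders for illustration.
    """
    if restart(vnfc):
        return "restarted"
    if migrate(vnfc):
        return "migrated"
    return "failed"  # assumption: hand over to manual handling
```

Ordering the attempts this way avoids allocating a new VM for faults that a restart alone can clear.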
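In some feasible implementations, if the EM is capable of performing service recovery on the VNFC, the EM may perform service recovery on the VNFC locally according to the resource redundancy mode of the VNFC, and notify the VNFM to repair the VNFC only after service recovery is complete. This allows finer-grained, layered service recovery and further reduces the impact on services.
In some feasible implementations, after a VNFC thread has been taken from the thread pool, a VNFC thread with the corresponding function may be added to the pool, so that the NFV network returns to the normal state of resource redundancy, that is, thread-pool redundancy for the VNFC.
In this embodiment of the present application, when the VNFC detects that one of its internal VNFC threads is faulty, it queries the replaceable minimum functional entity corresponding to the faulty VNFC thread (the faulty VNFC thread itself) and the corresponding service recovery policy, and replaces the faulty thread with an Active-state VNFC thread from the thread pool according to that policy, which quickly recovers the services affected by the faulty thread. Further, if the service is not recovered after the faulty VNFC thread has been replaced, the VNFC may notify the VNFM to perform service recovery on the VNFC to which the faulty thread belongs. Services are thus recovered quickly and accurately layer by layer. This avoids the long service fault that directly restarting the VNFC thread would cause and the wide service impact that directly restarting the VNFC would cause, and it restores the normal state of VNFC thread resource redundancy.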
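The thread-pool recovery with a post-check and escalation described in this embodiment might look like the following sketch. The function name, the list used as a spare pool, and the `service_ok` health probe are illustrative assumptions.

```python
def recover_thread(faulty_thread, spare_pool, service_ok):
    """Replace a faulty VNFC thread with a spare one, then verify the service.

    spare_pool: list of Active-state spare thread names (mutated in place).
    service_ok: callable that reports whether the spare handles the service.
    Returns ("recovered", spare) or ("escalate", faulty_thread).
    """
    if not spare_pool:
        return ("escalate", faulty_thread)
    spare = spare_pool.pop(0)
    if service_ok(spare):
        return ("recovered", spare)
    # Replacement did not restore the service: notify the upper layer.
    return ("escalate", faulty_thread)
```

After a successful replacement the pool is one thread short, which is why the text suggests adding a thread with the corresponding function back to the pool to restore redundancy.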
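Refer to FIG. 4, a schematic flowchart of yet another service recovery method according to an embodiment of the present application. The method can be used in the NFV network shown in FIG. 1 or in other networks. The service recovery method described in this embodiment is applied to a network that includes multiple functional entities, including a first functional entity. Taking an NFV network as an example, the first functional entity is the VNFM in the NFV network. The method includes:
401. The NFVI sends a fan fault warning, and the OSS/BSS receives the fan fault warning.
In a specific implementation, after a host fan fails, the NFVI can detect that the CPU temperature keeps rising and can send a fan fault early warning to the OSS/BSS through the VIM and the NFVO; the OSS/BSS receives it. Specifically, the NFVI sends the early warning to the VIM; the VIM determines, according to the correspondence set shown in Table 1, that the replaceable minimum functional entity corresponding to the CPU is the VMs hosted on the host; the VIM then determines the list of affected VMs according to the association relationship set shown in Table 2 and sends the host identifier and the VM list to the NFVO, which forwards them to the OSS/BSS. The OSS/BSS determines that the host corresponding to the host identifier needs maintenance.
402. The OSS/BSS sends a first event notification message, and the VNFM receives the first event notification message, where the first event notification message carries an identifier of a first target functional entity to be repaired.
In a specific implementation, the services on the host must be migrated before maintenance. The OSS/BSS sends the first event notification message to the VNFM, and the VNFM receives it; the message carries the identifier of the first target functional entity to be repaired (a VM). Specifically, the OSS/BSS may first send the first event notification message to the NFVO, the NFVO identifies the VMs that each VNFM needs to repair, and the NFVO then sends each VNFM a first event notification message carrying the identifier of the first target functional entity (VM) to be repaired.
It should be noted that here the upper-layer functional entities (OSS/BSS, NFVO) send the event notification message to the VNFM for service recovery after the NFVI has issued a fan fault early warning, that is, while services are not yet affected. The upper-layer functional entities may equally send the event notification message to the VNFM when services have already been affected, or operation and maintenance personnel may send it to the VNFM through the OSS/BSS when they need to perform an operation and maintenance task.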
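403. The VNFM queries, from a pre-created first correspondence set, a first replaceable minimum functional entity and a first service recovery policy that correspond to the first target functional entity represented by the identifier.
The first correspondence set includes the correspondence among the first target functional entity, the first replaceable minimum functional entity, and the first service recovery policy.
In a specific implementation, the VNFM may first use the association relationship set for determining associated functional entities (as shown in Table 3) to determine the functional entity associated with the first target functional entity to be repaired, that is, the VNFC associated with the VM that the VM identifier represents. The VNFM then queries the first correspondence set (as shown in Table 1) and determines that the first replaceable minimum functional entity corresponding to the VNFC is the VNFC itself, and that the corresponding first service recovery policy is for the VNFM to perform service recovery according to the resource redundancy mode of the VNFC.
404. The VNFM performs the first service recovery policy on the first replaceable minimum functional entity.
The resource redundancy mode of the VNFC may be active/standby or load sharing. For how the VNFM performs service recovery on the VNFC according to its resource redundancy mode, refer to the description of step 206 in the method embodiment shown in FIG. 2; details are not repeated here.
Further, after performing service recovery on the VNFC, the VNFM also needs to repair the VNFC. Because the early warning concerns the host's fan, the services of the host must be migrated, so the VNFC is repaired directly by migration: the VIM migrates the VNFC to a VM on another host.
Further, after all its VNFCs have been repaired, a VNFM sends a VNFC repair success notification to the NFVO. When the NFVO has received notifications from every VNFM that all VNFCs were repaired, it sends a service recovery success notification to the OSS/BSS; otherwise, it sends a service recovery failure notification to the OSS/BSS. On a success notification, the OSS/BSS may dispatch a work order to maintain the fan, including repairing or replacing it; on a failure notification, the OSS/BSS may generate alarm information for manual handling.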
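The aggregation performed by the NFVO and the resulting OSS/BSS action can be sketched as two small functions. The function names, the notification strings, and the treatment of an empty result set as a failure are assumptions for illustration.

```python
def nfvo_notification(repair_results):
    """Aggregate per-VNFM repair results into one notification for the OSS/BSS.

    repair_results: dict mapping VNFM name -> True if all its VNFCs were
    repaired. Success is reported only if every VNFM reported success.
    """
    if repair_results and all(repair_results.values()):
        return "service_recovery_succeeded"
    return "service_recovery_failed"


def oss_bss_action(notification):
    """Map the NFVO notification to the follow-up action described in the text."""
    if notification == "service_recovery_succeeded":
        return "dispatch work order to repair or replace the fan"
    return "generate alarm for manual handling"
```

The fan is only scheduled for maintenance once every affected VNFC has been moved off the host, so the physical work cannot disrupt running services.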
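In this embodiment of the present application, when a hardware fault such as a fan fault occurs without yet affecting services, or when operation and maintenance personnel need to perform a maintenance task, an early warning can be issued and the upper-layer functional entity notifies the VNFM to carry out the pre-processing required before hardware maintenance (that is, service migration). Recovery actions such as service migration are thus performed before the fault affects services, which effectively avoids the impact of hardware faults on services.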
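Refer to FIG. 5, a schematic structural diagram of a service recovery apparatus according to an embodiment of the present application. The service recovery apparatus described in this embodiment includes:
an obtaining module 501, configured to obtain a first event notification message, where the first event notification message carries an identifier of a first target functional entity to be repaired;
a query module 502, configured to query, from a pre-created first correspondence set, a first replaceable minimum functional entity and a first service recovery policy that correspond to the first target functional entity represented by the identifier, where the first correspondence set includes a correspondence among the first target functional entity, the first replaceable minimum functional entity, and the first service recovery policy; and
a processing module 503, configured to perform the first service recovery policy on the first replaceable minimum functional entity.
In some feasible implementations, the first replaceable minimum functional entity is the first target functional entity, and the first target functional entity is replaceable by the network.
Alternatively,
the first replaceable minimum functional entity is a second functional entity associated with the first target functional entity in a pre-created association relationship set, the first target functional entity is not replaceable by the network, and the association relationship set includes an association relationship between the first target functional entity and the second functional entity.
In some feasible implementations, the processing module 503 includes:
a query unit 5030, configured to query, from the first correspondence set, a target resource redundancy mode corresponding to the first target functional entity; and
an execution unit 5031, configured to perform the first service recovery policy on the first replaceable minimum functional entity according to the target resource redundancy mode.
The first correspondence set further includes a correspondence between the first target functional entity and the target resource redundancy mode.
In some feasible implementations, the processing module 503 is specifically configured to:
perform the first service recovery policy on the first replaceable minimum functional entity if the first service recovery policy can be performed.
In some feasible implementations, the processing module 503 is specifically configured to:
if the first service recovery policy cannot be performed, send a second event notification message to a third functional entity, where the second event notification message carries the identifier of the first target functional entity, so that the third functional entity determines, according to the identifier of the first target functional entity, a second target functional entity to be repaired, queries, from a pre-created second correspondence set, a second replaceable minimum functional entity and a second service recovery policy corresponding to the second target functional entity, and performs the second service recovery policy on the second replaceable minimum functional entity.
In some feasible implementations, the apparatus further includes:
a sending module 504, configured to send a third event notification message to a fourth functional entity, where the third event notification message carries the identifier of the first target functional entity, so that the fourth functional entity determines, according to the identifier of the first target functional entity, a third target functional entity to be repaired, queries, from a pre-created third correspondence set, a third replaceable minimum functional entity and a third service recovery policy corresponding to the third target functional entity, and performs the third service recovery policy on the third replaceable minimum functional entity.
In some feasible implementations, the sending module 504 is specifically configured to send the third event notification message to the fourth functional entity if the service of the first replaceable minimum functional entity is not recovered successfully.
In some feasible implementations, the apparatus includes one or more of the NFV infrastructure (NFVI), the virtualised network function manager (VNFM), and the VNF component (VNFC)/VNF in a network function virtualization (NFV) network.
It should be noted that the functions of the functional modules and units of the service recovery apparatus in this embodiment of the present application may be implemented according to the methods in the foregoing method embodiments. For the specific implementation process, refer to the related descriptions of the method embodiments; details are not repeated here.
In this embodiment of the present application, the obtaining module 501 obtains a first event notification message that carries an identifier of a first target functional entity to be repaired; the query module 502 queries, from a pre-created first correspondence set, the first replaceable minimum functional entity and the first service recovery policy that correspond to the first target functional entity represented by the identifier, where the first correspondence set includes the correspondence among the first target functional entity, the first replaceable minimum functional entity, and the first service recovery policy; and the processing module 503 performs the first service recovery policy on the first replaceable minimum functional entity. The service can thus be recovered quickly and accurately, and the duration of the service fault is effectively shortened.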
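Refer to FIG. 6, a schematic structural diagram of a service recovery apparatus according to an embodiment of the present application. The service recovery apparatus described in this embodiment includes a processor 601, a transceiver 602, and a memory 603. The processor 601, the transceiver 602, and the memory 603 may be connected through a bus or in another manner; this embodiment takes connection through a bus as an example.
The processor 601 (also called a Central Processing Unit (CPU)) is the computing core and control core of the service recovery apparatus. The transceiver 602 may optionally include a standard wired interface and a wireless interface (such as Wi-Fi or a mobile communication interface), and is controlled by the processor 601 to send and receive data. The memory 603 is the storage device of the service recovery apparatus and stores programs and data. The memory 603 may be a high-speed RAM or a non-volatile memory, such as at least one disk memory; optionally, it may also be at least one storage device located remotely from the processor 601. The memory 603 provides storage space that stores the operating system and executable program code of the service recovery apparatus. The operating system may include, but is not limited to, Windows or Linux; the present application does not limit this.
In this embodiment of the present application, the processor 601 performs the following operations by running the executable program code in the memory 603:
The processor 601 is configured to obtain a first event notification message, where the first event notification message carries an identifier of a first target functional entity to be repaired.
The processor 601 is further configured to query, from a pre-created first correspondence set, a first replaceable minimum functional entity and a first service recovery policy that correspond to the first target functional entity represented by the identifier, where the first correspondence set includes a correspondence among the first target functional entity, the first replaceable minimum functional entity, and the first service recovery policy.
The processor 601 is further configured to perform the first service recovery policy on the first replaceable minimum functional entity.
In some feasible implementations, the first replaceable minimum functional entity is the first target functional entity, and the first target functional entity is replaceable by the network.
Alternatively,
the first replaceable minimum functional entity is a second functional entity associated with the first target functional entity in a pre-created association relationship set, the first target functional entity is not replaceable by the network, and the association relationship set includes an association relationship between the first target functional entity and the second functional entity.
In some feasible implementations, the processor 601 is specifically configured to:
query, from the first correspondence set, a target resource redundancy mode corresponding to the first target functional entity; and
perform the first service recovery policy on the first replaceable minimum functional entity according to the target resource redundancy mode.
The first correspondence set further includes a correspondence between the first target functional entity and the target resource redundancy mode.
In some feasible implementations, the processor 601 is specifically configured to:
perform the first service recovery policy on the first replaceable minimum functional entity if the first service recovery policy can be performed.
In some feasible implementations, the transceiver 602 is configured to: if the processor 601 cannot perform the first service recovery policy, send a second event notification message to a third functional entity, where the second event notification message carries the identifier of the first target functional entity, so that the third functional entity determines, according to the identifier of the first target functional entity, a second target functional entity to be repaired, queries, from a pre-created second correspondence set, a second replaceable minimum functional entity and a second service recovery policy corresponding to the second target functional entity, and performs the second service recovery policy on the second replaceable minimum functional entity.
In some feasible implementations, the transceiver 602 is further configured to send a third event notification message to a fourth functional entity, where the third event notification message carries the identifier of the first target functional entity, so that the fourth functional entity determines, according to the identifier of the first target functional entity, a third target functional entity to be repaired, queries, from a pre-created third correspondence set, a third replaceable minimum functional entity and a third service recovery policy corresponding to the third target functional entity, and performs the third service recovery policy on the third replaceable minimum functional entity.
In some feasible implementations, the transceiver 602 is specifically configured to send the third event notification message to the fourth functional entity if the service of the first replaceable minimum functional entity is not recovered successfully.
In a specific implementation, the processor 601, the transceiver 602, and the memory 603 described in this embodiment of the present application may carry out the implementations described in the service recovery method flows of FIG. 2, FIG. 3, or FIG. 4, and may also carry out the implementations described for the service recovery apparatus of FIG. 5; details are not repeated here.
In this embodiment of the present application, the processor 601 obtains a first event notification message that carries an identifier of a first target functional entity to be repaired, queries, from a pre-created first correspondence set, the first replaceable minimum functional entity and the first service recovery policy that correspond to the first target functional entity represented by the identifier, where the first correspondence set includes the correspondence among the first target functional entity, the first replaceable minimum functional entity, and the first service recovery policy, and performs the first service recovery policy on the first replaceable minimum functional entity. The service can thus be recovered quickly and accurately, and the duration of the service fault is effectively shortened.
The foregoing embodiments may be implemented wholly or partly by software, hardware, firmware, or any combination thereof. When software is used, the embodiments may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wirelessly (for example, infrared or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a Solid State Disk (SSD)), or the like.
In summary, the foregoing embodiments are intended only to describe the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, persons of ordinary skill in the art should understand that they may still modify the technical solutions recorded in those embodiments or make equivalent replacements of some of their technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.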

Claims (17)

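  1. A service recovery method, applied to a network that includes multiple functional entities, the multiple functional entities including a first functional entity, wherein the method comprises:
    obtaining, by the first functional entity, a first event notification message, where the first event notification message carries an identifier of a first target functional entity to be repaired;
    querying, by the first functional entity from a pre-created first correspondence set, a first replaceable minimum functional entity and a first service recovery policy that correspond to the first target functional entity represented by the identifier, where the first correspondence set includes a correspondence among the first target functional entity, the first replaceable minimum functional entity, and the first service recovery policy; and
    performing, by the first functional entity, the first service recovery policy on the first replaceable minimum functional entity.
  2. The method according to claim 1, wherein
    the first replaceable minimum functional entity is the first target functional entity, and the first target functional entity is replaceable by the network;
    or,
    the first replaceable minimum functional entity is a second functional entity associated with the first target functional entity in a pre-created association relationship set, the first target functional entity is not replaceable by the network, and the association relationship set includes an association relationship between the first target functional entity and the second functional entity.
  3. The method according to claim 1 or 2, wherein the performing, by the first functional entity, the first service recovery policy on the first replaceable minimum functional entity comprises:
    querying, by the first functional entity from the first correspondence set, a target resource redundancy mode corresponding to the first target functional entity; and
    performing, by the first functional entity, the first service recovery policy on the first replaceable minimum functional entity according to the target resource redundancy mode;
    wherein the first correspondence set further includes a correspondence between the first target functional entity and the target resource redundancy mode.
  4. The method according to claim 3, wherein
    the target resource redundancy mode is any one of active/standby redundancy, load-sharing redundancy, thread-pool redundancy, and network-path redundancy.
  5. The method according to claim 1 or 2, wherein the performing, by the first functional entity, the first service recovery policy on the first replaceable minimum functional entity comprises:
    if the first functional entity can perform the first service recovery policy, performing, by the first functional entity, the first service recovery policy on the first replaceable minimum functional entity.
  6. The method according to claim 1 or 2, wherein the multiple functional entities further include a third functional entity, and the performing, by the first functional entity, the first service recovery policy on the first replaceable minimum functional entity comprises:
    if the first functional entity cannot perform the first service recovery policy, sending, by the first functional entity, a second event notification message to the third functional entity, where the second event notification message carries the identifier of the first target functional entity, so that the third functional entity determines, according to the identifier of the first target functional entity, a second target functional entity to be repaired, queries, from a pre-created second correspondence set, a second replaceable minimum functional entity and a second service recovery policy corresponding to the second target functional entity, and performs the second service recovery policy on the second replaceable minimum functional entity.
  7. The method according to claim 1 or 2, wherein the multiple functional entities further include a fourth functional entity, and after the first functional entity performs the first service recovery policy on the first replaceable minimum functional entity, the method further comprises:
    sending, by the first functional entity, a third event notification message to the fourth functional entity, where the third event notification message carries the identifier of the first target functional entity, so that the fourth functional entity determines, according to the identifier of the first target functional entity, a third target functional entity to be repaired, queries, from a pre-created third correspondence set, a third replaceable minimum functional entity and a third service recovery policy corresponding to the third target functional entity, and performs the third service recovery policy on the third replaceable minimum functional entity.
  8. The method according to claim 7, wherein after the first functional entity performs the first service recovery policy on the first replaceable minimum functional entity and before the first functional entity sends the third event notification message to the fourth functional entity, the method further comprises:
    if the service of the first replaceable minimum functional entity is not recovered successfully, performing, by the first functional entity, the step of sending the third event notification message to the fourth functional entity.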
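  9. A service recovery apparatus, applied to a network that includes multiple functional entities, the multiple functional entities including the service recovery apparatus, wherein the apparatus comprises:
    an obtaining module, configured to obtain a first event notification message, where the first event notification message carries an identifier of a first target functional entity to be repaired;
    a query module, configured to query, from a pre-created first correspondence set, a first replaceable minimum functional entity and a first service recovery policy that correspond to the first target functional entity represented by the identifier, where the first correspondence set includes a correspondence among the first target functional entity, the first replaceable minimum functional entity, and the first service recovery policy; and
    a processing module, configured to perform the first service recovery policy on the first replaceable minimum functional entity.
  10. The apparatus according to claim 9, wherein
    the first replaceable minimum functional entity is the first target functional entity, and the first target functional entity is replaceable by the network;
    or,
    the first replaceable minimum functional entity is a second functional entity associated with the first target functional entity in a pre-created association relationship set, the first target functional entity is not replaceable by the network, and the association relationship set includes an association relationship between the first target functional entity and the second functional entity.
  11. The apparatus according to claim 9 or 10, wherein the processing module comprises:
    a query unit, configured to query, from the first correspondence set, a target resource redundancy mode corresponding to the first target functional entity; and
    an execution unit, configured to perform the first service recovery policy on the first replaceable minimum functional entity according to the target resource redundancy mode;
    wherein the first correspondence set further includes a correspondence between the first target functional entity and the target resource redundancy mode.
  12. The apparatus according to claim 9 or 10, wherein the processing module is specifically configured to:
    perform the first service recovery policy on the first replaceable minimum functional entity if the first service recovery policy can be performed.
  13. The apparatus according to claim 9 or 10, wherein the multiple functional entities further include a third functional entity, and the processing module is specifically configured to:
    if the first service recovery policy cannot be performed, send a second event notification message to the third functional entity, where the second event notification message carries the identifier of the first target functional entity, so that the third functional entity determines, according to the identifier of the first target functional entity, a second target functional entity to be repaired, queries, from a pre-created second correspondence set, a second replaceable minimum functional entity and a second service recovery policy corresponding to the second target functional entity, and performs the second service recovery policy on the second replaceable minimum functional entity.
  14. The apparatus according to claim 9 or 10, wherein the multiple functional entities further include a fourth functional entity, and the apparatus further comprises:
    a sending module, configured to send a third event notification message to the fourth functional entity, where the third event notification message carries the identifier of the first target functional entity, so that the fourth functional entity determines, according to the identifier of the first target functional entity, a third target functional entity to be repaired, queries, from a pre-created third correspondence set, a third replaceable minimum functional entity and a third service recovery policy corresponding to the third target functional entity, and performs the third service recovery policy on the third replaceable minimum functional entity.
  15. The apparatus according to claim 14, wherein the sending module is specifically configured to send the third event notification message to the fourth functional entity if the service of the first replaceable minimum functional entity is not recovered successfully.
  16. The apparatus according to claim 9, wherein
    the apparatus includes one or more of the NFV infrastructure (NFVI), the virtualised network function manager (VNFM), and the VNF component (VNFC)/VNF in a network function virtualization (NFV) network.
  17. A service recovery apparatus, comprising a processor, a transceiver, and a memory, wherein the processor, the transceiver, and the memory are connected through a bus, the memory stores executable program code, the transceiver is controlled by the processor to send and receive messages, and the processor is configured to invoke the executable program code to perform the service recovery method according to any one of claims 1 to 8.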
PCT/CN2018/072909 2017-01-24 2018-01-16 Service recovery method and apparatus WO2018137520A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710057297.9A CN108347339B (zh) 2017-01-24 2017-01-24 Service recovery method and apparatus
CN201710057297.9 2017-01-24

Publications (1)

Publication Number Publication Date
WO2018137520A1 true WO2018137520A1 (zh) 2018-08-02

Family

ID=62961910

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/072909 WO2018137520A1 (zh) 2017-01-24 2018-01-16 Service recovery method and apparatus

Country Status (2)

Country Link
CN (1) CN108347339B (zh)
WO (1) WO2018137520A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110868309A (zh) * 2018-08-27 2020-03-06 中移(苏州)软件技术有限公司 Vnfm中资源处理的方法、装置及计算机存储介质
CN111277469B (zh) * 2020-02-19 2020-12-08 杭州梅清数码科技有限公司 网络诊断处理方法、装置、网络系统及服务器
CN115174363B (zh) * 2022-07-05 2024-04-12 云合智网(上海)技术有限公司 多保护组集合快速切换方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104685830A (zh) * 2013-09-30 2015-06-03 华为技术有限公司 故障管理的方法、实体和系统
CN104170323A (zh) * 2014-04-09 2014-11-26 华为技术有限公司 基于网络功能虚拟化的故障处理方法及装置、系统
CN105790980A (zh) * 2014-12-22 2016-07-20 中兴通讯股份有限公司 一种故障修复方法及装置
WO2016184021A1 (zh) * 2015-05-21 2016-11-24 中兴通讯股份有限公司 一种虚拟化网络功能业务故障的处理方法及装置
CN105049293A (zh) * 2015-08-21 2015-11-11 中国联合网络通信集团有限公司 监控的方法及装置

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112910669A (zh) * 2019-12-03 2021-06-04 中盈优创资讯科技有限公司 故障智能化处理方法、装置及系统
CN112910669B (zh) * 2019-12-03 2023-08-08 中盈优创资讯科技有限公司 故障智能化处理方法、装置及系统
CN111026577A (zh) * 2019-12-27 2020-04-17 中国水产科学研究院渔业机械仪器研究所 软件系统功能自恢复的软件架构方法及其系统
CN111026577B (zh) * 2019-12-27 2023-10-31 中国水产科学研究院渔业机械仪器研究所 软件系统功能自恢复的软件架构方法及其系统
CN114205060A (zh) * 2020-09-17 2022-03-18 上海朗帛通信技术有限公司 一种被用于无线通信的节点中的方法和装置
CN113438117A (zh) * 2021-07-09 2021-09-24 中国电信股份有限公司 网元工单的处理方法及装置、存储介质、电子设备
CN113438117B (zh) * 2021-07-09 2022-11-25 中国电信股份有限公司 网元工单的处理方法及装置、存储介质、电子设备

Also Published As

Publication number Publication date
CN108347339B (zh) 2020-06-16
CN108347339A (zh) 2018-07-31

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18744143

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18744143

Country of ref document: EP

Kind code of ref document: A1