CN117675505A - Event processing method, device and system - Google Patents

Event processing method, device and system

Info

Publication number
CN117675505A
Authority
CN
China
Prior art keywords
event
network
server
resource manager
manager
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211345438.4A
Other languages
Chinese (zh)
Inventor
蒋忠平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to PCT/CN2023/100793 (WO2024051258A1)
Publication of CN117675505A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0894 Policy-based network configuration management

Abstract

An event processing method, device, and system, belonging to the field of network technologies. In the method, a resource manager determines a first server according to a first message sent by a network manager, and executes an event handling policy related to the first server. The first server is a server, among the servers accessing a communication network, that may be affected by a first event (e.g., a network failure) occurring in the communication network. The network manager manages the communication network, and the resource manager manages the first server. The present application helps prevent a first event occurring in a communication network from affecting the services carried by a first server that accesses the communication network.

Description

Event processing method, device and system
The present application claims priority to Chinese patent application No. 202211093655.9, entitled "A method and apparatus for fault handling" and filed on September 8, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of network technologies, and in particular, to a method, an apparatus, and a system for event processing.
Background
A communication network may provide a traffic forwarding service for servers that access it. A failure of the communication network easily affects the services carried by those servers, so a scheme is needed to prevent network failures from affecting the services carried by the servers.
Disclosure of Invention
The application provides an event processing method, an event processing device and an event processing system, which are beneficial to avoiding the influence of network events (such as network faults) occurring in a communication network on services carried by a server accessed to the communication network. The technical scheme of the application is as follows.
In a first aspect, there is provided an event processing method, the method comprising: a resource manager receives a first message sent by a network manager, wherein the network manager is used for managing a communication network; the resource manager determines a first server according to the first message, wherein the first server is a server, among the servers accessing the communication network, that may be affected by a first event occurring in the communication network, and the resource manager is used for managing the first server; and the resource manager executes an event handling policy related to the first server.
At present, when a network fault occurs in a communication network, the resource manager cannot perceive the fault in time. Only after a service user notices a service fault and reports it to the service manager can the service manager and the network manager jointly investigate its cause. Once the cause is determined to be a network fault, the network manager investigates the cause of the network fault and then takes measures such as network repair. This manual troubleshooting process takes a long time, which easily leads to prolonged service interruption and harms service continuity.
In the technical solution provided by the present application, after determining that a first event (e.g., a network fault) has occurred in the communication network, the network manager sends a first message to the resource manager. According to the first message, the resource manager determines the first server, that is, the server among those accessing the communication network that may be affected by the first event, and executes the event handling policy related to the first server. The resource manager can thus perceive the first event in time and execute the related event handling policy, which prevents the first event from affecting the services carried by the first server and ensures their continuity.
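The two steps performed by the resource manager (determine the first server, then execute a related policy) can be illustrated with a minimal Python sketch. The class name, the message keys `servers` and `event_type`, and the placeholder policy are all assumptions made for illustration; the application does not prescribe any of them.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class ResourceManager:
    """Hypothetical resource manager; every name here is illustrative."""
    managed_servers: Set[str]
    handled: List[Tuple[str, str]] = field(default_factory=list)

    def on_first_message(self, message: dict) -> List[str]:
        # Step 1: determine the first server(s) from the first message,
        # keeping only servers that this resource manager manages.
        affected = [s for s in message.get("servers", []) if s in self.managed_servers]
        # Step 2: execute an event handling policy for each affected server.
        for server in affected:
            self.execute_policy(server, message.get("event_type", "unknown"))
        return affected

    def execute_policy(self, server: str, event_type: str) -> None:
        # Placeholder policy: only record that the server was handled.
        self.handled.append((server, event_type))


rm = ResourceManager(managed_servers={"server-1", "server-2"})
affected = rm.on_first_message(
    {"servers": ["server-1", "server-9"], "event_type": "network_fault"}
)
print(affected)
```

In this sketch, `server-9` is ignored because it is not managed by the resource manager; a real deployment might instead log or forward such entries.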
Optionally, the event processing policy includes at least one of: event marking, service migration, backup service enablement, and alerting.
Optionally, the event handling policy includes event marking, and the resource manager executing the event handling policy related to the first server includes: the resource manager applies an event mark to the first server. By marking the first server, the resource manager can avoid deploying newly issued services on the first server before the first event is cleared in the communication network, which prevents the first event from affecting the operation of those services.
Optionally, the event handling policy includes service migration, and the resource manager executes the first server-related event handling policy, including: the resource manager migrates the first traffic carried by the first server to a second server, the second server being managed by the resource manager, the second server being unaffected by the first event. The resource manager migrates the first service carried by the first server to the second server which is not affected by the first event, so that the first event can be prevented from affecting the operation of the first service.
Optionally, the event handling policy includes backup service enablement, and the resource manager executes the first server-related event handling policy, including: the resource manager enables a backup service of a second service, the second service being carried by the first server, the backup service being carried by a third server managed by the resource manager, the third server being unaffected by the first event. The third server and the second server may be the same server or two different servers. The resource manager enables the backup service of the second service, and can avoid the first event affecting the operation of the second service.
Optionally, the event handling policy includes an alert, and the resource manager executing the event handling policy related to the first server includes: the resource manager issues an alert for the first server. The alert lets operations staff know that the first server may be affected by a first event occurring in the communication network, so that they can manually take handling measures to prevent the first event from affecting the services carried by the first server and to ensure the continuity of those services.
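The four optional policies described above (event marking, service migration, backup service enablement, and alerting) could be dispatched roughly as in the following sketch. The `Inventory` class, its fields, and the choice of the first unaffected server as the migration target are assumptions for illustration only, not the applicant's implementation.

```python
from enum import Enum


class Policy(Enum):
    EVENT_MARKING = "event_marking"
    SERVICE_MIGRATION = "service_migration"
    BACKUP_ENABLEMENT = "backup_enablement"
    ALERT = "alert"


class Inventory:
    """Hypothetical view of what the resource manager knows about servers."""

    def __init__(self, placement, backups):
        self.placement = dict(placement)  # service -> server carrying it
        self.backups = dict(backups)      # service -> server carrying its backup
        self.marked = set()               # servers carrying an event mark
        self.active_backups = set()       # services whose backup is enabled
        self.alerts = []

    def apply(self, policy, first_server, unaffected):
        """Apply one policy; `unaffected` lists servers not hit by the event."""
        if policy is Policy.EVENT_MARKING:
            self.marked.add(first_server)
        elif policy is Policy.SERVICE_MIGRATION:
            if not unaffected:
                raise ValueError("no unaffected server to migrate to")
            for service, server in self.placement.items():
                if server == first_server:
                    # Assumption: migrate to the first unaffected server.
                    self.placement[service] = unaffected[0]
        elif policy is Policy.BACKUP_ENABLEMENT:
            for service, server in self.placement.items():
                if server == first_server and self.backups.get(service) in unaffected:
                    self.active_backups.add(service)
        elif policy is Policy.ALERT:
            self.alerts.append(f"server {first_server} may be affected by a network event")

    def can_schedule_on(self, server):
        # New services are not deployed on servers that carry an event mark.
        return server not in self.marked


inv = Inventory({"svc-a": "server-1", "svc-b": "server-1"}, {"svc-b": "server-3"})
inv.apply(Policy.EVENT_MARKING, "server-1", ["server-2", "server-3"])
inv.apply(Policy.BACKUP_ENABLEMENT, "server-1", ["server-2", "server-3"])
inv.apply(Policy.ALERT, "server-1", ["server-2", "server-3"])
```

Because the policies are independent branches, any combination of them can be applied to the same server, which matches the "at least one of" wording above.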
Optionally, the first message includes indication information of the first server, for example an identifier of the first server or an address of the first server.
Optionally, the first message includes device indication information, which indicates the device in the communication network in which the first event occurred. The device indication information may be an identifier or an address of that device.
Optionally, the first message includes device indication information, and the resource manager determining the first server according to the first message includes: the resource manager determines, according to the device indication information, the device in the communication network in which the first event occurred; and the resource manager determines the first server according to that device.
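One plausible way to go from a faulty device to the first server is a lookup in a topology table that records which servers attach to which network device. The table contents and the function name below are hypothetical; how the resource manager actually obtains this mapping is not specified in this passage.

```python
# Hypothetical topology: network device identifier -> servers attached to it.
TOPOLOGY = {
    "leaf-1": ["server-1", "server-2"],
    "leaf-2": ["server-3"],
}


def servers_for_devices(device_ids, topology=TOPOLOGY):
    """Return the servers attached to any of the given devices, without duplicates."""
    affected = []
    for device_id in device_ids:
        # Unknown devices contribute no servers.
        for server in topology.get(device_id, []):
            if server not in affected:
                affected.append(server)
    return affected


first_servers = servers_for_devices(["leaf-1"])
```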
Optionally, the first message further includes at least one of: event type information, used to indicate the event type of the first event; interface indication information, used to indicate an interface in the communication network that may be affected by the first event; and network card indication information, used to indicate a network card in the first server that may be affected by the first event. Here, an interface in the communication network that may be affected by the first event refers to an interface of a device (e.g., a network device) in the communication network that may be affected by the first event.
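The kinds of information that the first message may carry can be pictured as a small record type. The sketch below uses a Python dataclass with JSON encoding; the field names, the JSON encoding, and the sample values are assumptions, since the application only names the categories of information.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class FirstMessage:
    # All field names are hypothetical; only the categories come from the text.
    server_ids: List[str] = field(default_factory=list)     # indication information of the first server
    device_ids: List[str] = field(default_factory=list)     # devices in which the first event occurred
    event_type: Optional[str] = None                         # event type of the first event
    interface_ids: List[str] = field(default_factory=list)  # interfaces that may be affected
    nic_ids: List[str] = field(default_factory=list)        # server network cards that may be affected

    def to_json(self) -> str:
        # Omit unset optional fields so the payload carries only what was provided.
        payload = {k: v for k, v in asdict(self).items() if v not in (None, [])}
        return json.dumps(payload, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "FirstMessage":
        return cls(**json.loads(text))


msg = FirstMessage(device_ids=["leaf-1"], event_type="interface_fault",
                   interface_ids=["10GE1/0/1"])
encoded = msg.to_json()
decoded = FirstMessage.from_json(encoded)
```

Making every field optional mirrors the text, where the message may carry server indication information, device indication information, or additional details in any combination.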
Optionally, the method further comprises: the resource manager receives a second message sent by the network manager; the resource manager determines from the second message that the first server is not affected by the first event.
Optionally, the method further comprises: the resource manager releases the first server-related event handling policy. The resource manager releases the event handling policy associated with the first server, which may facilitate the deployment of services by the resource manager on the first server.
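The mark-then-release lifecycle driven by the first and second messages might be tracked as follows. The event identifier used to pair the two messages is an assumption introduced for this sketch; the application does not state how a second message is matched to a first message.

```python
class EventState:
    """Hypothetical bookkeeping of which events currently affect which servers."""

    def __init__(self):
        self.marked = {}  # server -> set of event identifiers still affecting it

    def on_first_message(self, event_id, servers):
        for server in servers:
            self.marked.setdefault(server, set()).add(event_id)

    def on_second_message(self, event_id, servers):
        """Clear one event; return servers whose policy can now be released."""
        released = []
        for server in servers:
            if server not in self.marked:
                continue
            self.marked[server].discard(event_id)
            # Release only when no other event still affects this server.
            if not self.marked[server]:
                del self.marked[server]
                released.append(server)
        return released


state = EventState()
state.on_first_message("evt-1", ["server-1"])
state.on_first_message("evt-2", ["server-1"])
still_marked = state.on_second_message("evt-1", ["server-1"])
now_released = state.on_second_message("evt-2", ["server-1"])
```

Keeping a set of events per server is a design choice of this sketch: it avoids releasing a server that is still affected by a second, overlapping event.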
Optionally, the event type of the first event includes one of: a network fault, and a network indicator failing to meet requirements.
Optionally, the network fault includes at least one of: a whole-device fault of a network device, an optical module fault, an interface fault, unreachability between a network device and a designated monitoring point, both network devices in the same multi-chassis link aggregation group (MLAG) being master devices, a network egress fault, and a network security device fault.
Optionally, the network indicator failing to meet requirements includes at least one of: the used resources of a network device exceed a preset resource threshold, the bandwidth utilization of a link between network devices exceeds a preset bandwidth threshold, and no standby link exists between network devices.
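The three indicator conditions listed above amount to simple threshold checks. The following sketch assumes usage and utilization are expressed as fractions between 0 and 1 and uses example threshold values of 0.8 and 0.7; the text only says the thresholds are preset, so these numbers and all names are illustrative.

```python
def indicator_violations(device_usage, link_utilization, standby_links,
                         resource_threshold=0.8, bandwidth_threshold=0.7):
    """Return (kind, subject) tuples for each indicator that fails its requirement.

    device_usage:     device -> fraction of resources in use
    link_utilization: (device_a, device_b) -> fraction of bandwidth in use
    standby_links:    (device_a, device_b) -> number of standby links
    """
    violations = []
    for device, used in sorted(device_usage.items()):
        if used > resource_threshold:
            violations.append(("resource", device))
    for link, utilization in sorted(link_utilization.items()):
        if utilization > bandwidth_threshold:
            violations.append(("bandwidth", link))
        if standby_links.get(link, 0) == 0:
            violations.append(("no_standby", link))
    return violations


found = indicator_violations(
    {"leaf-1": 0.9, "leaf-2": 0.5},
    {("leaf-1", "spine-1"): 0.75, ("leaf-2", "spine-1"): 0.2},
    {("leaf-1", "spine-1"): 1, ("leaf-2", "spine-1"): 0},
)
```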
Optionally, the network manager and the resource manager are two independent devices; alternatively, the network manager and the resource manager are different components in a single device.
Optionally, the network manager interfaces with the resource manager through an application programming interface (application programming interface, API), and the first message and the second message are both API messages.
In a second aspect, there is provided an event processing method, the method comprising: the network manager determining that a first event occurred in the communication network, the network manager being configured to manage the communication network; the network manager sends a first message to the resource manager, the first message being for the resource manager to determine a first server and to execute an event handling policy associated with the first server, the first server being a server of the servers accessing the communication network that may be affected by the first event, the resource manager being for managing the first server.
In the technical solution provided by the present application, after determining that a first event (e.g., a network fault) has occurred in the communication network, the network manager sends a first message to the resource manager. According to the first message, the resource manager determines the first server, that is, the server among those accessing the communication network that may be affected by the first event, and executes the event handling policy related to the first server. The resource manager can thus perceive the first event in time and execute the related event handling policy, which prevents the first event from affecting the services carried by the first server and ensures their continuity.
Optionally, the event processing policy includes at least one of: event marking, service migration, backup service enablement, and alerting.
Optionally, the first message includes indication information of the first server.
Optionally, the method further comprises: the network manager determines the device in the communication network in which the first event occurred; and the network manager determines the first server according to that device.
Optionally, the first message includes device indication information, which indicates the device in the communication network in which the first event occurred.
Optionally, the first message further includes at least one of: event type information, used to indicate the event type of the first event; interface indication information, used to indicate an interface in the communication network that may be affected by the first event; and network card indication information, used to indicate a network card in the first server that may be affected by the first event.
Optionally, the method further comprises: the network manager determines that the communication network releases the first event; the network manager sends a second message to the resource manager, the second message being for the resource manager to determine that the first server is not affected by the first event. The network manager sends a second message to the resource manager after determining that the communication network releases the first event, so that the resource manager can conveniently determine that the first server is not affected by the first event, and further release the event processing policy related to the first server.
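Seen from the network manager, the second aspect reduces to sending a first message when an event is detected and a second message when it is cleared. In the sketch below, the `send` callable stands in for whatever transport connects the two managers, and the message keys (`kind`, `device`, `servers`, `event_type`) are hypothetical.

```python
class NetworkManager:
    """Hypothetical network manager side of the exchange."""

    def __init__(self, topology, send):
        self.topology = topology  # device -> servers attached to it (assumed known)
        self.send = send          # callable that delivers a message to the resource manager

    def on_event_detected(self, device_id, event_type):
        # First message: identifies the device, the event type, and the
        # servers that may be affected.
        message = {
            "kind": "first",
            "device": device_id,
            "event_type": event_type,
            "servers": list(self.topology.get(device_id, [])),
        }
        self.send(message)
        return message

    def on_event_cleared(self, device_id):
        # Second message: lets the resource manager determine that the
        # servers are no longer affected.
        message = {
            "kind": "second",
            "device": device_id,
            "servers": list(self.topology.get(device_id, [])),
        }
        self.send(message)
        return message


outbox = []
nm = NetworkManager({"leaf-1": ["server-1", "server-2"]}, outbox.append)
nm.on_event_detected("leaf-1", "optical_module_fault")
nm.on_event_cleared("leaf-1")
```

Here the network manager resolves the affected servers itself; the text also allows it to send only device indication information and leave that resolution to the resource manager.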
Optionally, the event type of the first event includes one of: a network fault, and a network indicator failing to meet requirements.
Optionally, the network fault includes at least one of: a whole-device fault of a network device, an optical module fault, an interface fault, unreachability between a network device and a designated monitoring point, both network devices in the same MLAG being master devices, a network egress fault, and a network security device fault.
Optionally, the network indicator failing to meet requirements includes at least one of: the used resources of a network device exceed a preset resource threshold, the bandwidth utilization of a link between network devices exceeds a preset bandwidth threshold, and no standby link exists between network devices.
Optionally, the network manager and the resource manager are two independent devices; alternatively, the network manager and the resource manager are different components in a single device.
Optionally, the network manager interfaces with the resource manager through an API, and the first message and the second message are both API messages.
In a third aspect, there is provided an event processing apparatus for use in a resource manager, the event processing apparatus comprising modules for performing the method provided in the first aspect or any of the alternatives of the first aspect.
Optionally, the event processing device includes:
the receiving module is used for receiving a first message sent by a network manager, and the network manager is used for managing a communication network;
and the processing module is used for determining a first server according to the first message and executing an event processing strategy related to the first server, wherein the first server is a server which is possibly influenced by a first event occurring in the communication network in a server accessed to the communication network, and the resource manager is used for managing the first server.
Optionally, the event processing policy includes at least one of: event marking, service migration, backup service enablement, and alerting.
Optionally, the event processing policy includes event marking, and the processing module is configured to mark the event on the first server.
Optionally, the event processing policy includes service migration, and the processing module is configured to migrate a first service carried by the first server to a second server, where the second server is managed by the resource manager, and the second server is not affected by the first event.
Optionally, the event processing policy includes enabling a backup service, and the processing module is configured to enable a backup service of a second service, where the second service is carried by the first server, and the backup service is carried by a third server managed by the resource manager, and the third server is not affected by the first event.
Optionally, the first message includes indication information of the first server.
Optionally, the first message includes device indication information, where the device indication information is used to indicate a device in the communication network where the first event occurs.
Optionally, the processing module is configured to: determining equipment in the communication network, in which the first event occurs, according to the equipment indication information; and determining the first server according to the equipment which generates the first event in the communication network.
Optionally, the first message further includes at least one of:
event type information for indicating an event type of the first event;
interface indication information for indicating an interface in the communication network that may be affected by the first event;
the network card indication information is used for indicating the network card which may be affected by the first event in the first server.
Optionally, the receiving module is further configured to receive a second message sent by the network manager;
the processing module is further configured to determine that the first server is not affected by the first event according to the second message.
Optionally, the processing module is further configured to release the event processing policy related to the first server.
Optionally, the event type of the first event includes one of: a network fault, and a network indicator failing to meet requirements.
Optionally, the network fault includes at least one of: a whole-device fault of a network device, an optical module fault, an interface fault, unreachability between a network device and a designated monitoring point, both network devices in the same MLAG being master devices, a network egress fault, and a network security device fault.
Optionally, the network indicator failing to meet requirements includes at least one of: the used resources of a network device exceed a preset resource threshold, the bandwidth utilization of a link between network devices exceeds a preset bandwidth threshold, and no standby link exists between network devices.
Optionally, the network manager and the resource manager are two independent devices; alternatively, the network manager and the resource manager are different components in a device.
Optionally, the network manager interfaces with the resource manager through an API, and the first message and the second message are both API messages.
In a fourth aspect, there is provided an event processing apparatus for use in a network manager, the event processing apparatus comprising respective modules for performing the method as provided in the second aspect or any of the alternatives of the second aspect.
Optionally, the event processing device includes:
a processing module for determining that a first event occurs in a communication network, the network manager for managing the communication network;
and the sending module is used for sending a first message to a resource manager, wherein the first message is used for determining a first server and executing an event processing strategy related to the first server, the first server is a server possibly affected by the first event in servers accessed to the communication network, and the resource manager is used for managing the first server.
Optionally, the event processing policy includes at least one of: event marking, service migration, backup service enablement, and alerting.
Optionally, the first message includes indication information of the first server.
Optionally, the processing module is further configured to: determine the device in the communication network in which the first event occurred; and determine the first server according to the device in which the first event occurred.
Optionally, the first message includes device indication information, where the device indication information is used to indicate a device in the communication network where the first event occurs.
Optionally, the first message further includes at least one of:
event type information for indicating an event type of the first event;
interface indication information for indicating an interface in the communication network that may be affected by the first event;
the network card indication information is used for indicating the network card which may be affected by the first event in the first server.
Optionally, the processing module is further configured to determine that the communication network releases the first event;
the sending module is further configured to send a second message to the resource manager, where the second message is used by the resource manager to determine that the first server is not affected by the first event.
Optionally, the event type of the first event includes one of: a network fault, and a network indicator failing to meet requirements.
Optionally, the network fault includes at least one of: a whole-device fault of a network device, an optical module fault, an interface fault, unreachability between a network device and a designated monitoring point, both network devices in the same MLAG being master devices, a network egress fault, and a network security device fault.
Optionally, the network indicator failing to meet requirements includes at least one of: the used resources of a network device exceed a preset resource threshold, the bandwidth utilization of a link between network devices exceeds a preset bandwidth threshold, and no standby link exists between network devices.
Optionally, the network manager and the resource manager are two independent devices; alternatively, the network manager and the resource manager are different components in a device.
Optionally, the network manager interfaces with the resource manager through an API, and the first message and the second message are both API messages.
The modules in the third and fourth aspects described above may be implemented based on software, hardware, or a combination of software and hardware, and the modules may be arbitrarily combined or divided based on specific implementations.
In a fifth aspect, an event processing apparatus is provided, applied to a resource manager, the event processing apparatus including a memory and a processor; the memory is used for storing a computer program; the processor is configured to execute a computer program stored in the memory to cause the event processing device to perform the event processing method as provided in the first aspect or any of the alternatives of the first aspect.
In a sixth aspect, an event processing apparatus is provided and applied to a network manager, where the event processing apparatus includes a memory and a processor; the memory is used for storing a computer program; the processor is configured to execute a computer program stored in the memory to cause the event processing device to perform an event processing method as provided in the second aspect or any of the alternatives of the second aspect.
In a seventh aspect, there is provided an event processing system comprising a resource manager comprising an event processing device as provided in the third or fifth aspect above and a network manager comprising an event processing device as provided in the fourth or sixth aspect above.
Optionally, the network manager and the resource manager are two independent devices; alternatively, the network manager and the resource manager are different components in a device.
In an eighth aspect, there is provided a computer readable storage medium having stored therein a computer program which when executed implements the event processing method as provided in the first aspect or any of the alternatives of the first aspect, or implements the event processing method as provided in the second aspect or any of the alternatives of the second aspect.
A ninth aspect provides a computer program product comprising a program or code which when executed implements an event processing method as provided in the first aspect or any of the alternatives of the first aspect, or implements an event processing method as provided in the second aspect or any of the alternatives of the second aspect.
In a tenth aspect, there is provided a chip comprising programmable logic circuitry and/or program instructions, the chip being operable to implement the event processing method as provided in the first aspect or any of the alternatives of the first aspect, or to implement the event processing method as provided in the second aspect or any of the alternatives of the second aspect.
The technical solution provided by the present application brings the following beneficial effects:
After determining that a first event has occurred in the communication network, the network manager sends a first message to the resource manager. According to the first message, the resource manager determines the first server, among the servers accessing the communication network, that may be affected by the first event, and executes an event handling policy related to the first server. This prevents the first event from affecting the services carried by the first server, avoids interruption of those services, and ensures their continuity.
Drawings
FIG. 1 is a schematic diagram of an event processing system provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of another event processing system provided by an embodiment of the present application;
FIG. 3 is a flow chart of a method for event handling provided in an embodiment of the present application;
FIG. 4 is a flow chart of another event processing method provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of an event processing device according to an embodiment of the present application;
FIG. 6 is a schematic diagram of another event processing device according to an embodiment of the present application;
FIG. 7 is a schematic diagram of still another event processing apparatus according to an embodiment of the present application.
Detailed Description
Embodiments of the present application will be further described below with reference to the accompanying drawings. First, an application scenario of the present application is described.
The application scenario of the present application provides an event processing system, which includes a communication network, a network manager, a resource manager, and a server accessing the communication network. The communication network is used for providing business forwarding service for a server accessed to the communication network. The network manager is for managing the communication network and the resource manager is for managing a server accessing the communication network.
The communication network may be a data center network (DCN), a metropolitan area network, a wide area network, or a campus network. The communication network may be a software-defined network (SDN), and may be a two-level network (also referred to as a two-tier network) or a three-level network (also referred to as a three-tier network). The communication network comprises a plurality of network devices, which may be switches, routers, virtual switches, virtual routers, and the like. The network devices are used for traffic forwarding and are also called forwarding devices. The network devices in the communication network may all be of the same type, for example all switches; alternatively, the communication network may comprise different types of network devices, for example some routers and some switches. The communication network may also include security devices, such as firewalls, to secure the communication network.
Wherein each server accessing the communication network may be a server, or a server cluster composed of several servers, or a cloud computing service center, and the servers accessing the communication network may include a computing server and a storage server. The computing server is used for providing service computing functions. The storage server is used for providing business storage services. For example, at least one Virtual Machine (VM) is deployed in a computing server, the computing server provides a service computing function through the VM deployed therein, and the storage server may provide a service storage service for the VM. In some embodiments, the server is also referred to as a site, workstation, host, etc., which embodiments of the present application do not limit.
In an embodiment of the present application, the network devices in the communication network include access devices, and a server accesses the communication network through an access device. In one embodiment, the communication network is a two-tier network that includes an access layer and an aggregation layer. The access layer provides a service access function, and the aggregation layer provides a service aggregation function. The access devices are located in the access layer, and the network devices further include aggregation devices located in the aggregation layer, which are connected to the access devices. In another embodiment, the communication network is a three-tier network that includes an access layer, an aggregation layer, and a core layer. The access layer provides a service access function, the aggregation layer provides a service aggregation function, and the core layer further aggregates the services aggregated by the aggregation layer. The access devices are located in the access layer, and the network devices further include aggregation devices located in the aggregation layer and core devices located in the core layer, with the aggregation devices connected to both the access devices and the core devices. For example, when the network devices in the communication network are all switches, the access device is an access switch, the aggregation device is an aggregation switch, and the core device is a core switch. Access switches are also known as leaf switches, and aggregation switches are also known as spine switches.
In an embodiment of the present application, the network manager is connected to the communication network to manage it. The resource manager is connected to the servers that access the communication network, to manage those servers and to schedule resources among them. The resource manager is also connected to the network manager, and the two cooperate to handle network events (for example, network failures) occurring in the communication network, so that the network events do not affect the services carried by the servers that access the communication network. In some embodiments, the network manager is also referred to as a network analyzer, a network controller, a network management system, or the like, and the resource manager is also referred to as a resource management system, a computing resource manager, a computing resource management system, virtualization resource management (VRM), or the like. In addition, the network manager and the resource manager may be two independent devices or different components of the same device. For example, the network manager and the resource manager are two independent management servers, or they are different components of one management server. The network manager interfaces with the resource manager through an application programming interface (API). The APIs are northbound open interfaces of the resource manager. The resource manager typically provides one or more APIs, and the network manager passes information to the resource manager by calling these APIs. The network manager and the resource manager may also interface in other manners, which is not limited in this application.
As an example, refer to fig. 1, which is a schematic diagram of an event processing system provided in an embodiment of the present application. The event processing system includes a communication network 01, a network manager 02, a resource manager 03, and servers 041 to 043 that access the communication network 01. The servers 041 to 043 carry services; for example, VMs are deployed on the servers 041 to 043, and the services are carried by these VMs. The communication network 01 provides a traffic forwarding service for the servers 041 to 043. The network manager 02 is connected to the communication network 01 to manage the communication network 01. The resource manager 03 is connected to the servers 041 to 043 to manage them. The network manager 02 is further connected to the resource manager 03, and the two cooperate to handle network events (such as network failures) occurring in the communication network 01, so that the network events do not affect the services carried by the servers 041 to 043. As shown in fig. 1, the communication network 01 includes network devices 011 to 016, of which the network devices 011 to 014 are access devices. The server 041 is dual-homed to the network devices 011 and 012 and accesses the communication network 01 through them; the server 042 is dual-homed to the network devices 012 and 013 and accesses the communication network 01 through them; the server 043 is dual-homed to the network devices 013 and 014 and accesses the communication network 01 through them. By way of example, the communication network 01 shown in fig. 1 is a two-tier network that includes an access layer and an aggregation layer, with the network devices 011 to 014 located in the access layer and the network devices 015 and 016 located in the aggregation layer. In a specific example, the communication network 01 is a spine-leaf topology network, the network devices 011 to 014 are leaf switches, and the network devices 015 and 016 are spine switches. Each spine switch is connected to all leaf switches, and each leaf switch is connected to all spine switches (that is, the spine switches and the leaf switches are fully interconnected).
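The fig. 1 topology described above can be written down as a small data model. The following is only a sketch for illustration: the dictionary layout and the function names are assumptions and do not come from the application.

```python
# Hypothetical in-memory model of the fig. 1 topology (layout is an assumption).

# Each server is dual-homed to two access (leaf) devices.
SERVER_UPLINKS = {
    "041": ["011", "012"],
    "042": ["012", "013"],
    "043": ["013", "014"],
}

LEAVES = ["011", "012", "013", "014"]   # access layer
SPINES = ["015", "016"]                 # aggregation layer

# Full interconnection: every leaf has a link to every spine.
LEAF_SPINE_LINKS = {(leaf, spine) for leaf in LEAVES for spine in SPINES}


def servers_attached_to(device_id):
    """Return the servers that access the network through the given access device."""
    return sorted(s for s, leaves in SERVER_UPLINKS.items() if device_id in leaves)


def is_fully_interconnected(links):
    """Check that each leaf is linked to all spines (and so each spine to all leaves)."""
    return all((leaf, spine) in links for leaf in LEAVES for spine in SPINES)
```

With this model, `servers_attached_to("012")` lists servers 041 and 042, which is the kind of lookup needed later when deciding which servers a device-level event may affect.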
Fig. 1 uses a two-tier communication network 01 as an example. As another example, refer to fig. 2, which is a schematic diagram of another event processing system provided in an embodiment of the present application. In fig. 2, the communication network 01 is a three-tier network: in addition to the structure shown in fig. 1, it further includes a core layer and a network device 017 (that is, a core device) located in the core layer, and the network device 017 is connected to each of the network devices 015 and 016. For the other structures in the event processing system shown in fig. 2, refer to the description of fig. 1; details are not repeated here.
In the event processing systems shown in fig. 1 and fig. 2, the connection between the network manager 02 and the communication network 01 means that the network manager 02 is connected to the network devices in the communication network 01; for example, the network manager 02 is connected to each of the network devices 011 to 016. For brevity, fig. 1 and fig. 2 use a single connection line between the network manager 02 and the communication network 01 to represent these connections. In addition, the event processing systems shown in fig. 1 and fig. 2 are merely examples and do not limit the technical solutions of the embodiments of the present application. The event processing system may further include other devices (for example, the communication network 01 may further include a security device); the number of network devices, the number of servers, and the connection relationships among network devices and between network devices and servers may be configured as needed; and the communication network may use other topologies. For example, the spine switches and the leaf switches may not be fully interconnected, network devices in the aggregation layer may be interconnected with one another, or the core layer may include a plurality of core devices. Details are not described here.
The communication network provides a traffic forwarding service for the servers that access it, so the services carried by the servers are easily affected when the communication network fails. At present, when a network failure occurs in the communication network, the resource manager cannot detect it in time. Only after a service user notices a service failure and reports it to the service manager can the service manager and the network manager jointly investigate the cause of the service failure; once the cause is determined to be a network failure, the cause of the network failure is investigated, and measures such as network repair are then taken. This manual troubleshooting process takes a long time, which easily causes prolonged service interruption and harms service continuity. In the embodiments of the present application, when a network event (such as a network failure) occurs in the communication network, the resource manager and the network manager cooperate to handle it, so that the network event does not affect the services carried by the servers that access the communication network. Specifically, after the network manager determines that a first event has occurred in the communication network, it sends a first message to the resource manager. Based on the first message, the resource manager determines a first server, that is, a server that may be affected by the first event among the servers that access the communication network, and executes an event processing policy related to the first server. In this way, the resource manager learns of the first event in time and executes the related event processing policy, which prevents the first event from affecting the services carried by the first server, avoids prolonged interruption of those services, and ensures their continuity.
The above is an introduction to the application scenario of the present application, and the following describes an embodiment of the event processing method of the present application.
Referring to fig. 3, a flowchart of an event processing method according to an embodiment of the present application is shown. The event processing method is applied to an event processing system comprising a network manager and a resource manager. For example, the event processing system is the event processing system shown in fig. 1 or fig. 2. Referring to fig. 3, the event processing method includes the following steps S301 to S305.
S301, the network manager determines that a first event occurs in the communication network.
The network manager is configured to manage the communication network. The first event includes at least one network event occurring in the communication network, and the event type of the first event includes one of the following: a network failure, or a network indicator failing to meet a requirement. For example, the first event is that a network failure occurs on the network device 011; or the first event is that a network indicator of the network device 012 fails to meet a requirement; or the first event includes both a network failure on the network device 011 and a network indicator of the network device 012 failing to meet a requirement.
In an embodiment of the present application, the network failure includes at least one of the following: a whole-device failure of a network device, an optical module failure, an interface failure, unreachability between a network device and a designated monitoring point, both network devices in the same MLAG being master devices, a network egress failure, and a network security device failure. In other embodiments, the network failure may be of another type, for example, a power failure of a network device, which is not limited in the embodiments of the present application.
A whole-device failure of a network device means that the network device cannot work normally. For any network device, a whole-device failure includes at least one of the following: a whole-device failure caused by power loss of the network device (for example, a failure of the power module causes the network device to lose power), a whole-device failure caused by a failure of the processing chip of the network device, and a whole-device failure caused by an optical module failure of the network device. Other causes may also lead to a whole-device failure. It should be noted that a network device generally includes a fault handling module, and a whole-device failure does not extend to the fault handling module. That is, after a whole-device failure, the fault handling module in the network device can generally still work, for example, to report a fault message to the network manager, which is not limited in the embodiments of the present application.
An optical module failure means that the optical module cannot work normally. For any optical module, an optical module failure includes at least one of the following: a failure caused by damage to the optical module, a failure caused by the optical power of the optical module being too high, and a failure caused by the optical power of the optical module being too low. An optical module may also fail for other reasons. It should be noted that a network device includes one or more optical modules. Failure of all optical modules of a network device may cause a whole-device failure of that network device, but a whole-device failure is not necessarily caused by failure of all of its optical modules.
An interface failure means that the interface cannot work normally. For any interface, an interface failure includes at least one of the following: a failure caused by damage to the interface, a failure caused by a fault in the interface circuit, and the interface going DOWN (for example, the optical fiber plugged into the interface falls out). An interface may also fail for other reasons. The interfaces described here may be physical or logical. Furthermore, an optical module typically includes one or more interfaces. Failure of all interfaces of an optical module may cause the optical module to fail, but an optical module failure is not necessarily caused by failure of all of its interfaces.
Unreachability between a network device and a designated monitoring point includes at least one of the following: unreachability caused by a failure of the link between the network device and the designated monitoring point, unreachability caused by a failure of the network device, and unreachability caused by a failure of the designated monitoring point. In general, for a network device that carries important services, a designated monitoring point may be set up to monitor the network device. If the network device and the designated monitoring point cannot reach each other, the designated monitoring point cannot monitor the network device, so the present application regards this unreachability as a network failure. The designated monitoring point is located either in the same communication network as the network device it monitors or outside that communication network. For example, as shown in fig. 1 or fig. 2, a designated monitoring point (not shown in fig. 1 and fig. 2) for monitoring the network device 011 may be located inside or outside the communication network 01.
A multichassis link aggregation group (MLAG) is a networking mode that implements cross-device link aggregation. An MLAG generally includes two network devices that are the dual-homing access devices of the same device (for example, a server). The two network devices generally consist of a master device and a backup device. The master device normally provides the access service for the server and exchanges packets with the server; after the master device fails, the backup device provides the access service for the server and exchanges packets with the server. The master and backup roles of the two network devices in the same MLAG are determined through negotiation between the two devices. If both network devices are master devices, the negotiation has failed and a split-brain condition has occurred in the MLAG: both network devices provide the access service for the server and exchange packets with the server at the same time, which may cause problems in how the two network devices forward the server's traffic. Therefore, the present application regards both network devices in the same MLAG being master devices as a network failure. For example, as shown in fig. 1 or fig. 2, the server 041 is dual-homed to the network device 011 and the network device 012, which belong to the same MLAG. When both the network device 011 and the network device 012 are master devices (that is, master access devices) of the server 041, a split-brain condition occurs in the MLAG, and a network failure occurs in the communication network 01.
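The split-brain condition described above reduces to a simple check on the negotiated roles. The sketch below is illustrative only: the role labels "master" and "backup" and the function name are assumptions, and a real device would expose its MLAG role through its own management interface.

```python
def detect_split_brain(mlag_roles):
    """Return True if both devices of one MLAG pair report the master role.

    mlag_roles maps a device identifier to its negotiated role,
    "master" or "backup" (hypothetical labels used for illustration).
    """
    if len(mlag_roles) != 2:
        raise ValueError("an MLAG pair is expected to contain exactly two devices")
    return all(role == "master" for role in mlag_roles.values())


# Healthy pair: one master, one backup.
healthy = {"011": "master", "012": "backup"}
# Failed negotiation: both devices claim the master role.
split = {"011": "master", "012": "master"}
```

Under these assumptions, `detect_split_brain(split)` flags the pair formed by the network devices 011 and 012 as a network failure, while `detect_split_brain(healthy)` does not.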
The communication network includes an egress device (also called an egress network device). A network egress failure of the communication network includes, but is not limited to, a failure of the egress device, a failure of an outbound interface of the egress device, and a failure of the optical module in which an outbound interface of the egress device is located. The network egress is where network traffic leaves the communication network, and an outbound interface of the egress device is an interface on the egress device through which network traffic flows out of the communication network. For example, as shown in fig. 2, the network device 017 may be an egress device of the communication network 01, and a network egress failure of the communication network 01 includes, but is not limited to, a failure of the network device 017, a failure of an outbound interface of the network device 017, and a failure of the optical module in which an outbound interface of the network device 017 is located.
A communication network generally includes a network security device to protect the communication network. If the network security device fails, the security of the communication network may be reduced, so the present application regards a network security device failure as a network failure. The network security device may be a security device such as a firewall. For example, a network security device failure includes power loss of the network security device, loss of the security function of the network security device, and the like.
In the embodiments of the present application, a network indicator failing to meet a requirement includes at least one of the following: the resources used by a network device exceed a preset resource threshold, the bandwidth utilization of a link between network devices exceeds a preset bandwidth threshold, and there is no backup link between network devices. In other embodiments, other network indicators of the communication network may fail to meet their requirements; for example, the transmission rate of one network device, the transmission delay of another network device, or the packet loss rate of yet another network device may fail to meet the requirement. This is not limited in the embodiments of the present application.
The resources used by a network device exceeding the preset resource threshold includes at least one of the following: the size of the forwarding table of the network device exceeds a preset size (or the data volume of the forwarding table exceeds a preset data volume), and the number of Layer 2 sub-interfaces of the network device exceeds a preset number. The Layer 2 sub-interfaces of a network device are obtained by dividing its physical interfaces. In general, each physical interface may be divided into a plurality of logical interfaces, where the number of logical interfaces divided from each physical interface cannot exceed a first preset number, and the total number of logical interfaces of each network device (that is, the sum of the logical interfaces divided from all of its physical interfaces) cannot exceed a second preset number. For any network device, the number of Layer 2 sub-interfaces exceeding the preset number includes at least one of the following: the number of logical interfaces divided from any physical interface of the network device exceeds the first preset number, and the total number of logical interfaces of the network device exceeds the second preset number.
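The two resource conditions above, including the per-interface and per-device limits on sub-interfaces, can be sketched as follows. The class names, field names, and the numeric limits used in the example are assumptions chosen for illustration, not values from the application.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceLimits:
    max_forwarding_table_size: int   # the "preset size" of the forwarding table
    max_subifs_per_interface: int    # the "first preset number"
    max_subifs_per_device: int       # the "second preset number"


@dataclass
class DeviceUsage:
    forwarding_table_size: int
    # physical interface name -> number of Layer 2 sub-interfaces divided from it
    subifs_by_interface: dict = field(default_factory=dict)


def exceeded_resources(usage, limits):
    """Return the list of resource conditions that the device violates."""
    reasons = []
    if usage.forwarding_table_size > limits.max_forwarding_table_size:
        reasons.append("forwarding table size exceeds preset size")
    per_interface = any(
        n > limits.max_subifs_per_interface for n in usage.subifs_by_interface.values()
    )
    per_device = sum(usage.subifs_by_interface.values()) > limits.max_subifs_per_device
    # Either condition alone means the sub-interface count exceeds the preset number.
    if per_interface or per_device:
        reasons.append("number of Layer 2 sub-interfaces exceeds preset number")
    return reasons


# Illustrative limits only.
limits = ResourceLimits(max_forwarding_table_size=500_000,
                        max_subifs_per_interface=4,
                        max_subifs_per_device=10)
```

Note that a device can violate the device-wide limit even when every physical interface stays within its own limit, which is why the two sub-interface checks are evaluated separately.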
The bandwidth utilization of a link between network devices exceeding the preset bandwidth threshold includes at least one of the following: the bandwidth utilization of a link between an access device and an aggregation device exceeds a first preset bandwidth threshold, the bandwidth utilization of a link between an aggregation device and a core device exceeds a second preset bandwidth threshold, and the bandwidth utilization of a link between different network devices in the same layer (for example, the aggregation layer) exceeds a third preset bandwidth threshold. Other cases are also possible and are not limited here. The first, second, and third preset bandwidth thresholds may be the same or different.
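As a rough illustration of how the three thresholds might be selected by link position, consider the sketch below. The layer labels, the default threshold values, and the function names are all assumptions; the application does not specify concrete thresholds.

```python
def threshold_for(layer_a, layer_b, first=0.8, second=0.8, third=0.7):
    """Pick the preset bandwidth threshold that applies to a link.

    first:  access <-> aggregation links
    second: aggregation <-> core links
    third:  links between devices in the same layer
    The default values are illustrative assumptions.
    """
    if layer_a == layer_b:
        return third
    pair = {layer_a, layer_b}
    if pair == {"access", "aggregation"}:
        return first
    if pair == {"aggregation", "core"}:
        return second
    raise ValueError(f"no threshold defined for link {layer_a} <-> {layer_b}")


def link_exceeds_threshold(layer_a, layer_b, used_bps, capacity_bps, **thresholds):
    """Return True if the link's bandwidth utilization exceeds its threshold."""
    utilization = used_bps / capacity_bps
    return utilization > threshold_for(layer_a, layer_b, **thresholds)
```

For instance, with the assumed defaults, a leaf-to-spine link carrying 9 Gbit/s on a 10 Gbit/s link would be reported, while an aggregation-to-core link at 50 percent utilization would not.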
There being no backup link between network devices includes at least one of the following: there is no backup link between an access device and an aggregation device, and there is no backup link between an aggregation device and a core device. In general, to ensure reliable communication between the access layer and the aggregation layer, at least two links are provided between each access device and the aggregation layer; to ensure reliable communication between the aggregation layer and the core layer, at least two links are provided between each aggregation device and the core layer. When there is no backup link between an access device and the aggregation layer, there may be only one link, or no link at all, between the access device and the aggregation layer, so the access device cannot meet the link indicator requirement. Similarly, when there is no backup link between an aggregation device and the core layer, there may be only one link, or no link at all, between the aggregation device and the core layer, so the aggregation device cannot meet the link indicator requirement.
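Since the requirement amounts to "at least two links toward the upper layer", the check can be sketched in a few lines. The input format (a mapping from device identifier to its upper-layer peers) is an assumption made for illustration.

```python
def devices_without_backup_link(uplinks):
    """Return the devices that have fewer than two links toward the upper layer.

    uplinks maps a device identifier to the list of upper-layer devices it is
    linked to (a hypothetical representation of the collected link state).
    """
    return sorted(dev for dev, peers in uplinks.items() if len(peers) < 2)


# Illustrative link state loosely based on fig. 1: leaf 013 has lost one uplink,
# and leaf 014 has lost both.
uplinks = {
    "011": ["015", "016"],
    "012": ["015", "016"],
    "013": ["015"],
    "014": [],
}
```

Here `devices_without_backup_link(uplinks)` reports the devices 013 and 014, covering both the "only one link" and the "no link" cases mentioned above.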
In the embodiments of the present application, when a network event occurs in the communication network, the communication network sends an event notification message to the network manager, and the network manager determines from the event notification message that the first event has occurred in the communication network. Alternatively, the network manager collects event information of the communication network in real time and determines from the collected information that the first event has occurred. The embodiments of the present application are described using the example in which the communication network sends an event notification message to the network manager. It can be understood that this specifically means that a device in the communication network (for example, a network device) sends the event notification message to the network manager. A device in the communication network may send the event notification message to the network manager through the border gateway protocol (BGP), the network configuration protocol (NETCONF), the path computation element communication protocol (PCEP), a telemetry protocol, or a proprietary protocol. The event notification message sent by a device may be a log message of that device, and may include at least one of the following: device indication information, event type information, and event details. The device indication information indicates the device on which the network event occurred, and may be an identifier (ID) of that device, an address of that device, or the like. The event type information indicates the event type of the network event. The event details include the specific content of the network event, for example, the cause of the network event and the time at which it occurred. The event notification message may also include other content, which is not limited in the embodiments of the present application.
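The three fields of the event notification message can be pictured as a small record. The sketch below is an assumption-laden illustration: the class name, field names, and JSON encoding are hypothetical, and the application does not prescribe any particular message encoding.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class EventNotification:
    """Hypothetical shape of an event notification message.

    All fields are optional because the message may include
    "at least one of" the three kinds of information.
    """
    device_indication: Optional[str] = None   # ID or address of the device
    event_type: Optional[str] = None          # event type of the network event
    event_details: Optional[str] = None       # cause, time of occurrence, etc.

    def to_json(self):
        # Omit absent fields so the message carries only what the device reported.
        present = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(present, sort_keys=True)


msg = EventNotification(device_indication="011", event_type="Optical module failure")
```

Serializing `msg` yields a JSON object with only the two populated fields; a device that also knows the cause would fill in `event_details`.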
In one example, referring to fig. 1 or 2, assuming that optical module 1 in network device 011 fails, an event notification message sent by network device 011 to a network manager may include the contents as shown in table 1 below.
TABLE 1
Device indication information | Event type information | Event details
011 | Optical module failure | The optical power of the optical module 1 is too high
Referring to table 1, the network manager may determine that the network device 011 has an optical module failure according to an event notification message sent by the network device 011, and determine that the cause of the failure (i.e., event details) is that the optical power of the optical module 1 in the network device 011 is too high. Thus, the network manager determines that the first event occurring in the communication network is: optical module 1 of network device 011 fails.
In another example, referring to fig. 1 or 2, assuming that the size of the forwarding table of the network device 012 exceeds 500K (preset size), the event notification message transmitted by the network device 012 to the network manager may include contents as shown in table 2 below.
TABLE 2
Device indication information | Event type information | Event details
012 | The size of the forwarding table exceeds a preset size | The size of forwarding table 1 exceeds 500K
Referring to table 2, the network manager may determine, from the event notification message sent by the network device 012, that the network event occurring on the network device 012 is that the size of the forwarding table exceeds a preset size, and that the event details are that the size of the forwarding table 1 of the network device 012 exceeds 500K. Thus, the network manager determines that the first event occurring in the communication network is: the size of the forwarding table 1 of the network device 012 exceeds a preset size.
In still another example, referring to fig. 1 or fig. 2, assuming that the optical module 1 in the network device 011 fails, the size of the forwarding table of the network device 012 exceeds 500K (preset size), the event notification message transmitted by the network device 011 to the network manager includes the contents as shown in table 1 above, and the event notification message transmitted by the network device 012 to the network manager includes the contents as shown in table 2 above. The network manager determines, according to the event notification message sent by the network device 011 and the event notification message sent by the network device 012, that the first event occurring in the communication network is: optical module 1 of network device 011 fails and the size of forwarding table 1 of network device 012 exceeds a preset size.
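The way the network manager combines several event notification messages into one first event can be sketched as follows. The dictionary keys and the duplicate-suppression step are assumptions added for illustration; the application only states that the first event includes at least one network event.

```python
# Contents of Table 1 and Table 2, keyed with hypothetical field names.
TABLE_1 = {
    "device": "011",
    "event_type": "Optical module failure",
    "details": "The optical power of the optical module 1 is too high",
}
TABLE_2 = {
    "device": "012",
    "event_type": "The size of the forwarding table exceeds a preset size",
    "details": "The size of forwarding table 1 exceeds 500K",
}


def determine_first_event(notifications):
    """Combine received notifications into a first event (a list of network events).

    Repeated reports of the same (device, event type) pair are collapsed;
    this de-duplication is an assumption, not part of the described method.
    """
    first_event, seen = [], set()
    for note in notifications:
        key = (note["device"], note["event_type"])
        if key not in seen:
            seen.add(key)
            first_event.append(note)
    return first_event


first_event = determine_first_event([TABLE_1, TABLE_2, TABLE_1])
```

In this sketch the resulting first event contains two network events, one for the network device 011 and one for the network device 012, matching the third example above.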
S302, the network manager sends a first message to the resource manager.
After the network manager determines that the communication network has a first event, the network manager sends a first message to the resource manager, the first message being used by the resource manager to determine a first server, the first server being a server that may be affected by the first event among servers that access the communication network. Optionally, the network manager interfaces with the resource manager through an API, the first message is an API message, and the network manager invokes the first API of the resource manager to send the first message to the resource manager. In other embodiments, the first message may also be another message, where the first message may also be used by the resource manager to execute an event handling policy related to the first server, which is not limited in this embodiment of the application.
In the embodiments of the present application, the first message has the following two possible implementations.
Implementation 1: the first message includes device indication information, which indicates the device in the communication network on which the first event occurred and is used by the resource manager to determine the first server. The device indication information may be an identifier, an address, or the like of the device in the communication network on which the first event occurred.
Referring to fig. 1 or fig. 2, in one example, the first event is a failure of optical module 1 of network device 011, and the first message includes indication information "011" of network device 011. In another example, the first event is that the size of the forwarding table 1 of the network device 012 exceeds a preset size, and the first message includes indication information "012" of the network device 012. In yet another example, the first event is that the optical module 1 of the network device 011 fails and the size of the forwarding table 1 of the network device 012 exceeds a preset size, and the first message includes the indication information "011" of the network device 011 and the indication information "012" of the network device 012.
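For the first implementation, building the first message amounts to collecting the identifiers of the devices involved in the first event. The following sketch assumes a dictionary-based message and a hypothetical field name `device_indication`; the actual API message format of the resource manager is not specified in the application.

```python
def build_first_message_with_devices(first_event):
    """Build a first message that carries device indication information.

    first_event is a list of network events, each a dict with a "device" key
    (a hypothetical representation). Each device is listed once.
    """
    devices = sorted({event["device"] for event in first_event})
    return {"device_indication": devices}


# The three examples above, expressed with the assumed event representation.
example_1 = [{"device": "011", "event_type": "Optical module failure"}]
example_2 = [{"device": "012", "event_type": "Forwarding table too large"}]
example_3 = example_1 + example_2
```

With these inputs the message carries "011", "012", or both, respectively, and the resource manager is left to map those devices to servers.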
Implementation 2: the first message includes indication information of the first server, which is used by the resource manager to determine the first server. The indication information of the first server may be an identifier of the first server, an address of the first server, or the like. In this implementation, the network manager determines the first server before S302. In an optional embodiment, the network manager determines the device in the communication network on which the first event occurred, and determines the first server based on that device. In a specific embodiment, the network manager determines the first server from among the servers that access the communication network according to the device on which the first event occurred, the network topology of the communication network, and the servers attached to each access device in the communication network. The network manager may obtain the network topology of the communication network through an interior gateway protocol (IGP) and determine the servers attached to each access device in the communication network.
Referring to fig. 1 or fig. 2, in one example, the first event is that the optical module 1 of the network device 011 fails, and the optical module 1 of the network device 011 is connected to the network card 1 of the server 041. According to the first event, the network topology of the communication network 01, and the server attached under the optical module 1 of the network device 011, the network manager determines that the first server (the server possibly affected by the first event among the servers accessing the communication network 01) includes the server 041. In this example, the first message includes the indication information "041" of the server 041. In another example, the first event is that the size of the forwarding table 1 of the network device 012 exceeds a preset size. According to the first event, the network topology of the communication network 01, and the servers attached under the network device 012, the network manager determines that the first server includes the server 041 and the server 042. In this example, the first message includes the indication information "041" of the server 041 and the indication information "042" of the server 042.
In yet another example, the first event is that the optical module 1 of the network device 011 fails and the size of the forwarding table 1 of the network device 012 exceeds a preset size, and the optical module 1 of the network device 011 is connected to the network card 1 of the server 041. According to the first event, the network topology of the communication network 01, the server attached under the optical module 1 of the network device 011, and the servers attached under the network device 012, the network manager determines that the first server includes the server 041 and the server 042. In this example, the first message includes the indication information "041" of the server 041 and the indication information "042" of the server 042.
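The three examples above can be sketched as a lookup from the devices where the first event occurred to the servers attached under them. This is a minimal sketch, not the patented implementation: the dictionary `ATTACHED_SERVERS` and the function name `servers_possibly_affected` are hypothetical, and the mapping simply mirrors the servers named in the examples.

```python
# Hypothetical attachment map (an assumption for illustration):
# access device ID -> IDs of the servers attached under that device.
ATTACHED_SERVERS = {
    "011": ["041"],
    "012": ["041", "042"],
}


def servers_possibly_affected(event_devices):
    """Return the sorted IDs of servers attached under any device where the event occurred."""
    affected = set()
    for device in event_devices:
        # Devices with no attached servers contribute nothing.
        affected.update(ATTACHED_SERVERS.get(device, []))
    return sorted(affected)


print(servers_possibly_affected(["011"]))
print(servers_possibly_affected(["011", "012"]))
```

In a real deployment the map would presumably be derived from the topology learned via the IGP rather than hard-coded.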
In an alternative embodiment, the first message further comprises at least one of: event type information, interface indication information, network card indication information, and VM indication information. The event type information is used to indicate an event type of the first event. The interface indication information is used to indicate an interface in the communication network that may be affected by the first event, that is, an interface on a device in the communication network that may be affected by the first event, for example, if the optical module 1 of the network device 011 fails, then the interfaces on the optical module 1 are all interfaces affected by the first event. The network card indication information is used for indicating the network card which may be affected by the first event in the first server. One server typically includes at least one network card through which any server accesses the communication network, e.g., the network card of any server is connected to at least one access device of the communication network such that any server accesses the communication network through the at least one access device. For example, the first event includes a failure of optical module 1 of network device 011, and an interface on optical module 1 of network device 011 is connected to network card 1 in server 041, network card 1 in server 041 is the network card affected by the first event. The VM indication information is used for indicating a VM in the first server which is possibly affected by the first event, and if the first server is affected by the first event, all or part of the VM in the first server is affected by the first event. 
For example, the server 041 accesses the communication network 01 through the access device 011 and the access device 012. The VM411 in the server 041 accesses the communication network 01 through the access device 011, and the VM412 and the VM413 in the server 041 access the communication network 01 through the access device 012. When the first event affects the access device 011 but not the access device 012, the server 041 is affected by the first event because it is connected to the access device 011. However, only the VM411, which uses the access device 011, is affected; the VM412 and the VM413, which use the access device 012, are not.
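The VM example can be expressed as a simple filter. The sketch below assumes a hypothetical mapping from each VM to the single access device it uses; the names `VM_ACCESS_DEVICE` and `affected_vms` are illustrative only.

```python
# Hypothetical mapping (an assumption): VM ID -> access device the VM uses.
VM_ACCESS_DEVICE = {"VM411": "011", "VM412": "012", "VM413": "012"}


def affected_vms(vm_access_device, affected_devices):
    """Return the sorted VMs whose access device is affected by the first event."""
    return sorted(
        vm for vm, device in vm_access_device.items() if device in affected_devices
    )


# Only access device 011 is affected, so only VM411 is reported.
print(affected_vms(VM_ACCESS_DEVICE, {"011"}))
```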
Optionally, for the first implementation manner, the first message further includes event type information and interface indication information, and for the second implementation manner, the first message further includes event type information, interface indication information and network card indication information.
In one example, the first event is a failure of optical module 1 of network device 011, and for implementation one described above, the first message includes what is shown in table 3 below. For implementation two above, the first message includes the contents as shown in table 4 below.
TABLE 3
| Device indication information | Event type information | Interface indication information |
| --- | --- | --- |
| 011 | Optical module failure | 011-P1, 011-P2, 011-P3 |
TABLE 4
| Indication information of first server | Event type information | Interface indication information | Network card indication information |
| --- | --- | --- | --- |
| 041 | Optical module failure | 011-P1, 011-P2, 011-P3 | 041-1 |
In another example, the first event is that the size of the forwarding table 1 of the network device 012 exceeds a preset size, and for the above implementation one, the first message includes the contents as shown in table 5 below. For implementation two above, the first message includes the contents as shown in table 6 below.
TABLE 5
| Device indication information | Event type information | Interface indication information |
| --- | --- | --- |
| 012 | The size of the forwarding table exceeds a preset size | 012-P1, 012-P2, 012-P3, 012-P4 |
TABLE 6
In yet another example, the first event is that the optical module 1 of the network device 011 fails and the size of the forwarding table 1 of the network device 012 exceeds a preset size, and for the above implementation one, the first message includes the contents as shown in table 7 below. For implementation two above, the first message includes what is shown in table 8 below.
TABLE 7
| Device indication information | Event type information | Interface indication information |
| --- | --- | --- |
| 011 | Optical module failure | 011-P1, 011-P2, 011-P3 |
| 012 | The size of the forwarding table exceeds a preset size | 012-P1, 012-P2, 012-P3, 012-P4 |
TABLE 8
In the above tables 3 to 8, the interface indication information "011-P1", "011-P2", and "011-P3" indicate, in order, the interface P1, the interface P2, and the interface P3 on the network device 011. The interface indication information "012-P1", "012-P2", "012-P3", and "012-P4" indicate, in order, the interface P1, the interface P2, the interface P3, and the interface P4 on the network device 012. The network card indication information "041-1" indicates the network card 1 in the server 041. The network card indication information "042-1" indicates the network card 1 in the server 042.
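The rows of tables 3, 4, and 7 could be carried in an API message along the following lines. This is a sketch under stated assumptions: the text does not define a wire format, so the JSON encoding, the class names `EventEntry` and `FirstMessage`, and all field names are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class EventEntry:
    """One table row. Field names are assumptions, not taken from the text."""
    event_type: str
    interfaces: List[str] = field(default_factory=list)
    device: Optional[str] = None   # set in implementation one
    server: Optional[str] = None   # set in implementation two
    network_cards: List[str] = field(default_factory=list)


@dataclass
class FirstMessage:
    entries: List[EventEntry]

    def to_json(self) -> str:
        # asdict() recurses into the nested dataclass entries.
        return json.dumps(asdict(self))


# Table 3: implementation one, optical module failure on network device 011.
table3 = FirstMessage([
    EventEntry("Optical module failure", ["011-P1", "011-P2", "011-P3"], device="011"),
])

# Table 4: implementation two, the same event reported per affected server.
table4 = FirstMessage([
    EventEntry("Optical module failure", ["011-P1", "011-P2", "011-P3"],
               server="041", network_cards=["041-1"]),
])

print(table3.to_json())
```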
S303, the resource manager receives a first message sent by the network manager.
Optionally, the resource manager receives the first message sent by the network manager through a first API of the resource manager.
S304, the resource manager determines a first server according to the first message, wherein the first server is a server possibly affected by a first event in servers accessed to the communication network.
The resource manager is used for managing the first server. By way of example, the resource manager is for managing a server accessing the communication network, the first server being a server accessing the communication network, and the resource manager is therefore for managing the first server.
In this embodiment of the present application, depending on the content of the first message, the resource manager may determine the first server from the first message in either of the following two ways.
Implementation one (corresponding to implementation one in S302): the first message includes device indication information indicating the device in the communication network where the first event occurred. The resource manager determines that device according to the device indication information, and then determines the first server according to that device. In a specific embodiment, the resource manager determines the first server from among the servers accessing the communication network according to the device in the communication network where the first event occurred, the network topology of the communication network, and the servers attached under each access device in the communication network. The resource manager may obtain the network topology of the communication network through the IGP and determine the servers attached under each access device in the communication network; or the resource manager may obtain the network topology and the servers attached under each access device from the network manager; or the resource manager may generate this information directly in other ways. The manner in which the resource manager obtains the network topology of the communication network and the attached-server information is not limited in this embodiment of the application.
Referring to fig. 1 or fig. 2, in one example, the first message includes indication information of the network device 011, the resource manager determines that a device in the communication network 01 where the first event occurs includes the network device 011 according to the indication information of the network device 011 included in the first message, and the resource manager determines that a first server in the servers accessed to the communication network 01, which may be affected by the first event, includes the server 041 according to a network topology of the communication network 01 and servers under the network device 011. In another example, the first message includes indication information of the network device 012, the resource manager determines that a device in the communication network 01 where the first event occurs includes the network device 012 according to the indication information of the network device 012 included in the first message, and the resource manager determines that a first server possibly affected by the first event among servers accessing the communication network 01 includes the server 041 and the server 042 according to a network topology of the communication network 01 and a server under-hung by the network device 012. 
In still another example, the first message includes indication information of the network device 011 and indication information of the network device 012, the resource manager determines that a device in the communication network 01 where the first event occurs includes the network device 011 and the network device 012 according to the indication information of the network device 011 and the indication information of the network device 012 included in the first message, and the resource manager determines that a first server possibly affected by the first event among servers accessing the communication network 01 includes the server 041 and the server 042 according to a network topology of the communication network 01, a server under the network device 011, and a server under the network device 012.
Implementation two (corresponding to implementation two in S302): the first message includes indication information of the first server, and the resource manager determines the first server according to the indication information of the first server.
Referring to fig. 1 or 2, in one example, the first message includes indication information "041" of the server 041, and the resource manager determines that the first server includes the server 041 according to the indication information "041" of the server 041 included in the first message. In another example, the first message includes indication information "041" of the server 041 and indication information "042" of the server 042, and the resource manager determines that the first server includes the server 041 and the server 042 according to the indication information "041" of the server 041 and the indication information "042" of the server 042 included in the first message.
In an alternative embodiment, the first message further comprises at least one of: event type information, interface indication information, network card indication information, and VM indication information. The event type information is used to indicate an event type of the first event. The interface indication information is used to indicate an interface in the communication network that may be affected by the first event. The network card indication information is used for indicating the network card which may be affected by the first event in the first server. The VM indication information is used to indicate a VM in the first server that may be affected by the first event. The resource manager may also perform at least one of the following with the first message: determining the event type of the first event according to the event type information included in the first message; determining an interface in the communication network that is likely to be affected by the first event according to interface indication information included in the first message; determining a network card which is possibly affected by a first event in the first server according to the network card indication information included in the first message; and determining the VM possibly affected by the first event in the first server according to the VM indicating information included in the first message. In the first implementation manner, when the resource manager determines the first server, the event type information and the interface indication information included in the first message may also be referred to. 
For example, if the first event includes a failure of the optical module 1 of the network device 011, and the interfaces affected by the first event in the communication network 01 include the interface 1, the interface 2, and the interface 3 on the network device 011, the server connected to any one of the interface 1, the interface 2, and the interface 3 is the first server (i.e., the server possibly affected by the first event among the servers accessing the communication network).
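Step S304 can be sketched as a dispatch on what the first message contains: server indication information (implementation two), interface indication information, or only device indication information (implementation one). The message is modelled as a plain dictionary, and the key names, both lookup tables, and the interface-to-server assignments are assumptions made for illustration.

```python
# Hypothetical lookup tables (assumptions); values follow the examples in the text.
ATTACHED_SERVERS = {"011": ["041"], "012": ["041", "042"]}   # device -> servers
INTERFACE_TO_SERVER = {                                       # interface -> server
    "011-P1": "041",
    "012-P1": "041",
    "012-P2": "042",
}


def resolve_first_servers(message):
    """Return the sorted IDs of the servers possibly affected by the first event."""
    # Implementation two: the message already names the first server(s).
    if message.get("servers"):
        return sorted(set(message["servers"]))
    # Implementation one with interface indication information: any server
    # connected to an affected interface is a first server.
    interfaces = message.get("interfaces")
    if interfaces:
        return sorted(
            {INTERFACE_TO_SERVER[i] for i in interfaces if i in INTERFACE_TO_SERVER}
        )
    # Implementation one with device indication information only.
    affected = set()
    for device in message.get("devices", []):
        affected.update(ATTACHED_SERVERS.get(device, []))
    return sorted(affected)


print(resolve_first_servers({"devices": ["011"],
                             "interfaces": ["011-P1", "011-P2", "011-P3"]}))
```

Checking interfaces before devices lets the resource manager narrow the result when only some interfaces of a device are affected.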
S305, the resource manager executes event processing strategies related to the first server.
After determining the first server, the resource manager executes the event processing policy related to the first server to prevent the first event from affecting the service carried by the first server. The event processing policy related to the first server includes at least one of: event marking, service migration, backup service enablement, and alerting.
In one embodiment, the event processing policy related to the first server includes event marking, and the resource manager performs event marking on the first server according to the event processing policy. Referring to fig. 1 or 2, in one example, the first server includes a server 041, and the resource manager performs event marking on the server 041. In another example, the first server includes server 041 and server 042, and the resource manager performs event marking on the server 041 and the server 042.
Optionally, the resource manager maintains related information of the first server (for example, an identifier of the first server, an identifier of a service carried by the first server, an identifier of a virtual machine deployed in the first server, and the resource usage of the first server) and adds an event identifier to that related information, so as to perform event marking on the first server; or the resource manager establishes a mapping relation between the related information of the first server and the event identifier. The resource manager may also perform event marking on the first server in other manners, which is not limited in the embodiment of the application. By event marking the first server, the resource manager can avoid deploying newly issued services on the first server before the communication network releases the first event, which prevents the first event from affecting the operation of those services.
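Event marking by mapping a server to event identifiers might look like the sketch below. The class `ServerInventory`, its method names, and the event identifier "evt-1" are hypothetical; the text leaves the marking mechanism open.

```python
class ServerInventory:
    """Tracks which event identifiers are currently attached to each server."""

    def __init__(self, server_ids):
        # server ID -> set of event identifiers marking that server
        self._events = {server_id: set() for server_id in server_ids}

    def mark(self, server_id, event_id):
        """Attach an event identifier to the server (event marking)."""
        self._events[server_id].add(event_id)

    def unmark(self, server_id, event_id):
        """Remove an event identifier; removing an absent identifier is a no-op."""
        self._events[server_id].discard(event_id)

    def schedulable(self):
        """Servers with no event mark, i.e. candidates for newly issued services."""
        return sorted(s for s, events in self._events.items() if not events)


inventory = ServerInventory(["041", "042", "043"])
inventory.mark("041", "evt-1")
print(inventory.schedulable())  # server 041 is skipped while it is marked
```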
In another embodiment, the event processing policy related to the first server includes service migration, and the resource manager migrates the first service carried by the first server to the second server according to the event processing policy. The second server is managed by the resource manager and is not affected by the first event. The communication network accessed by the second server and the communication network accessed by the first server may be the same communication network or different communication networks, which is not limited in this embodiment of the present application. The following example assumes that they are the same communication network. Referring to fig. 1 or fig. 2, in one example, the first server includes a server 041, the second server includes a server 043, and the resource manager migrates the first service carried by the server 041 to the server 043.
Optionally, the resource manager controls the first server to package the first service into an image package and to send the image package to the second server, and then controls the second server to unpack the image package and run the first service, so that the first service is migrated from the first server to the second server. In one example, the first server includes a first VM (e.g., VM411 in server 041) that carries the first service; the resource manager controls the first server to package the first VM into an image package and send the image package to the second server, and the resource manager controls the second server to unpack the image package and run the first VM, thereby running the first service. In the embodiment of the application, the resource manager migrates the first service from the first server, which may be affected by the first event, to the second server, which is not affected by the first event, so that the first event is prevented from affecting the operation of the first service.
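The package, send, and unpack sequence can be simulated in memory as follows. This is only a toy model: the `Server` class, its methods, and the dictionary standing in for an image package are hypothetical, and the sketch assumes the VM stops on the source server as soon as it is packaged, which the text does not specify.

```python
class Server:
    """Toy server that holds running VMs as a mapping of VM ID -> state."""

    def __init__(self, server_id):
        self.server_id = server_id
        self.running = {}

    def package(self, vm_id):
        # Assumption: packaging removes the VM from the source server.
        return {"vm": vm_id, "state": self.running.pop(vm_id)}

    def unpack_and_run(self, image):
        self.running[image["vm"]] = image["state"]


def migrate(vm_id, source, target):
    """Package the VM on the source, hand the image to the target, and run it there."""
    image = source.package(vm_id)
    target.unpack_and_run(image)


server_041, server_043 = Server("041"), Server("043")
server_041.running["VM411"] = {"service": "first service"}
migrate("VM411", server_041, server_043)
print(sorted(server_043.running))
```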
In yet another embodiment, the first server-related event handling policy includes a backup service enablement, the resource manager enables a backup service for a second service according to the event handling policy, the second service is carried by the first server, the backup service for the second service is carried by a third server, the third server is managed by the resource manager, and the third server is unaffected by the first event. The third server and the second server may be the same server or two servers. The communication network accessed by the third server and the communication network accessed by the first server can be the same communication network or different communication networks. In this embodiment of the present application, the communication network accessed by the third server and the communication network accessed by the first server are the same communication network, and referring to fig. 1 or fig. 2, in one example, the first server includes a server 041, the second service is carried by the server 041, the backup service of the second service is carried by a server 043, and the resource manager enables the backup service carried by the server 043. For example, a second service is carried by VM412 in server 041, a backup service for the second service is carried by VM432 in server 043, and the resource manager enables VM432 to enable the backup service. The resource manager enables the backup service of the second service, and can avoid the first event affecting the operation of the second service.
In yet another embodiment, the first server-related event handling policy includes an alert, and the resource manager issues the alert to the first server in accordance with the event handling policy. For example, the resource manager issues an alert signal to the first server. The alert signal may be an acoustic signal, an optical signal, or an alert message. In one example, the resource manager issues an alert tone for the first server. In another example, the resource manager controls the indicator light (which may be located on the resource manager or on the first server) to emit a light of a specific color for the first server. In yet another example, the resource manager controls a particular indicator light (which may be located on the resource manager or on the first server) to illuminate for the first server. In yet another example, the resource manager displays alert information. The resource manager gives an alarm for the first server, so that workers can know that the first server is possibly influenced by a first event generated by the communication network, and then intervene manually, thereby avoiding the first event from influencing the service borne by the first server and guaranteeing the continuity of the service borne by the first server.
It should be noted that the event processing strategies may be used alone or in combination. In one example, a first server accesses a communication network through at least two access devices, one part of the at least two access devices is affected by a first event, and the other part of the at least two access devices is not affected by the first event, in this example, although the first event may affect the available bandwidth of a communication link between the first server and the communication network, the first event does not affect the normal operation of a service carried by the first server, so a resource manager may only event-tag the first server, or the resource manager may event-tag the first server and issue an alarm; the resource manager may not migrate the traffic carried by the first server, or may not enable the backup traffic of the traffic carried by the first server, and of course, the resource manager may migrate some traffic carried by the first server, and/or enable the backup traffic of other traffic carried by the first server. In another example, the access devices connected to the first server are all affected by the first event, the resource manager migrates all or part of the traffic carried by the first server, and/or the resource manager enables a backup service of all or part of the traffic carried by the first server, and the resource manager may also event-mark and issue an alarm to the first server, which is not limited in this embodiment of the present application.
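One way to combine the policies is to compare the access devices of the first server with the devices affected by the first event. The sketch below picks one of the combinations the text permits for each case; the function name, the policy labels, and the specific choice made in each branch are assumptions.

```python
def choose_policies(server_access_devices, affected_devices):
    """Pick event processing policies for a server based on how many of its
    access devices are affected by the first event."""
    affected = [d for d in server_access_devices if d in affected_devices]
    if not affected:
        # The server is not a first server; nothing to do.
        return []
    if len(affected) < len(server_access_devices):
        # Partial impact: bandwidth may drop but services keep running,
        # so this sketch only marks the server and raises an alert.
        return ["event_marking", "alert"]
    # All access devices affected: also move or fail over the services.
    return ["event_marking", "alert", "service_migration", "backup_enablement"]


print(choose_policies(["011", "012"], {"011"}))
print(choose_policies(["011"], {"011"}))
```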
In summary, in the event processing method provided in the embodiment of the present application, after determining that the communication network has a first event, the network manager sends a first message to the resource manager, and the resource manager determines, according to the first message, a first server possibly affected by the first event in servers accessing the communication network, and executes an event processing policy related to the first server. The network manager sends the first message to the resource manager, so that the resource manager can timely sense the first event, and the resource manager executes the event processing strategy related to the first server, so that the first event is prevented from affecting the service borne by the first server, the service interruption borne by the first server is avoided, and the continuity of the service borne by the first server is ensured.
In an alternative embodiment, please refer to fig. 4, which illustrates a flowchart of another event processing method provided in an embodiment of the present application. After S305, the event processing method may include the following steps S306 to S309.
S306, the network manager determines that the communication network releases the first event.
After the network manager determines that the first event has occurred in the communication network, in one example, the network manager processes the first event. For example, the first event is that the size of the forwarding table 1 of the network device 012 exceeds a preset size, and the network manager controls the network device 012 to clear some entries in the forwarding table 1 so that the size of the forwarding table 1 of the network device 012 is smaller than the preset size. In a specific example, the network manager controls the network device 012 to clear some of the aged entries in the forwarding table 1 according to an entry aging mechanism. In another example, the network manager prompts a worker to process the first event; for example, the first event is a network failure, and the network manager prompts the worker to repair the network failure. After the network manager and/or the worker has processed the first event, the network manager determines that the communication network has released the first event. For example, after the worker has handled the first event, the worker manually operates the network manager to trigger an event release instruction, and the network manager determines, based on the event release instruction, that the communication network has released the first event.
S307, the network manager sends a second message to the resource manager.
The network manager sends a second message to the resource manager after determining that the communication network is free of the first event, the second message being for the resource manager to determine that the first server is not affected by the first event.
Optionally, the network manager interfaces with the resource manager through an API, the second message is an API message, and the network manager invokes a second API of the resource manager to send the second message to the resource manager.
In one embodiment, the second message includes indication information of the first server and an event release identifier, which the resource manager uses to determine that the first server is not affected by the first event. In another embodiment, the second message includes device indication information and an event release identifier; the device indication information included in the second message is the same as the device indication information included in the first message, and the resource manager uses the device indication information and the event release identifier to determine that the first server is not affected by the first event. In yet another embodiment, the second message includes an identification of the first message and an event release identifier, which the resource manager uses to determine that the first server is not affected by the first event. The second message may also indicate that the first server is not affected by the first event in other manners, and the second message may also include other content, which is not limited in this embodiment of the present application.
S308, the resource manager receives a second message sent by the network manager.
Optionally, the resource manager receives a second message sent by the network manager through a second API of the resource manager.
S309, the resource manager determines that the first server is not affected by the first event according to the second message.
In one embodiment, the second message includes indication information of the first server and an event release identifier. The indication information of the first server indicates the first server, and the event release identifier indicates that the communication network has released the first event. The resource manager determines the first server according to the indication information of the first server, determines that the communication network has released the first event according to the event release identifier, and thereby determines that the first server is not affected by the first event.
In another embodiment, the second message includes device indication information and an event release identifier. The device indication information indicates the device in the communication network where the first event occurred, and the event release identifier indicates that the communication network has released the first event. The resource manager determines the device where the first event occurred according to the device indication information, determines the first server according to that device, determines that the communication network has released the first event according to the event release identifier, and thereby determines that the first server is not affected by the first event.
In yet another embodiment, the second message includes an identification of the first message and an event release identifier. The identification of the first message indicates the first message, and the event release identifier indicates that the communication network has released the first event. The resource manager determines the first message according to the identification of the first message, determines the first server according to the first message, determines that the communication network has released the first event according to the event release identifier, and thereby determines that the first server is not affected by the first event. For how the resource manager determines the first server according to the first message, refer to the description in S304.
The above describes, by way of example, how the resource manager determines according to the second message that the first server is not affected by the first event. The manner of determination differs depending on the content of the second message.
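The three variants of the second message can be handled by one function, sketched below. The dictionary keys (`event_released`, `servers`, `devices`, `first_message_id`), the attachment map, and the store of earlier first messages are all hypothetical names introduced for illustration.

```python
# Hypothetical attachment map (an assumption): device -> attached servers.
ATTACHED_SERVERS = {"011": ["041"], "012": ["041", "042"]}


def _servers_of(message):
    """Servers named directly in a message, or attached under its devices."""
    if "servers" in message:
        return set(message["servers"])
    found = set()
    for device in message.get("devices", []):
        found.update(ATTACHED_SERVERS.get(device, []))
    return found


def servers_no_longer_affected(second_message, first_messages_by_id):
    """Return the servers that the second message declares unaffected."""
    # Without the event release identifier the message releases nothing.
    if not second_message.get("event_released"):
        return []
    # Variant 3: look up the earlier first message and resolve it as in S304.
    if "first_message_id" in second_message:
        first = first_messages_by_id[second_message["first_message_id"]]
        return sorted(_servers_of(first))
    # Variants 1 and 2: server or device indication information.
    return sorted(_servers_of(second_message))


stored = {"msg-1": {"devices": ["012"]}}
print(servers_no_longer_affected(
    {"first_message_id": "msg-1", "event_released": True}, stored))
```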
In alternative embodiments, the resource manager may determine that the first server is not affected by the first event in ways other than the second message. For example, the resource manager determines that the first server is not affected by the first event by detecting the first server. For example, the first network card of the first server is connected to the first interface of the first access device, and the first event is that the first interface of the first access device goes DOWN (when the first interface of the first access device is DOWN, the first network card of the first server is also DOWN; when the first interface of the first access device is UP, the first network card of the first server is also UP). The resource manager may detect whether the first network card of the first server is UP. If the first network card of the first server is UP, the resource manager determines that the first server is not affected by the first event; otherwise, the resource manager determines that the first server is affected by the first event.
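The text does not say how the network card state is detected. As one possibility, assuming the first server runs Linux and the check is executed on that server (for instance by an agent reporting to the resource manager), the operational state can be read from sysfs. The function name and the `sysfs_root` parameter are hypothetical; the parameter exists only so the check can be pointed at a test directory.

```python
from pathlib import Path


def nic_is_up(interface, sysfs_root="/sys/class/net"):
    """Return True if the Linux operstate of the interface reads 'up'."""
    state_file = Path(sysfs_root) / interface / "operstate"
    try:
        return state_file.read_text().strip() == "up"
    except OSError:
        # A missing or unreadable interface is treated as not UP.
        return False


def first_server_unaffected(interface):
    """Mirror of the rule in the text: card UP means the server is unaffected."""
    return nic_is_up(interface)
```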
In an alternative embodiment, after S309, the event processing method further includes the following step S310.
S310, the resource manager releases the event processing strategy related to the first server.
After the resource manager determines that the first server is not affected by the first event, the resource manager may release the event handling policy associated with the first server. For example, the resource manager removes the event flag from the first server (e.g., deletes the event identifier in the related information of the first server, deletes the mapping relationship between the related information of the first server and the event identifier, etc.), the resource manager terminates the alert sent to the first server, the resource manager transitions the first service from the second server back to the first server, the resource manager enables the second service carried by the first server, etc.
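As a non-limiting illustration, releasing the event processing policy might be sketched as follows. The state layout (event marks per server, a set of servers with active alerts, service placements, and a record of where migrated services came from) is an assumption of this sketch rather than a structure defined by the present application.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ResourceManagerState:
    event_marks: Dict[str, Set[str]] = field(default_factory=dict)  # server -> event IDs
    active_alerts: Set[str] = field(default_factory=set)            # servers being alerted
    placements: Dict[str, str] = field(default_factory=dict)        # service -> current server
    migrated_from: Dict[str, str] = field(default_factory=dict)     # service -> original server


def release_policy(state: ResourceManagerState, server_id: str, event_id: str) -> List[str]:
    """Undo the event processing policy for one server and list the actions taken."""
    actions: List[str] = []
    marks = state.event_marks.get(server_id, set())
    if event_id in marks:
        marks.discard(event_id)  # remove the event mark from the server
        actions.append("event_mark_removed")
    if server_id in state.active_alerts:
        state.active_alerts.discard(server_id)  # terminate the alert
        actions.append("alert_stopped")
    for service, origin in list(state.migrated_from.items()):
        if origin == server_id:
            state.placements[service] = server_id  # migrate the service back
            del state.migrated_from[service]
            actions.append(f"migrated_back:{service}")
    return actions
```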
In summary, in the event processing method provided by the embodiments of the present application, after determining that the communication network has released the first event, the network manager sends the second message to the resource manager. The resource manager determines from the second message that the first server is not affected by the first event and may release the event processing policy related to the first server, so that the resources of the first server are re-enabled. This allows the resource manager to deploy services on the first server, ensures that the first server can carry services, and ensures that the resources of the first server are fully used.
The above is an introduction to the embodiment of the event processing method of the present application, and the embodiment of the event processing apparatus of the present application is described below. The event processing apparatus of the present application may be used to perform the event processing method of the present application. For details not disclosed in the device embodiments of the present application, please refer to the method embodiments of the present application.
Referring to fig. 5, a schematic diagram of an event processing apparatus 500 according to an embodiment of the present application is shown. The event processing apparatus 500 is applied to a resource manager, for example, the event processing apparatus 500 is a resource manager or a functional component in a resource manager. The event processing device 500 is configured to perform part of the steps of the event processing method shown in fig. 3 or fig. 4. Referring to fig. 5, the event processing apparatus 500 includes a receiving module 510 and a processing module 520.
The receiving module 510 is configured to receive a first message sent by a network manager, where the network manager is configured to manage a communication network. The functional implementation of the receiving module 510 may refer to the relevant description in S303 above.
The processing module 520 is configured to determine a first server according to the first message, and execute an event processing policy related to the first server, where the first server is a server that may be affected by a first event occurring in the communication network among servers accessing the communication network, and the resource manager is configured to manage the first server. The functional implementation of the processing module 520 may refer to the relevant descriptions in S304 to S305 described above.
Optionally, the event processing policy includes at least one of: event marking, service migration, backup service enablement, and alerting.
Optionally, the event processing policy includes event marking, and the processing module 520 is configured to mark the first server with the event.
Optionally, the event processing policy includes service migration, and the processing module 520 is configured to migrate the first service carried by the first server to a second server, where the second server is managed by the resource manager and the second server is not affected by the first event.
Optionally, the event processing policy includes backup service enablement, and the processing module 520 is configured to enable a backup service of a second service, where the second service is carried by the first server, the backup service is carried by a third server managed by the resource manager, and the third server is not affected by the first event.
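As a non-limiting illustration, the service migration policy might be sketched as follows. Choosing the alphabetically first unaffected server as the second server is an assumption made only to keep the sketch deterministic; the present application does not prescribe how the second server is selected, and the function names are hypothetical.

```python
from typing import Dict, Optional, Set


def pick_second_server(managed: Set[str], affected: Set[str], first_server: str) -> Optional[str]:
    """Pick a managed server that is neither the first server nor affected by the event."""
    candidates = sorted(managed - affected - {first_server})
    return candidates[0] if candidates else None


def migrate_first_services(
    placements: Dict[str, str], first_server: str, managed: Set[str], affected: Set[str]
) -> Dict[str, str]:
    """Return new placements with every service on the first server moved elsewhere."""
    target = pick_second_server(managed, affected, first_server)
    if target is None:
        # No unaffected server is available, so the placements stay unchanged.
        return dict(placements)
    return {
        service: (target if host == first_server else host)
        for service, host in placements.items()
    }
```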
Optionally, the first message includes indication information of the first server.
Optionally, the first message includes device indication information for indicating a device in the communication network on which the first event has occurred.
Optionally, the processing module 520 is configured to: determine, according to the device indication information, the device in the communication network on which the first event has occurred; and determine the first server according to the device in the communication network on which the first event has occurred.
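As a non-limiting illustration, determining the first server from the device indication information might be sketched as a lookup in a connection table. The table format, which maps an (access device, interface) pair to a (server, network card) pair, and all names in it are assumptions of this sketch.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical connection table: (access device, interface) -> (server, network card).
Topology = Dict[Tuple[str, str], Tuple[str, str]]


def servers_behind_device(
    topology: Topology, device: str, interface: Optional[str] = None
) -> List[str]:
    """List servers attached to the indicated device, optionally to one interface only."""
    servers = {
        server
        for (dev, ifname), (server, _nic) in topology.items()
        if dev == device and (interface is None or ifname == interface)
    }
    return sorted(servers)
```

When the first message also carries interface indication information, passing it as `interface` narrows the result to the servers attached to that interface.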
Optionally, the first message further comprises at least one of:
event type information for indicating an event type of the first event;
interface indication information for indicating an interface in the communication network that may be affected by the first event;
network card indication information for indicating a network card in the first server that may be affected by the first event.
Optionally, the receiving module 510 is further configured to receive a second message sent by the network manager. The functional implementation of the receiving module 510 may refer to the relevant description in S308 above.
The processing module 520 is further configured to determine that the first server is not affected by the first event according to the second message. The functional implementation of the processing module 520 may refer to the relevant description in S309 above.
Optionally, the processing module 520 is further configured to release the event processing policy associated with the first server. The functional implementation of the processing module 520 may refer to the relevant description in S310 above.
Optionally, the event type of the first event includes one of: a network fault, and a network indicator failing to meet requirements.
Optionally, the network fault includes at least one of: a whole-device fault of a network device, an optical module fault, an interface fault, unreachability between a network device and a designated monitoring point, both network devices in the same MLAG being master devices, a network egress fault, and a network security device fault.
Optionally, the network indicator failure to meet the requirement includes at least one of: the used resources of the network devices exceed a preset resource threshold, the bandwidth utilization of links between the network devices exceeds the preset bandwidth threshold, and no standby links exist between the network devices.
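As a non-limiting illustration, the three indicator checks listed above might be sketched as follows. The metric names are hypothetical, and the thresholds are passed in by the caller because the present application refers only to preset thresholds without giving values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NetworkMetrics:
    used_resources: float         # fraction of device resources in use (hypothetical unit)
    bandwidth_utilization: float  # fraction of link bandwidth in use (hypothetical unit)
    standby_links: int            # number of standby links between the network devices


def indicator_violations(
    metrics: NetworkMetrics, resource_threshold: float, bandwidth_threshold: float
) -> List[str]:
    """Return the names of the indicators that fail to meet requirements."""
    violations: List[str] = []
    if metrics.used_resources > resource_threshold:
        violations.append("used_resources_exceeded")
    if metrics.bandwidth_utilization > bandwidth_threshold:
        violations.append("bandwidth_utilization_exceeded")
    if metrics.standby_links == 0:
        violations.append("no_standby_link")
    return violations
```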
Optionally, the network manager and the resource manager are two independent devices; alternatively, the network manager and the resource manager are different components in a single device.
Optionally, the network manager interfaces with the resource manager through an API, and the first message and the second message are both API messages.
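As a non-limiting illustration, a first message exchanged over an API might be encoded as JSON as sketched below. The JSON encoding and every field name are assumptions of this sketch; the present application states only that the first message and the second message are API messages.

```python
import json
from typing import Any, Dict, Optional


def build_first_message(
    message_id: str,
    device: str,
    event_type: Optional[str] = None,
    interface: Optional[str] = None,
    network_card: Optional[str] = None,
) -> str:
    """Serialize a first message; optional indication fields are omitted when unset."""
    payload: Dict[str, Any] = {"message_id": message_id, "device": device}
    optional = {"event_type": event_type, "interface": interface, "network_card": network_card}
    payload.update({key: value for key, value in optional.items() if value is not None})
    return json.dumps(payload, sort_keys=True)


def parse_first_message(body: str) -> Dict[str, Any]:
    """Parse a first message and check that the device indication is present."""
    payload = json.loads(body)
    if "device" not in payload:
        raise ValueError("first message lacks device indication information")
    return payload
```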
In summary, in the event processing device provided in the embodiment of the present application, the resource manager determines, according to the first message sent by the network manager, a first server possibly affected by a first event occurring in the communication network in the servers accessing the communication network, and executes an event processing policy related to the first server, so that the resource manager can timely sense that the first event occurs in the communication network and execute the related event processing policy, thereby avoiding the first event affecting the service borne by the first server, avoiding the service interruption borne by the first server, and ensuring the continuity of the service borne by the first server.
Referring to fig. 6, a schematic diagram of another event processing apparatus 600 according to an embodiment of the present application is shown. The event processing apparatus 600 is applied to a network manager, for example, the event processing apparatus 600 is a network manager or a functional component in a network manager. The event processing device 600 is configured to perform part of the steps of the event processing method shown in fig. 3 or fig. 4. Referring to fig. 6, the event processing apparatus 600 includes a processing module 610 and a transmitting module 620.
A processing module 610 is configured to determine that a first event has occurred in a communication network, where the network manager is configured to manage the communication network. The functional implementation of the processing module 610 may refer to the relevant description in S301.
A sending module 620 is configured to send a first message to a resource manager, where the first message is used by the resource manager to determine a first server and execute an event processing policy related to the first server, the first server is a server that may be affected by the first event among servers accessing the communication network, and the resource manager is configured to manage the first server. The functional implementation of the sending module 620 may refer to the relevant description in S302.
Optionally, the event processing policy includes at least one of: event marking, service migration, backup service enablement, and alerting.
Optionally, the first message includes indication information of the first server.
Optionally, the processing module 610 is further configured to: determine a device in the communication network on which the first event has occurred; and determine the first server according to the device in the communication network on which the first event has occurred.
Optionally, the first message includes device indication information for indicating a device in the communication network on which the first event has occurred.
Optionally, the first message further comprises at least one of:
event type information for indicating an event type of the first event;
interface indication information for indicating an interface in the communication network that may be affected by the first event;
network card indication information for indicating a network card in the first server that may be affected by the first event.
Optionally, the processing module 610 is further configured to determine that the communication network releases the first event. The functional implementation of the processing module 610 may refer to the relevant description in S306.
The sending module 620 is further configured to send a second message to the resource manager, where the second message is used by the resource manager to determine that the first server is not affected by the first event. The functional implementation of the sending module 620 may refer to the relevant description in S307.
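As a non-limiting illustration, the network manager side might be sketched as follows: a first message is sent when an event is detected, and a second message that refers back to it is sent once the event is released. The class, the message dictionaries, and the injected `send` callable are assumptions of this sketch.

```python
import itertools
from typing import Callable, Dict


class NetworkManagerNotifier:
    """Sends first and second messages through a caller-supplied transport function."""

    def __init__(self, send: Callable[[dict], None]) -> None:
        self._send = send
        self._ids = itertools.count(1)
        self._open_events: Dict[str, str] = {}  # event ID -> ID of the first message

    def report_event(self, event_id: str, device: str) -> str:
        """Send a first message naming the device on which the event occurred."""
        message_id = f"msg-{next(self._ids)}"
        self._open_events[event_id] = message_id
        self._send({"kind": "first", "message_id": message_id, "device": device})
        return message_id

    def report_released(self, event_id: str) -> bool:
        """Send a second message for a released event; return False if it is unknown."""
        message_id = self._open_events.pop(event_id, None)
        if message_id is None:
            return False
        self._send(
            {"kind": "second", "first_message_id": message_id, "event_deactivated": True}
        )
        return True
```

Passing the transport in as a callable keeps the sketch independent of whether the network manager and the resource manager are separate devices or components of one device.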
Optionally, the event type of the first event includes one of: a network fault, and a network indicator failing to meet requirements.
Optionally, the network fault includes at least one of: a whole-device fault of a network device, an optical module fault, an interface fault, unreachability between a network device and a designated monitoring point, both network devices in the same MLAG being master devices, a network egress fault, and a network security device fault.
Optionally, the network indicator failure to meet the requirement includes at least one of: the used resources of the network devices exceed a preset resource threshold, the bandwidth utilization of links between the network devices exceeds the preset bandwidth threshold, and no standby links exist between the network devices.
Optionally, the network manager and the resource manager are two independent devices; alternatively, the network manager and the resource manager are different components in a single device.
Optionally, the network manager interfaces with the resource manager through an API, and the first message and the second message are both API messages.
In summary, according to the event processing method provided by the embodiment of the present application, after the network manager determines that the communication network generates the first event, the network manager sends the first message to the resource manager, and the resource manager determines, according to the first message, the first server possibly affected by the first event in the servers accessed to the communication network and executes the event processing policy related to the first server, so that the resource manager can timely sense that the communication network generates the first event and execute the related event processing policy, thereby avoiding the first event from affecting the service borne by the first server, avoiding the service interruption borne by the first server, and guaranteeing the continuity of the service borne by the first server.
The embodiment of the application provides an event processing device, which comprises a memory and a processor; the memory is used for storing a computer program; the processor is configured to execute a computer program stored in the memory to cause the event processing device to perform all or part of the steps of the event processing method provided by the above method embodiments.
For example, please refer to fig. 7, which illustrates a schematic diagram of still another event processing apparatus 700 provided in an embodiment of the present application. The event processing apparatus 700 is a network manager, a functional component in a network manager, a resource manager, or a functional component in a resource manager. The event processing apparatus 700 includes a processor 701, a memory 702, a bus 703, a network interface 704, and an input output device 705. The processor 701, the memory 702, the network interface 704, and the input-output device 705 are connected by a bus 703. Fig. 7 illustrates the processor 701 and the memory 702 independently of each other. The processor 701 and the memory 702 may also be integrated.
The memory 702 is used to store a computer program, which includes an operating system and program code. The memory 702 may be any of various types of storage media, for example, a random access memory (RAM), a read-only memory (ROM), a non-volatile random access memory (NVRAM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM), a flash memory, a register, an optical disc storage (including a compact disc, laser disc, digital versatile disc, Blu-ray disc, etc.), a magnetic disk, or another magnetic storage device.
The processor 701 is a general-purpose processor or a special-purpose processor. A general-purpose processor performs certain steps and/or operations by reading and executing a computer program stored in a memory, and may use that memory while performing these steps and/or operations. The computer program is executed, for example, to implement the relevant functions of the aforementioned processing modules. A general-purpose processor is, for example but not limited to, a central processing unit (CPU). A special-purpose processor is a processor specially designed to perform certain steps and/or operations, for example but not limited to, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. The processor 701 may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. The processor 701 includes at least one circuit to perform all or part of the steps of the event processing method provided by the above embodiments.
Wherein the network interface 704 is used for the event processing apparatus 700 to communicate with other devices. Network interface 704 includes physical interfaces and logical interfaces. The physical interface may be Gigabit Ethernet (GE) for implementing interconnection of the event processing apparatus 700 with other devices, and the logical interface is an interface inside the event processing apparatus 700 for implementing interconnection of devices inside the event processing apparatus 700. It is to be readily appreciated that the network interface 704 may be used for the event processing apparatus 700 to communicate with other devices, for example, the network interface 704 may be used for sending and receiving messages between the event processing apparatus 700 and other devices, and the network interface 704 may implement the functions associated with the foregoing receiving module and sending module.
The input/output device 705 includes an input/output (I/O) interface, a device such as a keyboard, a mouse, a display, etc. connected to the event processing apparatus 700 through the I/O interface, and a device such as a display connected to the processor 701 through a bus, and the processor 701 can receive an input command or data through the input/output device 705 and output the processed data. For example, the input output device 705 may include a display that can be used to display intermediate and/or final results, etc., produced by the processor 701 performing the event processing methods described above.
The bus 703 is any type of communication bus used to interconnect the internal devices of the event processing apparatus 700, for example, a system bus. The embodiments of the present application take interconnection of the devices inside the event processing apparatus 700 through the bus 703 as an example; alternatively, the devices inside the event processing apparatus 700 may be connected to one another in other manners, for example, through a logical interface inside the event processing apparatus 700.
The above devices may be provided on separate chips, or may be provided at least partially or entirely on the same chip. Whether the individual devices are independently disposed on different chips or integrally disposed on one or more chips is often dependent on the needs of the product design. The embodiment of the application does not limit the specific implementation form of the device. The event processing apparatus 700 shown in fig. 7 is merely exemplary, and in implementation, the event processing apparatus 700 may include other components, which are not listed here. The event processing apparatus 700 shown in fig. 7 may process a network event by performing all or part of the steps of the event processing method provided in the above embodiments to ensure that a service operates normally.
The embodiment of the application provides an event processing system, which comprises a resource manager and a network manager. The resource manager includes the event processing device 500 as shown in fig. 5, and the network manager includes the event processing device 600 as shown in fig. 6. Alternatively, at least one of the resource manager and the network manager includes an event processing device 700 as shown in FIG. 7.
Optionally, the network manager and the resource manager are two independent devices. For example, the network manager and the resource manager are two independent servers. Alternatively, the network manager and the resource manager are different components in a single device. For example, the network manager and the resource manager are different components in one server.
By way of example, the event processing system is shown in fig. 1 or fig. 2.
The present embodiments provide a computer readable storage medium having stored therein a computer program which, when executed (e.g., by a network manager, resource manager, one or more processors, etc.), performs all or part of the steps of a method as provided by the method embodiments described above.
The present embodiments provide a computer program product comprising a program or code that, when executed (e.g., by a network manager, resource manager, one or more processors, etc.), performs all or part of the steps of a method as provided by the method embodiments described above.
Embodiments of the present application provide a chip comprising programmable logic circuits and/or program instructions, the chip being configured, when running, to implement all or part of the steps of the method provided by the method embodiments described above.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be embodied in whole or in part in the form of a computer program product comprising one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a network of computers, or another programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line) or wireless (e.g., infrared, radio, microwave) means. The computer readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium, or a semiconductor medium (e.g., solid state disk), etc.
It should be understood that the term "at least one" in this application refers to one or more, and "a plurality" refers to two or more. In the present application, unless otherwise indicated, the symbol "/" generally means "or"; for example, A/B may represent A or B. The term "and/or" in this application merely describes an association relation between associated objects, meaning that three relations may exist; for example, A and/or B may represent: A exists alone, A and B exist together, and B exists alone. In addition, for clarity of description, the words "first," "second," "third," and the like are used throughout this application to distinguish between identical or similar items that have substantially the same function and effect. Those skilled in the art will appreciate that the words "first," "second," "third," etc. do not limit the number or the order of execution.
Embodiments of different types provided in the present application, such as method embodiments and device embodiments, may be referred to one another, which is not limited in the embodiments of the present application. The order of the operations of the method embodiments provided in the present application may be adjusted appropriately, and operations may be added or removed as the situation requires. Any variation readily conceivable by a person skilled in the art within the technical scope disclosed in the present application falls within the protection scope of the present application and is therefore not described further.
In the corresponding embodiments provided in the present application, it should be understood that the disclosed apparatus and the like may be implemented in other structural manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is merely a division by logical function, and other divisions are possible in an actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or connections shown or discussed may be indirect couplings or connections between devices or units through some interfaces, and may be electrical or in other forms.
Units described as separate components may or may not be physically separate, and components described as units may or may not be physical units; they may be located in one place or distributed over a plurality of network nodes. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
While the invention has been described with reference to exemplary embodiments thereof, it will be understood by those skilled in the art that various changes and substitutions of equivalents may be made without departing from the spirit and scope of the invention. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (28)

1. A method of event processing, the method comprising:
the resource manager receives a first message sent by a network manager, wherein the network manager is used for managing a communication network;
the resource manager determines a first server according to the first message, wherein the first server is a server which is possibly influenced by a first event occurring in the communication network in servers accessed to the communication network, and the resource manager is used for managing the first server;
the resource manager executes the event processing policy related to the first server.
2. The method of claim 1, wherein the event processing policy comprises at least one of:
event marking, service migration, backup service enablement, and alerting.
3. The method of claim 1 or 2, wherein the event handling policy comprises an event marker, and wherein the resource manager executes the first server-related event handling policy comprising:
the resource manager marks the event on the first server.
4. A method according to any one of claims 1 to 3, wherein the event processing policy comprises service migration, and the resource manager executing the event processing policy related to the first server comprises:
the resource manager migrates a first service carried by the first server to a second server, the second server being managed by the resource manager, the second server being unaffected by the first event.
5. The method of any of claims 1 to 4, wherein the event handling policy comprises backup service enablement, and wherein the resource manager implements the first server-related event handling policy comprising:
the resource manager enables a backup service of a second service, the second service being carried by the first server, the backup service being carried by a third server managed by the resource manager, the third server being unaffected by the first event.
6. The method of any of claims 1 to 5, wherein the first message comprises an indication of the first server.
7. The method according to any of claims 1 to 5, wherein the first message comprises device indication information indicating a device in the communication network where the first event occurred.
8. The method of claim 7, wherein the resource manager determining the first server from the first message comprises:
The resource manager determines equipment in the communication network, in which the first event occurs, according to the equipment indication information;
the resource manager determines the first server from a device in the communication network where the first event occurred.
9. The method according to any one of claims 6 to 8, wherein,
the first message further includes at least one of:
event type information for indicating an event type of the first event;
interface indication information for indicating an interface in the communication network that may be affected by the first event;
the network card indication information is used for indicating the network card which may be affected by the first event in the first server.
10. The method according to any one of claims 1 to 9, further comprising:
the resource manager receives a second message sent by the network manager;
the resource manager determines from the second message that the first server is not affected by the first event.
11. The method according to claim 9, wherein the method further comprises:
the resource manager releases the event handling policy associated with the first server.
12. The method according to any one of claims 1 to 11, wherein,
the event type of the first event includes one of: network faults and network indexes can not meet the requirements.
13. The method according to any one of claims 1 to 12, wherein,
the network manager and the resource manager are two independent devices; or,
the network manager and the resource manager are different components in a device.
14. A method of event processing, the method comprising:
a network manager determines that a first event occurs in a communication network, the network manager being configured to manage the communication network;
the network manager sends a first message to a resource manager, the first message being used by the resource manager to determine a first server and execute an event handling policy associated with the first server, the first server being a server of servers accessing the communication network that may be affected by the first event, the resource manager being used to manage the first server.
15. The method of claim 14, wherein the event processing policy comprises at least one of:
Event marking, service migration, backup service enablement, and alerting.
16. The method according to claim 14 or 15, wherein the first message comprises an indication of the first server.
17. The method of claim 16, wherein the method further comprises:
the network manager determining a device in the communication network in which the first event occurred;
the network manager determines the first server from a device in the communication network where the first event occurred.
18. The method according to claim 16 or 17, wherein the first message comprises device indication information indicating a device in the communication network where the first event occurred.
19. The method according to any one of claims 16 to 18, wherein,
the first message further includes at least one of:
event type information for indicating an event type of the first event;
interface indication information for indicating an interface in the communication network that may be affected by the first event;
the network card indication information is used for indicating the network card which may be affected by the first event in the first server.
20. The method according to any one of claims 14 to 19, further comprising:
the network manager determining that the communication network dismisses the first event;
the network manager sends a second message to the resource manager, the second message being for the resource manager to determine that the first server is not affected by the first event.
21. The method according to any one of claims 14 to 20, wherein,
the event type of the first event includes one of: network faults and network indexes can not meet the requirements.
22. The method according to any one of claims 14 to 21, wherein,
the network manager and the resource manager are two independent devices; or,
the network manager and the resource manager are different components in a device.
23. An event processing device, characterized by being applied to a resource manager, comprising a memory and a processor;
the memory is used for storing a computer program;
the processor is configured to execute a computer program stored in the memory to cause the event processing device to execute the event processing method according to any one of claims 1 to 13.
24. An event processing device, characterized by being applied to a network manager, comprising a memory and a processor;
the memory is used for storing a computer program;
the processor is configured to execute a computer program stored in the memory to cause the event processing device to perform the event processing method of any of claims 14 to 22.
25. An event processing system comprising a resource manager and a network manager, wherein the resource manager comprises the event processing device of claim 23 and the network manager comprises the event processing device of claim 24.
26. The system of claim 25, wherein
the network manager and the resource manager are two independent devices; or,
the network manager and the resource manager are different components of the same device.
27. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program which, when executed, implements the event processing method of any of claims 1 to 22.
28. A computer program product, characterized in that it comprises a program or code which, when executed, implements the event processing method of any of claims 1 to 22.
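
The message exchange recited in claims 14 to 20 can be illustrated with a minimal Python sketch. It is not the claimed implementation. All class, field, and function names below (FirstMessage, SecondMessage, ResourceManager, build_first_messages, the topology mapping, and the default policy of alerting) are hypothetical and are introduced only for illustration; the claims do not specify a message encoding, the content of the second message, or how policies are associated with servers.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class EventType(Enum):
    # Event types named in claim 21.
    NETWORK_FAULT = "network_fault"
    INDICATOR_NOT_MET = "indicator_not_met"


class Policy(Enum):
    # Event processing policies named in claim 15.
    EVENT_MARKING = "event_marking"
    SERVICE_MIGRATION = "service_migration"
    BACKUP_SERVICE_ENABLEMENT = "backup_service_enablement"
    ALERTING = "alerting"


@dataclass(frozen=True)
class FirstMessage:
    server_id: str                           # indication of the first server (claim 16)
    device_id: Optional[str] = None          # device where the event occurred (claim 18)
    event_type: Optional[EventType] = None   # optional fields of claim 19
    interface_id: Optional[str] = None
    nic_id: Optional[str] = None


@dataclass(frozen=True)
class SecondMessage:
    server_id: str  # assumed content; the claims do not list the fields


def build_first_messages(device_id, event_type, topology):
    """Network manager side (claim 17): map the faulty device to the
    servers attached to it and build one first message per server.
    `topology` is a hypothetical dict of device id -> list of server ids."""
    return [
        FirstMessage(server_id=s, device_id=device_id, event_type=event_type)
        for s in topology.get(device_id, [])
    ]


@dataclass
class ResourceManager:
    # Hypothetical association of server id -> policies to execute.
    policies: dict[str, list[Policy]]
    affected: set[str] = field(default_factory=set)
    log: list[tuple[str, Policy]] = field(default_factory=list)

    def on_first_message(self, msg: FirstMessage) -> list[Policy]:
        """Record the server as possibly affected and 'execute' its policies
        (here only logged). Falls back to alerting, which is an assumption."""
        self.affected.add(msg.server_id)
        applied = self.policies.get(msg.server_id, [Policy.ALERTING])
        for policy in applied:
            self.log.append((msg.server_id, policy))
        return applied

    def on_second_message(self, msg: SecondMessage) -> None:
        """Claim 20: the server is no longer treated as affected."""
        self.affected.discard(msg.server_id)
```

In this sketch, a network manager that detects a fault on a hypothetical device "leaf-1" would call build_first_messages("leaf-1", EventType.NETWORK_FAULT, topology) and pass each result to ResourceManager.on_first_message; once the event is cleared, it would send a SecondMessage for each server so that the resource manager removes the server from its affected set.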
CN202211345438.4A 2022-09-08 2022-10-31 Event processing method, device and system Pending CN117675505A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2023/100793 WO2024051258A1 (en) 2022-09-08 2023-06-16 Event processing method, apparatus and system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2022110936559 2022-09-08
CN202211093655 2022-09-08

Publications (1)

Publication Number Publication Date
CN117675505A (en) 2024-03-08

Family

ID=90068822

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211345438.4A Pending CN117675505A (en) 2022-09-08 2022-10-31 Event processing method, device and system

Country Status (2)

Country Link
CN (1) CN117675505A (en)
WO (1) WO2024051258A1 (en)

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7478404B1 (en) * 2004-03-30 2009-01-13 Emc Corporation System and methods for event impact analysis
US10599505B1 (en) * 2017-11-20 2020-03-24 Amazon Technologies, Inc. Event handling system with escalation suppression
US10761926B2 (en) * 2018-08-13 2020-09-01 Quanta Computer Inc. Server hardware fault analysis and recovery
CN113206814B (en) * 2020-01-31 2022-11-18 华为技术有限公司 Network event processing method and device and readable storage medium
CN114244683A (en) * 2020-09-07 2022-03-25 华为技术有限公司 Event classification method and device
CN112491805B (en) * 2020-11-04 2023-07-28 深圳供电局有限公司 Network security equipment management system applied to cloud platform
CN113821367B (en) * 2021-09-23 2024-02-02 中国建设银行股份有限公司 Method and related device for determining influence range of fault equipment
CN113986478A (en) * 2021-09-26 2022-01-28 阿里巴巴(中国)有限公司 Resource migration strategy determination method and device

Also Published As

Publication number Publication date
WO2024051258A1 (en) 2024-03-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination