CN108322318B - Alarm analysis method and equipment - Google Patents

Publication number
CN108322318B
Authority
CN
China
Prior art keywords
node
alarm
access
nodes
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710033521.0A
Other languages
Chinese (zh)
Other versions
CN108322318A (en)
Inventor
石苏龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201710033521.0A priority Critical patent/CN108322318B/en
Publication of CN108322318A publication Critical patent/CN108322318A/en
Application granted granted Critical
Publication of CN108322318B publication Critical patent/CN108322318B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/12 Discovery or management of network topologies

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The embodiment of the invention provides an alarm analysis method and equipment, which are applied to a network management system, wherein the network management system comprises a server and a plurality of nodes, and the method comprises the following steps: the method comprises the steps that a server obtains an alarm reported by at least one alarm node, wherein the alarm node is a node generating the alarm in a plurality of nodes; the server determines a target node corresponding to the root cause alarm in the at least one alarm node according to the topology decomposition results of the plurality of nodes and the access node list of each alarm node; wherein the topology decomposition result comprises a priority ranking of the alarms generated by the plurality of nodes as root cause alarms, the priority ranking being determined according to the out-degree of each of the plurality of nodes. The embodiment of the invention improves the efficiency of the alarm root cause analysis and reduces the implementation difficulty of the alarm root cause analysis.

Description

Alarm analysis method and equipment
Technical Field
The embodiment of the invention relates to the technical field of network management, in particular to an alarm analysis method and equipment.
Background
Faults often occur in information systems. To make such systems easier to manage, fault points are usually located by means of fault alarms; however, the scale and complexity of an information system grow exponentially as devices are added to it.
In practical applications, a network management system receives a large amount of alarm information from an information system, and only a small part of that alarm information is root cause alarm information, so it is difficult for technicians to find the root cause alarm information among the large amount of alarm information. To find the root cause alarm information, the prior art performs alarm root cause analysis based on system state monitoring. Specifically, after a network element reports alarm information to the network management system, the network management system finds system abnormal points, such as abnormal links and abnormal devices, by comparing the system states before and after the fault, and then determines the root cause alarm from the correspondence between the abnormal points and the reported alarms. For example, if network element A reports alarm 1, network element B reports alarm 2, and the network management system finds, by comparing the system states before and after the fault, that the system abnormal points are network element A and network element C, it determines that alarm 1, which corresponds to network element A, is the root cause alarm.
However, the alarm root cause analysis based on system state monitoring needs to fully monitor the state of the whole system, which causes high system overhead and high implementation difficulty; after the alarm, it also takes a long time to compare the system states before and after the fault to find out abnormal points of the system, so that the efficiency of the alarm root cause analysis is low.
Disclosure of Invention
The embodiment of the invention provides an alarm analysis method and equipment, which are used for reducing the system overhead in the alarm analysis process, improving the alarm root cause analysis efficiency and reducing the implementation difficulty of the alarm root cause analysis.
In a first aspect, an embodiment of the present invention provides an alarm analysis method, which is applied to a network management system, where the network management system includes a server and multiple nodes, and the method includes:
the method comprises the steps that a server obtains an alarm reported by at least one alarm node, and the server determines a target node corresponding to a root cause alarm in the at least one alarm node according to topology decomposition results of a plurality of nodes and an access node list of each alarm node;
wherein the alarm node is a node, among the plurality of nodes, that generates an alarm; the topology decomposition result includes a priority ranking of the alarms generated by the plurality of nodes as root cause alarms, and the priority ranking is determined according to the out-degree of each of the plurality of nodes. Optionally, the access node list includes an identifier of each node and an identifier of the access node corresponding to each node, where the identifier of the access node includes an identifier of a direct access node and an identifier of an indirect access node.
The server determines a target node corresponding to the root cause alarm in at least one alarm node through a pre-established topology decomposition result of a plurality of nodes and an access node list of each alarm node, the state of the whole system does not need to be monitored, the overhead of the system is reduced, the system state does not need to be compared, the alarm root cause analysis efficiency is improved, and the implementation difficulty of the alarm root cause analysis is reduced.
In one possible design, the determining, by the server, a target node corresponding to a root cause alarm in the at least one alarm node according to the topology decomposition result of the plurality of nodes and the access node list of each alarm node includes:
the server sequentially excludes a target alarm node and a target access alarm node from the alarm nodes according to the priority ranking and the access node list of each alarm node, wherein the target alarm node is the alarm node with the highest alarm priority among the currently remaining alarm nodes; initially, the currently remaining alarm nodes are all of the alarm nodes, and after an exclusion operation is executed, the currently remaining alarm nodes are the alarm nodes that remain after the target alarm node and the target access alarm node have been excluded;
the target access alarm node is an access node of the target alarm node, and the target access alarm node is an alarm node generating an alarm;
and the server determines a target node corresponding to the root cause alarm according to the excluded target alarm node.
By eliminating the target alarm node and the target access alarm node, the alarm caused by the propagation of the target alarm node in the communication topology is eliminated, so that the target node corresponding to the root cause alarm can be quickly positioned, and the acquisition efficiency of the root cause alarm is improved.
In one possible design, the server sequentially excludes the target alarm node and the target access alarm node from the alarm nodes according to the priority ranking and the access node list of each alarm node, including:
the server establishes an alarm node set, wherein the alarm node set comprises at least one alarm node, and the target alarm node and the target access alarm node are alarm nodes in the alarm node set;
the server determines a first alarm node with the highest alarm priority in the alarm node set and carries out marking processing on the first alarm node; executing deletion operation on the alarm node set according to the access node list of the first alarm node, wherein the first alarm node and the first access alarm node are deleted nodes, the first alarm node is a target alarm node in the alarm node set, the first access alarm node is a target access alarm node in the alarm node set, and the first access alarm node is an access node of the first alarm node;
the server determines a second alarm node with the highest alarm priority in the rest nodes in the alarm node set, and marks the second alarm node; according to the access node list of the second alarm node, executing deletion operation on the alarm node set; the second alarm node and the second access alarm node are deleted nodes, the second alarm node is a target alarm node in the alarm node set, the second access alarm node is a target access alarm node in the alarm node set, and the second access alarm node is an access node of the second alarm node;
repeating the processes of marking and deleting the rest nodes in the alarm node set until the alarm node set is empty;
the server determines a target node corresponding to the root cause alarm according to the excluded target alarm node, and the method comprises the following steps:
and the server obtains a target node corresponding to the root cause alarm according to the alarm node which is marked and processed.
In one possible design, a server determines a first alarm node with the highest alarm priority in an alarm node set, and marks the first alarm node in the alarm node set; executing deletion operation on an alarm node set according to an access node list of a first alarm node, wherein the first access alarm node is a deleted node, the first alarm node is a target alarm node in the alarm node set, the first access alarm node is a target access alarm node in the alarm node set, and the first access alarm node is an access node of the first alarm node;
the server determines a second alarm node with the highest alarm priority in the rest nodes except the first alarm node in the alarm node set, and marks the second alarm node in the alarm node set; executing deletion operation on the alarm node set according to an access node list of a second alarm node, wherein the second access alarm node is a deleted node, the second alarm node is a target alarm node in the alarm node set, the second access alarm node is a target access alarm node in the alarm node set, and the second access alarm node is an access node of the second alarm node;
repeating the processes of marking and deleting the rest nodes except the marked node in the alarm node set until only the marked alarm node exists in the alarm node set; and the server obtains a target node corresponding to the root cause alarm according to the marked alarm node. The marking process of this embodiment may be to hide and/or lock the alarm nodes in the alarm node set to prevent the marked alarm nodes from being confirmed with priority again, and/or to prevent the marked alarm nodes from being deleted by mistake.
The implementation process of excluding the access alarm nodes can be performed orderly by excluding the access alarm nodes from the alarm node set, and the target alarm nodes are marked, so that the final target nodes corresponding to the alarm can be identified, and the correctness of the target nodes is ensured.
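The mark-and-delete procedure described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function name `find_root_cause_nodes`, the numeric rank encoding (1 = highest priority), and the alphabetical tie-break are assumptions made for the example. The access lists for nodes B, F, G and H follow Table 1 of the detailed description, while the ranks for G and H are hypothetical.

```python
def find_root_cause_nodes(alarm_nodes, priority_rank, access_lists):
    """Return the marked alarm nodes, i.e. the candidate root-cause nodes.

    priority_rank: node -> rank, where a smaller rank means a higher priority (assumed encoding).
    access_lists:  node -> iterable of its direct and indirect access nodes.
    """
    remaining = set(alarm_nodes)   # the alarm node set
    marked = []                    # alarm nodes that have been marked
    while remaining:
        # Target alarm node: highest priority among the remaining alarm nodes.
        # Ties are broken by name only to keep the output deterministic (assumption).
        target = min(remaining, key=lambda n: (priority_rank[n], n))
        marked.append(target)
        # Delete the target itself and every alarm node that accesses it.
        remaining.discard(target)
        remaining -= set(access_lists.get(target, ()))
    return marked


# Access lists taken from Table 1; ranks for G and H are hypothetical.
access_lists = {
    "B": ["E", "H", "G", "D", "C", "A"],
    "F": [],
    "G": ["H", "D"],
    "H": [],
}
priority_rank = {"B": 1, "F": 1, "G": 3, "H": 4}
roots = find_root_cause_nodes(["B", "G", "H", "F"], priority_rank, access_lists)
print(roots)  # ['B', 'F']
```

In this run, marking node B removes nodes G and H because both appear in the access node list of node B, and node F is then marked on its own, so the alarms of nodes B and F are reported as root cause alarms.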
In a possible design, the obtaining, by the server, an alarm reported by at least one alarm node includes:
and the server acquires the alarm reported by the at least one alarm node within a preset time range.
In a possible design, before the server determines, according to the topology decomposition result of the plurality of nodes and the access node list of each alarm node, a target node corresponding to a root cause alarm in the at least one alarm node, the method further includes:
the server acquires access relation data reported by each node, wherein the access relation data comprises an identifier of an access node and an identifier of an accessed node; wherein an access node in the access relationship data is a direct access node of the accessed node;
the server obtains a first communication topology according to the access relation data;
and the server obtains the topology decomposition result and the access node list according to the out-degree of each node in the first communication topology.
Each node only needs to report the access relation data to the server, and the server can acquire the first communication topology without other equipment such as a gateway, so that the process of acquiring the first communication topology is easy to realize.
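As a rough sketch of how a server might turn the reported access relation data into a communication topology, the following example stores the topology as an adjacency map. The record shape `(access node identifier, accessed node identifier)` and the name `build_topology` are assumptions for illustration; the source does not specify a data format.

```python
def build_topology(access_records):
    """Build a directed communication topology from access relation data.

    access_records: iterable of (access_node_id, accessed_node_id) pairs (assumed format).
    Returns a dict mapping each node to the set of nodes it directly accesses.
    """
    out_edges = {}
    for accessor, accessed in access_records:
        out_edges.setdefault(accessor, set()).add(accessed)
        # Make sure nodes that are only accessed also appear in the topology.
        out_edges.setdefault(accessed, set())
    return out_edges


# Hypothetical reports; duplicate reports from periodic reporting are absorbed by the set.
records = [("E", "B"), ("D", "E"), ("C", "D"), ("E", "B")]
topology = build_topology(records)
```

Using a set per node means that repeated periodic reports of the same access relation do not create duplicate edges.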
In one possible design, the obtaining, by the server, the first communication topology according to the access relationship data includes:
the server establishes an initial communication topology according to the access relation data;
and the server compresses a plurality of nodes on the same loop in the initial communication topology into a virtual node to obtain the first communication topology.
By compressing a plurality of nodes on the same loop in the initial communication topology into one virtual node, the communication topology where the communication loop exists can also be applied to the alarm analysis method provided by the embodiment, so that the application range of the embodiment is expanded, and the embodiment can cope with various communication topologies.
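One way to realize the loop compression is to treat each strongly connected component of the directed topology as a virtual node. The source only says that nodes on the same loop are compressed into one virtual node, so the use of Kosaraju's algorithm here, the function name `compress_loops`, and the `V(...)` naming of virtual nodes are all assumptions. The sketch uses recursion and is therefore only suitable for small topologies.

```python
def compress_loops(out_edges):
    """Collapse every loop (strongly connected component) into one virtual node.

    out_edges: dict node -> set of directly accessed nodes; every node must be a key.
    Returns (condensed_out_edges, virtual_members).
    """
    # Pass 1: depth-first search on the original graph, recording finish order.
    visited, order = set(), []

    def dfs(node):
        visited.add(node)
        for nxt in sorted(out_edges[node]):
            if nxt not in visited:
                dfs(nxt)
        order.append(node)

    for node in sorted(out_edges):
        if node not in visited:
            dfs(node)

    # Reverse every edge.
    in_edges = {n: set() for n in out_edges}
    for src, dsts in out_edges.items():
        for dst in dsts:
            in_edges[dst].add(src)

    # Pass 2: search the reversed graph in decreasing finish order.
    component = {}

    def assign(node, root):
        component[node] = root
        for prev in in_edges[node]:
            if prev not in component:
                assign(prev, root)

    for node in reversed(order):
        if node not in component:
            assign(node, node)

    groups = {}
    for node, root in component.items():
        groups.setdefault(root, set()).add(node)

    # Single nodes keep their name; loops get a hypothetical "V(...)" label.
    label, virtual_members = {}, {}
    for nodes in groups.values():
        if len(nodes) == 1:
            name = next(iter(nodes))
        else:
            name = "V(" + "+".join(sorted(nodes)) + ")"
            virtual_members[name] = frozenset(nodes)
        for n in nodes:
            label[n] = name

    condensed = {name: set() for name in label.values()}
    for src, dsts in out_edges.items():
        for dst in dsts:
            if label[src] != label[dst]:
                condensed[label[src]].add(label[dst])
    return condensed, virtual_members


# Hypothetical topology with the loop A -> B -> C -> A and an exit edge C -> D.
looped = {"A": {"B"}, "B": {"C"}, "C": {"A", "D"}, "D": set()}
condensed, virtual_members = compress_loops(looped)
```

The returned `virtual_members` map keeps the original nodes of each virtual node, which is what later allows an access node list containing a virtual node to be expanded into the original nodes.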
In one possible design, the obtaining, by the server, the topology decomposition result and the access node list according to an out-degree of each node in the first communication topology includes:
the server carries out moving-out processing on the node with the out-degree of 0 in the first communication topology to obtain a second communication topology and a first moving-out node;
the server carries out moving-out processing on the node with the out-degree of 0 in the second communication topology to obtain a third communication topology and a second moving-out node;
repeating the process of moving out the node with the out degree of 0 until all the nodes are moved out;
and the server obtains the topology decomposition result and the access node list according to the moved nodes each time.
In one possible design, the obtaining, by the server, the topology decomposition result according to the node moved each time includes:
the server obtains the topology decomposition result according to the moving-out processing sequence corresponding to each moving-out node;
wherein nodes moved out in the same moving-out process have the same priority, the priority of a node moved out in the Nth moving-out process is higher than that of a node moved out in the (N+1)th moving-out process, and N is an integer greater than 0.
The server can obtain the alarm priority sequence of each node according to the sequence of the shift-out processing corresponding to each shift-out node, the server is simple in flow, complex operation is not needed, and the burden of the server is reduced.
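The repeated removal of nodes with an out-degree of 0 can be sketched as a layer-by-layer peeling of the directed topology. The names `decompose_topology` and `priority_rank`, and the numeric rank encoding (1 = highest priority), are assumptions for illustration; the example topology is hypothetical and assumed to be loop-free.

```python
def decompose_topology(out_edges):
    """Peel off nodes with out-degree 0, round by round.

    out_edges: dict node -> set of directly accessed nodes (every node is a key).
    Returns a list of layers; layer N holds the nodes moved out in round N.
    """
    remaining = {node: set(targets) for node, targets in out_edges.items()}
    layers = []
    while remaining:
        layer = sorted(node for node, targets in remaining.items() if not targets)
        if not layer:
            # Only possible if a loop is left in the topology.
            raise ValueError("no node with out-degree 0; compress loops first")
        layers.append(layer)
        for node in layer:
            del remaining[node]
        # Removing a node also removes the edges that pointed to it.
        for targets in remaining.values():
            targets.difference_update(layer)
    return layers


def priority_rank(layers):
    """Nodes moved out in the same round share a rank; round 1 has the highest priority."""
    return {node: rank for rank, layer in enumerate(layers, start=1) for node in layer}


# Hypothetical topology: C -> D -> E -> B, plus an isolated node F.
topology = {"B": set(), "C": {"D"}, "D": {"E"}, "E": {"B"}, "F": set()}
layers = decompose_topology(topology)
ranks = priority_rank(layers)
print(layers)  # [['B', 'F'], ['E'], ['D'], ['C']]
```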
In one possible design, the obtaining, by the server, the access node list according to the node removed each time includes:
when a first node with zero out degree is moved out, the server determines a second node corresponding to the first node, wherein the second node is a direct access node of the first node;
the server stores the second node into an access node list corresponding to the first node;
the server judges whether the first node exists in an access node list of a third node or not, and if so, the second node is stored in the access node list of the third node;
the server sequentially traverses all the removed nodes according to the moving-out sequence of each node to obtain the access node list;
if the access node list comprises the virtual nodes, the access node list comprises all original nodes corresponding to the virtual nodes.
The server can obtain an access node list according to the moving-out sequence of the nodes with zero out-degree, the access node list not only comprises direct access nodes, but also comprises indirect access nodes, so that the server can eliminate the alarm of topology propagation caused by the nodes according to the access node list, the accuracy of alarm analysis is improved, the server implementation process is simple, complex operation is not needed, and the load of the server is reduced.
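The propagation of direct access nodes into the lists of already removed nodes can be sketched as below. This is an illustrative reading of the steps above, with the hypothetical name `build_access_lists`; expansion of virtual nodes into their original nodes is left out to keep the example short.

```python
def build_access_lists(out_edges, removal_order):
    """Build, for every node, the list of its direct and indirect access nodes.

    out_edges:     dict node -> set of directly accessed nodes (loop-free).
    removal_order: nodes in the order in which they were moved out (out-degree 0 first).
    """
    # Direct access nodes of each node, obtained by reversing the edges.
    direct = {node: set() for node in out_edges}
    for src, dsts in out_edges.items():
        for dst in dsts:
            direct[dst].add(src)

    access = {node: set() for node in out_edges}
    for first in removal_order:
        seconds = direct[first]            # "second nodes": direct access nodes of the first node
        access[first] |= seconds
        # Any "third node" whose list already holds the first node
        # is also accessed, indirectly, by the second nodes.
        for third_list in access.values():
            if first in third_list:
                third_list |= seconds
    return access


# Hypothetical chain C -> D -> E -> B, moved out in the order B, E, D, C.
chain = {"B": set(), "E": {"B"}, "D": {"E"}, "C": {"D"}}
access = build_access_lists(chain, ["B", "E", "D", "C"])
```

Processing nodes in the moving-out order matters: every node that a given node can reach has been moved out earlier, so its list already contains the node when the node's own direct access nodes are propagated.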
In a second aspect, an embodiment of the present invention provides a server, which is applied to a network management system, where the network management system includes the server and a plurality of nodes, and the server includes:
the alarm acquisition module is used for acquiring an alarm reported by at least one alarm node, wherein the alarm node is a node generating the alarm in the plurality of nodes;
the alarm determining module is used for determining a target node corresponding to the root cause alarm in the at least one alarm node according to the topology decomposition results of the nodes and the access node list of each alarm node;
wherein the topology decomposition result comprises a priority ranking of the alarms generated by the plurality of nodes as root cause alarms, the priority ranking being determined according to the out-degree of each of the plurality of nodes.
In one possible design, the alarm determination module is specifically configured to:
according to the priority ranking and the access node list of each alarm node, sequentially excluding a target alarm node and a target access alarm node from the alarm nodes, wherein the target alarm node is the alarm node with the highest alarm priority in the remaining alarm nodes, the target access alarm node is the access node of the target alarm node, and the target access alarm node is the alarm node generating the alarm;
and determining the target node corresponding to the root cause alarm according to the excluded target alarm node.
In one possible design, the alarm determination module is specifically configured to:
establishing an alarm node set, wherein the alarm node set comprises at least one alarm node, and the target alarm node and the target access alarm node are alarm nodes in the alarm node set;
determining a first alarm node with the highest alarm priority in the alarm node set, and marking the first alarm node; executing deletion operation on the alarm node set according to the access node list of the first alarm node, wherein the first alarm node and the first access alarm node are deleted nodes, the first alarm node is a target alarm node in the alarm node set, and the first access alarm node is a target access alarm node in the alarm node set;
determining a second alarm node with the highest alarm priority in the rest nodes in the alarm node set, and marking the second alarm node; according to the access node list of the second alarm node, executing deletion operation on the alarm node set; the second alarm node and the second access alarm node are deleted nodes, the second alarm node is a target alarm node in the alarm node set, and the second access alarm node is a target access alarm node in the alarm node set;
repeating the processes of marking and deleting the rest nodes in the alarm node set until the alarm node set is empty;
and obtaining a target node corresponding to the root cause alarm according to the alarm node subjected to the marking processing.
In one possible design, the alarm obtaining module is specifically configured to:
and acquiring the alarm reported by the at least one alarm node within a preset time range.
In one possible design, the server further includes a relationship acquisition module, a relationship processing module, and a generating module, wherein:
The relationship acquisition module is used for acquiring access relationship data reported by each node, wherein the access relationship data comprises an identifier of an access node and an identifier of an accessed node;
the relationship processing module is used for obtaining a first communication topology according to the access relationship data;
the generating module is configured to obtain the topology decomposition result and the access node list according to the out-degree of each node in the first communication topology.
In one possible design, the relationship processing module is specifically configured to:
establishing an initial communication topology according to the access relation data;
and compressing a plurality of nodes on the same loop in the initial communication topology into a virtual node to establish the first communication topology.
In one possible design, the generating module is specifically configured to:
carrying out shift-out processing on the node with the out degree of 0 in the first communication topology to obtain a second communication topology and a first shift-out node;
carrying out moving-out processing on the node with the out degree of 0 in the second communication topology to obtain a third communication topology and a second moving-out node;
repeating the process of moving out the node with the out degree of 0 until all the nodes are moved out;
and obtaining the topology decomposition result and the access node list according to the moved nodes each time.
In one possible design, the generating module is specifically configured to:
obtaining the topology decomposition result according to the moving-out processing sequence corresponding to each moving-out node;
wherein nodes moved out in the same moving-out process have the same priority, the priority of a node moved out in the Nth moving-out process is higher than that of a node moved out in the (N+1)th moving-out process, and N is an integer greater than 0.
In one possible design, the generating module is specifically configured to:
determining a second node corresponding to a first node every time the first node with zero out degree is moved out, wherein the second node is a direct access node of the first node;
storing the second node into an access node list corresponding to the first node;
judging whether the first node exists in an access node list of a third node, if so, storing the second node in the access node list of the third node;
traversing all the moved-out nodes in the order in which they were moved out to obtain the access node list;
if the access node list comprises the virtual nodes, the access node list comprises all original nodes corresponding to the virtual nodes.
In a third aspect, an embodiment of the present invention provides a server, including: at least one processor and memory;
the memory stores computer-executable instructions;
the at least one processor executes computer-executable instructions stored by the memory to cause the server to perform the alarm analysis methods provided by the various possible designs described above.
The embodiment of the present invention further provides a computer-readable storage medium, in which computer-executable instructions are stored, and when at least one processor of the server executes the computer-executable instructions, the server executes the alarm analysis methods provided by the above various possible designs.
Also provided in an embodiment of the present invention is a computer program product including computer executable instructions stored in a computer readable storage medium. The computer executable instructions may be read by at least one processor of the server from a computer readable storage medium, and executed by the at least one processor to cause the server to implement the alarm analysis methods provided by the various possible designs described above.
Drawings
Fig. 1 is a schematic structural diagram of a network management system according to an embodiment of the present invention;
Fig. 2 is a first schematic flowchart of an alarm analysis method according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of an alarm provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of a network topology of a telecommunication system according to an embodiment of the present invention;
Fig. 5 is a second schematic flowchart of an alarm analysis method according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a first communication topology according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a second communication topology according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a third communication topology according to an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a fourth communication topology according to an embodiment of the present invention;
Fig. 10 is a third schematic flowchart of an alarm analysis method according to an embodiment of the present invention;
Fig. 11 is a first schematic alarm diagram of a telecommunication system according to an embodiment of the present invention;
Fig. 12 is a second schematic alarm diagram of a telecommunication system according to an embodiment of the present invention;
Fig. 13 is a schematic structural diagram of a server according to an embodiment of the present invention;
Fig. 14 is a schematic diagram of a hardware structure of a server according to an embodiment of the present invention.
Detailed Description
Fig. 1 is a schematic structural diagram of a network management system according to an embodiment of the present invention. The network management system provided by this embodiment includes a server and a plurality of nodes. Accessing and accessed relationships exist among the nodes, and these access relationships form a network topology. The nodes provided by this embodiment may be various network elements. In fig. 1, an arrow "→" between two nodes represents an access relationship; for example, node E → node B represents that node E accesses node B. For instance, node E may send a service request to node B, invoke an interface of node B, or read data from node B. Here, node B is the accessed node, and node E is a direct access node of node B. Node C → node D → node E means that node D accesses node E and node C accesses node D. In this case, node D is a direct access node of node E, because node D accesses node E directly, whereas node C is an indirect access node of node E, because node C does not access node E directly. In the embodiments described below, unless otherwise stated, the access nodes include both direct access nodes and indirect access nodes.
In this embodiment, each node periodically reports access relationship data to the server, where the access relationship data includes an identifier of the access node and an identifier of the accessed node, and an access node in the access relationship data is a direct access node of the accessed node. The server can establish the access node list of each node and the topology decomposition result according to the access relationship data. The topology decomposition result includes a priority ranking of the alarms generated by the nodes as root cause alarms. The access node list includes the identifier of each node and the identifier of its corresponding access nodes. The specific establishment procedure is described in detail in the following embodiments.
When a node becomes abnormal, the node reports an alarm to the server. As will be appreciated by those skilled in the art, alarms propagate along the communication topology. For example, in the embodiment shown in fig. 1, when node B and node G both generate alarms, the alarm of node B is the root cause alarm, that is, the alarm of node B triggers the alarm of node G. If a node is only accessed by other nodes and does not access any other node, then when an alarm is generated on that node, the root cause of the alarm is the generated alarm itself. Based on this, the server can determine, among the nodes reporting alarms, the node corresponding to the root cause alarm according to the pre-established access node list and topology decomposition result, which improves the efficiency of alarm root cause analysis and reduces its implementation difficulty. An embodiment of the present invention is described in detail below with reference to fig. 2 and fig. 3.
Fig. 2 is a first schematic flowchart of an alarm analysis method according to an embodiment of the present invention, and fig. 3 is a schematic diagram of an alarm according to an embodiment of the present invention. As shown in fig. 2, the method includes:
s201, a server acquires an alarm reported by at least one alarm node, wherein the alarm node is a node generating the alarm in the plurality of nodes;
Referring to fig. 1, when alarms are generated at node B, node G, node H, and node F among the nodes in fig. 1, the nodes generating the alarms are labeled in fig. 3 for convenience of description, and a node generating an alarm is referred to as an alarm node. As shown in fig. 3, node B generates alarm 1, node G generates alarm 2, node H generates alarm 3, and node F generates alarm 4. The alarm nodes, namely node B, node G, node H, and node F, each report their alarms to the server, and the server acquires the alarms reported by the alarm nodes.
As will be understood by those skilled in the art, the server may obtain the alarms reported by at least one alarm node within a preset time period. Specifically, the preset time period may be preset by the server. For example, the time period preset by the server may be one minute, that is, every minute the server counts the alarms reported by the alarm nodes in that period. The preset time periods may be 8:31-8:32, 8:32-8:33, 8:33-8:34, and so on. For example, once 8:32 is reached, the server counts the alarms reported during 8:31-8:32, that is, the alarms reported within the 60 seconds of 8:31, so as to obtain the alarms reported by at least one alarm node.
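Collecting the alarms of one preset time period, such as the 8:31-8:32 window in the example, might look like the sketch below. The alarm record shape `(timestamp, node identifier)`, the function name `alarm_nodes_in_window`, and the dates used are assumptions for illustration only.

```python
from datetime import datetime


def alarm_nodes_in_window(alarms, window_start, window_end):
    """Return the alarm nodes that reported within [window_start, window_end).

    alarms: iterable of (timestamp, node_id) pairs (assumed record format).
    """
    return sorted({node for ts, node in alarms if window_start <= ts < window_end})


# Hypothetical reports around the 8:31-8:32 window.
alarms = [
    (datetime(2017, 1, 16, 8, 31, 5), "B"),
    (datetime(2017, 1, 16, 8, 31, 40), "G"),
    (datetime(2017, 1, 16, 8, 31, 59), "B"),   # repeated alarm from the same node
    (datetime(2017, 1, 16, 8, 32, 0), "H"),    # belongs to the next window
]
window_nodes = alarm_nodes_in_window(
    alarms, datetime(2017, 1, 16, 8, 31), datetime(2017, 1, 16, 8, 32)
)
print(window_nodes)  # ['B', 'G']
```

The half-open interval keeps an alarm reported exactly at 8:32:00 out of the 8:31 window, so each alarm is counted in exactly one period.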
S202, the server determines a target node corresponding to the root cause alarm in the at least one alarm node according to the topology decomposition results of the nodes and the access node list of each alarm node;
The access node list includes identifiers of the nodes and identifiers of the access nodes corresponding to the nodes, the topology decomposition result includes a priority ranking of the alarms generated by the nodes as root cause alarms, and the priority ranking is determined according to the out-degree of each of the nodes.
In this embodiment, the access node list includes an identifier of each node and an identifier of the access nodes corresponding to each node. The identifiers of the access nodes include identifiers of direct access nodes and identifiers of indirect access nodes. For example, in fig. 3, the access nodes of node B include not only node E but also node H, node G, node D, node C, and node A. The access node list may be a list pre-established by the server according to the communication topology. For the network topology shown in fig. 1, one possible implementation of the access node list is shown in Table 1.
Table 1
Node | Access node list
Node A | Node D, node C, node H
Node B | Node E, node H, node G, node D, node C, node A
Node C | (empty)
Node D | Node C
Node E | Node H, node G, node D, node C, node A
Node F | (empty)
Node G | Node H, node D
Node H | (empty)
Node I | Node H
One skilled in the art will appreciate that Table 1 gives only one possible implementation of the access node list, and other implementations are possible. For example, each node may have its own separate access node list; Table 2 gives one such implementation, taking node A as an example. This embodiment does not particularly limit the specific implementation of the access node list.
Table 2 (provided as an image in the original publication: Figure BDA0001210913630000081)
In this embodiment, the topology decomposition result includes the priority ranking of the alarms generated by the plurality of nodes as root cause alarms, and is determined by the server according to the out-degree of each node. The out-degree of a node is the number of edges that start from the node and point to other nodes. The communication topology can be regarded as a directed graph, in which the out-degree of a node is the number of other nodes that the node directly accesses. For example, in fig. 3, node D has an out-degree of 3. The server can perform topology decomposition according to the out-degree of each node in the communication topology and establish the topology decomposition result in advance. In the decomposition process, the nodes with an out-degree of 0 are moved out first, and the nodes moved out first have the highest priority; that is, node B, node F and node I, whose out-degree is 0, are moved out first and have the highest priority. The nodes whose out-degree is 0 in the remaining communication topology are then moved out, and these nodes have the next highest priority. The process is repeated until no node remains in the communication topology. Table 3 shows one possible implementation of the priority ranking for the network topology shown in fig. 1.
Table 3
Node | Priority
Node B | 1
Node F | 1
Node I | 1
Node E | 2
Node A | 3
Node G | 3
Node D | 4
Node H | 4
Node C | 5
In the priority ranking shown in table three, the priorities are embodied in the form of numbers, and the smaller the number, the higher the priority. In the specific implementation process, the specific implementation manner of the priority is not particularly limited in this embodiment, as long as the priority ranking of each node can be expressed.
The server can determine the target node corresponding to the root cause alarm in the at least one alarm node according to the pre-established topology decomposition result of the plurality of nodes and the access node list of each alarm node. Specifically, according to the priority ranking and the access node list of each alarm node, the server sequentially excludes a target alarm node and its target access alarm nodes from the alarm nodes. The target alarm node is the alarm node with the highest alarm priority among the currently remaining alarm nodes; a target access alarm node is an access node of the target alarm node that has itself generated an alarm. The server then determines the target node corresponding to the root cause alarm according to the excluded target alarm nodes.
For example, node B generates alarm 1, node G generates alarm 2, node H generates alarm 3, and node F generates alarm 4. As can be seen from Table 3, the alarm of node B has the highest priority as the root cause, so node B and the access alarm nodes corresponding to node B are excluded from the alarm nodes according to Table 1, that is, node B, node H, and node G are excluded. Only node F then remains; node F is the alarm node with the highest priority among the remaining alarm nodes, and there are no other alarm nodes, so the final root cause alarms are the alarms generated by node B and node F. Those skilled in the art can understand that if, after node F is selected as the alarm node with the highest priority, other alarm nodes still remained, the access alarm nodes corresponding to node F would be excluded according to Table 1, and so on until no alarm node is left.
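The exclusion procedure in this example can be sketched as below. The dictionaries are transcribed from Table 1 and Table 3; the function name and the tie-break by node name between equal priorities are assumptions made only for illustration, since the text does not specify how ties are ordered.

```python
# Access node lists transcribed from Table 1: for each node, the nodes that
# directly or indirectly access it.
ACCESS_NODES = {
    "A": {"D", "C", "H"},
    "B": {"E", "H", "G", "D", "C", "A"},
    "C": set(),
    "D": {"C"},
    "E": {"H", "G", "D", "C", "A"},
    "F": set(),
    "G": {"H", "D"},
    "H": set(),
    "I": {"H"},
}

# Priorities transcribed from Table 3 (smaller number = higher priority).
PRIORITY = {"B": 1, "F": 1, "I": 1, "E": 2, "A": 3, "G": 3, "D": 4, "H": 4, "C": 5}


def find_root_cause_nodes(alarm_nodes, priority=PRIORITY, access_nodes=ACCESS_NODES):
    """Repeatedly pick the highest-priority remaining alarm node as a root
    cause and exclude it together with its access nodes that also alarmed."""
    remaining = set(alarm_nodes)
    roots = []
    while remaining:
        # Assumed tie-break: alphabetical order among equal priorities.
        target = min(remaining, key=lambda n: (priority[n], n))
        roots.append(target)
        remaining -= {target} | access_nodes[target]
    return roots


print(find_root_cause_nodes({"B", "G", "H", "F"}))  # ['B', 'F']
```

Because node G and node H both appear in the access node list of node B, they are excluded in the first round, and node F is selected in the second round.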
In the alarm analysis method provided by this embodiment, after the server obtains the alarm reported by at least one alarm node, the server determines the target node corresponding to the root cause alarm in the at least one alarm node according to the topology decomposition results of the plurality of nodes established in advance and the access node list of each alarm node, without monitoring the state of the entire system, thereby reducing the overhead of the system, and the root cause alarm can be quickly determined only according to the topology decomposition results established in advance and the access node list, without comparing the system state, thereby improving the alarm root cause analysis efficiency and reducing the implementation difficulty of the alarm root cause analysis.
The following takes a network management system, specifically a telecommunication system, as an example and, with reference to the network topology shown in fig. 4, describes in detail the alarm analysis method provided in the embodiment of the present invention. Fig. 4 is a schematic diagram of a network topology of a telecommunication system according to an embodiment of the present invention. Fig. 4 schematically shows a communication topology between several nodes, where the nodes include a set top box (Set Top Box, STB), media entertainment management middleware (MEM), a unified management system (UMS), an integration bus (Integration Bus, IB), a media delivery network (MDN), a carousel television server (Near Video on Demand Server, NVOD Server), a CA node, a File Transfer Protocol (FTP) server, a fast channel change (Fast Channel Change, FCC) node, a digital subscriber line access multiplexer (DSLAM), and a Network Time Protocol (NTP) server.
The above only schematically illustrates several implementations of nodes in a telecommunication system, and the present embodiment is not particularly limited herein for other implementations of nodes.
The following first explains the specific process of establishing the topology decomposition result and the access node list.
Fig. 5 is a second flowchart of an alarm analysis method according to an embodiment of the present invention. As shown in fig. 5, the method includes:
S501, the server obtains access relation data reported by each node, wherein the access relation data comprise an access node identifier and an accessed node identifier.
Each node is responsible for extracting the call relation data related to itself, and each node can report access relation data periodically, wherein the access relation data comprises an access node identifier and an accessed node identifier.
In one possible implementation, the basic call relation data may be represented by a two-tuple, i.e. (accessing party, accessed party), and may in practice contain other additional information, e.g. (accessing party, accessed party, additional information 1, additional information 2, …, additional information n).
A node can extract call relation data in many ways. Existing methods can be used, or a more effective method can be designed based on the characteristics of the system; the specific method adopted does not limit the embodiment of the invention.
A simple example is given below in which access relationship data is extracted based on network state information. Consider two nodes A and B in the system and assume that node B accesses a service in node A. The network state information of node A then shows a connection between the IP address of node B and the service listening port of node A. In this scenario, node A may extract from its network state information a two-tuple representing the access relationship, in which the accessing party is represented by the IP address of node B and the accessed party is represented by the IP address of node A, i.e. (IP address of node B, IP address of node A). The access relationship two-tuple between the nodes, i.e. (node B, node A), is then obtained according to the correspondence between IP addresses and nodes. This correspondence generally exists in the management system as basic information of the nodes, or may be established based on the basic information of the nodes in the management system.
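The extraction in this example can be sketched as follows. The connection record layout `(remote_ip, local_ip, local_port)`, the function name, and the IP addresses in the usage note are all hypothetical; how a node actually reads its network state information is not specified by the text.

```python
def extract_access_pairs(connections, listening_ports, ip_to_node):
    """Derive (accessing party, accessed party) two-tuples on one node.

    connections: iterable of (remote_ip, local_ip, local_port) entries taken
        from the node's network state information (assumed layout).
    listening_ports: set of ports on which this node's services listen.
    ip_to_node: correspondence between IP addresses and node identifiers.
    """
    pairs = set()
    for remote_ip, local_ip, local_port in connections:
        if local_port not in listening_ports:
            continue  # not a connection to one of this node's services
        accessor = ip_to_node.get(remote_ip)
        accessed = ip_to_node.get(local_ip)
        if accessor and accessed and accessor != accessed:
            pairs.add((accessor, accessed))
    return pairs
```

With a hypothetical mapping `{"192.0.2.1": "node A", "192.0.2.2": "node B"}` and a connection from 192.0.2.2 to port 8080 of 192.0.2.1, where 8080 is a service listening port of node A, the function yields the two-tuple (node B, node A).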
Optionally, after the server acquires the access relationship data reported by each node, the server performs statistics, summarization and filtering on the access relationship data. The purpose of statistical summarization is to integrate data with the same access party and the same accessed party and remove repeated items; the filtering aims to eliminate call relation data which are generated by heartbeat, disaster recovery and the like and are irrelevant to service access.
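A minimal sketch of the statistics, summarization and filtering step is given below. It assumes that each reported record carries one piece of additional information, here called `kind`, that identifies heartbeat or disaster recovery traffic; this field and its labels are hypothetical and are not defined in the text.

```python
from collections import Counter

# Hypothetical labels for call relations that are irrelevant to service access.
NON_SERVICE_KINDS = {"heartbeat", "disaster_recovery"}


def summarize_access_records(records):
    """Merge records with the same accessing and accessed party and drop
    records that are not related to service access.

    records: iterable of (accessing party, accessed party, kind).
    Returns a dict mapping (accessing party, accessed party) to the number
    of reported records that were merged into it.
    """
    counts = Counter(
        (accessor, accessed)
        for accessor, accessed, kind in records
        if kind not in NON_SERVICE_KINDS
    )
    return dict(counts)
```

The keys of the returned dictionary are the de-duplicated service access relationships; the counts are kept only as a by-product of the summarization.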
For the network topology shown in fig. 4, after statistics, summarization and filtering, the remaining access relationship data constitutes the service access relationship of the system. Assume that the business access relationships of the system are as shown in table four.
Table 4
Accessing party | Accessed party
STB | CA
STB | MDN
STB | MEM
STB | NTP
STB | DSLAM
MEM | CA
MEM | IB
IB | CA
IB | MEM
IB | UMS
IB | MDN
UMS | IB
UMS | MEM
MDN | CA
MDN | FCC
MDN | NVOD Server
NVOD Server | MDN
CA | FTP
S502, the server obtains a first communication topology according to the access relation data.
The server can obtain the initial communication topology shown in fig. 4 according to the access relationship data shown in table four. In fig. 4, in the initial communication topology, a loop exists, and therefore, the server needs to establish the first communication topology by compressing a plurality of nodes on the same loop in the initial communication topology as one virtual node.
It can be understood by those skilled in the art that if the obtained initial communication topology is as shown in fig. 1, and there is no loop, the initial communication topology is the first communication topology, and no compression process is needed.
With continued reference to fig. 4, in the communication topology shown in fig. 4 there are two loops that require compression processing, MDN → NVOD Server → MDN and UMS → MEM → IB → UMS. Specifically, the MDN and the NVOD Server are connected to each other and form a loop, so the two nodes are compressed into one virtual node, namely MDN_NVOD Server. Since the NVOD Server is connected only to the MDN, only the connections of the MDN to other nodes need to be taken as the connections of MDN_NVOD Server to other nodes.
Although UMS and IB, and MEM and IB, in fig. 4 each form a loop, together they form one large loop, so the three nodes are compressed into one virtual node MEM_UMS_IB. Both MEM and IB have connections to CA, and these become a single connection from the virtual node MEM_UMS_IB to CA. Meanwhile, the connection from the STB to the MEM is converted into a connection from the STB to MEM_UMS_IB, and the connection from the IB to the MDN is converted into a connection from MEM_UMS_IB to the MDN. The compressed communication topology obtained in this way is the first communication topology. Fig. 6 is a schematic structural diagram of a first communication topology according to an embodiment of the present invention.
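One way to carry out this loop compression is to compute the strongly connected components of the directed graph and merge each component into one virtual node. The text does not name an algorithm, so the use of Tarjan's algorithm here is an assumption, as are the function names; the edge list is transcribed from Table 4. Virtual node names are built by sorting the member names, so the three-node loop is named IB_MEM_UMS in this sketch rather than MEM_UMS_IB.

```python
EDGES = [  # (accessing party, accessed party), transcribed from Table 4
    ("STB", "CA"), ("STB", "MDN"), ("STB", "MEM"), ("STB", "NTP"),
    ("STB", "DSLAM"), ("MEM", "CA"), ("MEM", "IB"), ("IB", "CA"),
    ("IB", "MEM"), ("IB", "UMS"), ("IB", "MDN"), ("UMS", "IB"),
    ("UMS", "MEM"), ("MDN", "CA"), ("MDN", "FCC"), ("MDN", "NVOD Server"),
    ("NVOD Server", "MDN"), ("CA", "FTP"),
]


def strongly_connected_components(edges):
    """Tarjan's algorithm (recursive form, adequate for small topologies)."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])
    index, low, on_stack, stack, components = {}, {}, set(), [], []

    def visit(node):
        index[node] = low[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for nxt in graph[node]:
            if nxt not in index:
                visit(nxt)
                low[node] = min(low[node], low[nxt])
            elif nxt in on_stack:
                low[node] = min(low[node], index[nxt])
        if low[node] == index[node]:  # node is the root of a component
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(frozenset(component))

    for node in graph:
        if node not in index:
            visit(node)
    return components


def compress_loops(edges):
    """Return (edges between compressed nodes, original -> compressed name)."""
    name_of = {}
    for component in strongly_connected_components(edges):
        name = "_".join(sorted(component))  # assumed naming convention
        for member in component:
            name_of[member] = name
    compressed = {
        (name_of[s], name_of[d]) for s, d in edges if name_of[s] != name_of[d]
    }
    return compressed, name_of
```

Applied to `EDGES`, the sketch merges MDN with NVOD Server and MEM with UMS and IB, and the duplicate connections from MEM and IB to CA collapse into one edge because the result is a set.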
S503, the server obtains the topology decomposition result and the access node list according to the out-degree of each node in the first communication topology.
Specifically, the server performs shift-out processing on the nodes with an out-degree of 0 in the first communication topology to obtain a second communication topology and the first shift-out nodes; the server performs shift-out processing on the nodes with an out-degree of 0 in the second communication topology to obtain a third communication topology and the second shift-out nodes; the process of shifting out the nodes with an out-degree of 0 is repeated until all the nodes are shifted out; and the server obtains the topology decomposition result and the access node list according to the nodes shifted out each time. The process of obtaining the topology decomposition result and the access node list is described in detail below by using specific embodiments.
Referring to fig. 6, the out-degree of each node is shown in table five.
Table 5
Node | Out-degree
STB | 5
MEM_UMS_IB | 2
MDN_NVOD Server | 2
CA | 1
FTP | 0
FCC | 0
DSLAM | 0
NTP | 0
The nodes with an out-degree of 0 are moved out of the first communication topology to obtain the second communication topology and the first shift-out nodes. In this process, when a node is removed, the connections associated with the node are also removed. The out-degree of the remaining nodes may therefore change and needs to be recalculated. According to Table 5, the out-degrees of the four nodes FTP, FCC, DSLAM, and NTP are all 0, so these four nodes are removed from the first communication topology shown in fig. 6 to obtain the second communication topology and the first shift-out nodes FTP, FCC, DSLAM, and NTP. Fig. 7 is a schematic structural diagram of a second communication topology provided in the embodiment of the present invention.
Those skilled in the art will appreciate that when NTP is removed, the connection from the STB to NTP is also removed, and the out-degree of the STB is reduced by one accordingly, as shown in fig. 7. The out-degree of each node in the second communication topology shown in fig. 7 is shown in Table 6.
Table 6
Node | Out-degree
STB | 3
MEM_UMS_IB | 2
MDN_NVOD Server | 1
CA | 0
As shown in table six, the out-degree of the CA is 0, so the CA node is moved out in the second communication topology, and a third communication topology and a second moved-out node CA are obtained. Fig. 8 is a schematic structural diagram of a third communication topology provided in the embodiment of the present invention. The out-degree of each node in the third communication topology shown in fig. 8 is shown in table seven.
Table 7
Node | Out-degree
STB | 2
MEM_UMS_IB | 1
MDN_NVOD Server | 0
According to Table 7, the out-degree of MDN_NVOD Server is 0, so the MDN_NVOD Server node is shifted out of the third communication topology, and the fourth communication topology and the third shift-out node MDN_NVOD Server are obtained. Fig. 9 is a schematic structural diagram of a fourth communication topology provided in the embodiment of the present invention. The out-degree of each node in the fourth communication topology shown in fig. 9 is shown in Table 8.
Table 8
Node | Out-degree
STB | 1
MEM_UMS_IB | 0
As shown in Table 8, the out-degree of MEM_UMS_IB is 0, so MEM_UMS_IB is shifted out of the fourth communication topology, and a fifth communication topology and the fourth shift-out node MEM_UMS_IB are obtained. Finally, the STB node in the fifth communication topology is shifted out to obtain the fifth shift-out node STB, which completes the process of shifting out all the nodes.
From the above, the node moved in each move-out processing process can be obtained, and specific reference can be made to table nine.
Table 9
Shift-out round | Shift-out nodes
First shift-out nodes | FTP, FCC, DSLAM, NTP
Second shift-out node | CA
Third shift-out node | MDN_NVOD Server
Fourth shift-out node | MEM_UMS_IB
Fifth shift-out node | STB
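The repeated shift-out of nodes with an out-degree of 0 can be sketched as follows. The adjacency mapping below is derived from Table 4 after loop compression and is consistent with the out-degrees in Table 5; it is an assumed reading of fig. 6, and the function name is illustrative.

```python
# First communication topology: node -> nodes it directly accesses.
FIRST_TOPOLOGY = {
    "STB": {"CA", "MDN_NVOD Server", "MEM_UMS_IB", "NTP", "DSLAM"},
    "MEM_UMS_IB": {"CA", "MDN_NVOD Server"},
    "MDN_NVOD Server": {"CA", "FCC"},
    "CA": {"FTP"},
    "FTP": set(),
    "FCC": set(),
    "DSLAM": set(),
    "NTP": set(),
}


def decompose(topology):
    """Return the nodes shifted out in each round, in round order."""
    remaining = {node: set(targets) for node, targets in topology.items()}
    rounds = []
    while remaining:
        # Nodes whose out-degree is currently 0.
        moved_out = sorted(n for n, targets in remaining.items() if not targets)
        if not moved_out:
            raise ValueError("topology still contains a loop; compress loops first")
        rounds.append(moved_out)
        for node in moved_out:
            del remaining[node]
        # Removing a node also removes the connections pointing to it,
        # which lowers the out-degree of the nodes that accessed it.
        for targets in remaining.values():
            targets.difference_update(moved_out)
    return rounds


for number, nodes in enumerate(decompose(FIRST_TOPOLOGY), start=1):
    print(number, nodes)
```

The five rounds produced by this sketch correspond to the five rows of the table above; within a round the nodes are sorted only to make the output deterministic.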
And the server sorts the topology nodes according to the topology decomposition sequence. Specifically, a topology decomposition result is obtained according to the moving-out processing sequence corresponding to each moving-out node. Wherein, the priority of the shifted-out nodes in the same shifting-out processing process is the same. Specifically, the priority of the first time shift-out node is the highest, the priority of the second time shift-out node is the next highest, and so on. For nodes located on the ring, the priority of all nodes on the ring is equal to the priority of the corresponding virtual node. The resulting prioritization is shown in table ten.
Table 10
Node | Sequence number
FCC | 1
DSLAM | 1
NTP | 1
FTP | 1
CA | 2
MDN | 3
NVOD Server | 3
MEM | 4
UMS | 4
IB | 4
STB | 5
Wherein, the smaller the sequence number, the higher the corresponding priority, thereby obtaining the topology decomposition result.
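Turning the shift-out rounds into per-node sequence numbers, with every node on a loop inheriting the sequence number of its virtual node, can be sketched as follows. The round data restates the shift-out order described above; the mapping from virtual nodes to their members and the function name are written out here as assumptions of this sketch.

```python
# Shift-out rounds, in order (first round has the highest priority).
ROUNDS = [
    ["FTP", "FCC", "DSLAM", "NTP"],
    ["CA"],
    ["MDN_NVOD Server"],
    ["MEM_UMS_IB"],
    ["STB"],
]

# Original nodes compressed into each virtual node.
VIRTUAL_MEMBERS = {
    "MDN_NVOD Server": ["MDN", "NVOD Server"],
    "MEM_UMS_IB": ["MEM", "UMS", "IB"],
}


def priorities_from_rounds(rounds, virtual_members):
    """Map every original node to the number of the round that removed it."""
    priority = {}
    for sequence_number, moved_out in enumerate(rounds, start=1):
        for node in moved_out:
            # A virtual node passes its sequence number to all its members.
            for original in virtual_members.get(node, [node]):
                priority[original] = sequence_number
    return priority


print(priorities_from_rounds(ROUNDS, VIRTUAL_MEMBERS))
```

Nodes removed in the same round share a sequence number, so the result is a partial ranking rather than a strict ordering.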
Optionally, in the process of moving out the node, the access node list is generated at the same time. Fig. 10 is a third schematic flowchart of an alarm analysis method according to an embodiment of the present invention. As shown in fig. 10, the method includes:
S1001, when a first node with zero out-degree is moved out, the server determines a second node corresponding to the first node, wherein the second node is a direct access node of the first node;
S1002, the server stores the second node into an access node list corresponding to the first node;
S1003, the server judges whether the first node exists in an access node list of the third node, if so, S1004 is executed, and if not, S1005 is executed;
S1004, the server stores the second node into an access node list of a third node;
S1005, the server sequentially traverses all the removed nodes according to the moving-out sequence of each node to obtain an access node list.
And if the access node list comprises the virtual nodes, the access node list comprises all the original nodes corresponding to the virtual nodes.
As a specific example, when the first shift-out node FTP is removed, the server determines from the current communication topology that the direct access node of FTP is CA and stores CA in the access node list of FTP. The server then checks whether FTP exists in the access node lists of other nodes; it does not, so the server moves on to FCC. The server determines from the current communication topology that the direct access node of FCC is MDN_NVOD Server, stores MDN_NVOD Server in the access node list of FCC, then checks whether FCC exists in the access node lists of other nodes, and so on. Later in this process, for example when MDN_NVOD Server is removed, the server determines from the current communication topology that the direct access nodes of MDN_NVOD Server are STB and MEM_UMS_IB and stores STB and MEM_UMS_IB in the access node list of MDN_NVOD Server. Because MDN_NVOD Server is in the access node list of FCC, the server also stores STB and MEM_UMS_IB in the access node list of FCC, and so on. The access node list finally obtained through this process is shown in Table 11.
Table 11
Node | Access node list
STB | (empty)
MEM_UMS_IB | STB
MDN_NVOD Server | MEM_UMS_IB, STB
CA | MDN_NVOD Server, MEM_UMS_IB, STB
FTP | CA, MDN_NVOD Server, MEM_UMS_IB, STB
FCC | MDN_NVOD Server, MEM_UMS_IB, STB
DSLAM | STB
NTP | STB
For nodes located on a ring, the access node list of each node on the ring is the same as the access node list of the corresponding virtual node. For example, since the access node list of MEM_UMS_IB is {STB}, the access node lists of MEM, UMS and IB are each {STB}.
If the access node list of a certain node contains a virtual node, this is equivalent to the access node list of that node containing all the original nodes corresponding to the virtual node. For example, since the access node list of CA contains MDN_NVOD Server, this is equivalent to the access node list of CA containing MDN and NVOD Server.
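Steps S1001 to S1005 and the expansion of virtual nodes can be sketched as follows. The `ACCESSED_BY` mapping (each node and the nodes that directly access it) is derived from Table 4 after loop compression and is an assumption of this sketch, as are the function names.

```python
ACCESSED_BY = {  # node -> nodes that directly access it
    "FTP": {"CA"},
    "FCC": {"MDN_NVOD Server"},
    "DSLAM": {"STB"},
    "NTP": {"STB"},
    "CA": {"MDN_NVOD Server", "MEM_UMS_IB", "STB"},
    "MDN_NVOD Server": {"MEM_UMS_IB", "STB"},
    "MEM_UMS_IB": {"STB"},
    "STB": set(),
}
MOVE_OUT_ORDER = ["FTP", "FCC", "DSLAM", "NTP", "CA",
                  "MDN_NVOD Server", "MEM_UMS_IB", "STB"]
VIRTUAL_MEMBERS = {
    "MDN_NVOD Server": ["MDN", "NVOD Server"],
    "MEM_UMS_IB": ["MEM", "UMS", "IB"],
}


def build_access_node_lists(move_out_order, accessed_by):
    """Build the access node list of every node while nodes are moved out."""
    lists = {}
    for first in move_out_order:
        direct = accessed_by[first]          # S1001: direct access nodes
        for nodes in lists.values():         # S1003: earlier ("third") nodes
            if first in nodes:
                nodes.update(direct)         # S1004: propagate to third node
        lists[first] = set(direct)           # S1002: store for the first node
    return lists


def expand_virtual_nodes(lists, virtual_members):
    """Replace virtual nodes by their original nodes, as keys and as entries."""
    def members(node):
        return virtual_members.get(node, [node])

    expanded = {}
    for node, accessors in lists.items():
        flat = {m for accessor in accessors for m in members(accessor)}
        for original in members(node):
            expanded[original] = set(flat)
    return expanded


lists = build_access_node_lists(MOVE_OUT_ORDER, ACCESSED_BY)
per_node = expand_virtual_nodes(lists, VIRTUAL_MEMBERS)
```

Because a node is moved out only after every node it accesses has been moved out, propagating its direct access nodes into the lists that already contain it is enough to collect indirect access nodes as well.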
Those skilled in the art can understand that, because the nodes in the loop are compressed, the access relationship in the access node list in this embodiment is a one-way access relationship, that is, only the implementation form that the node a accesses the node B exists, and the implementation form that the node B accesses the node a does not exist.
After the topology decomposition result and the access node list are obtained, the root cause alarm can be determined. Here, the above telecommunication system is taken as an example, and a specific implementation manner of determining the root cause alarm is described again. This embodiment gives the following possible embodiments, and other similar ways are within the scope of the embodiments of the present invention.
In a possible implementation, the alarm node set includes at least two alarm nodes. The server determines a first alarm node with the highest alarm priority in the alarm node set and marks the first alarm node. The server then executes a deletion operation on the alarm node set according to the access node list of the first alarm node, where the deleted nodes are the first alarm node and the first access alarm node; the first access alarm node is a node in the alarm node set that is an access node of the first alarm node. Next, the server determines a second alarm node with the highest alarm priority among the remaining nodes in the alarm node set and marks the second alarm node. The server executes a deletion operation on the alarm node set according to the access node list of the second alarm node, where the deleted nodes are the second alarm node and the second access alarm node; the second access alarm node is a node in the alarm node set that is an access node of the second alarm node. The marking and deletion are repeated for the remaining nodes in the alarm node set until the alarm node set is empty. The server then obtains the target node corresponding to the root cause alarm from the marked alarm nodes. In this embodiment, marking may consist of storing the alarm node in a root cause alarm list or marking the alarm node in the initial communication topology; the specific implementation of the marking is not particularly limited in this embodiment.
In another possible implementation, the alarm node set includes a plurality of alarm nodes. The server determines a first alarm node with the highest alarm priority in the alarm node set and marks the first alarm node within the alarm node set. The server then executes a deletion operation on the alarm node set according to the access node list of the first alarm node, where the deleted node is the first access alarm node; the first access alarm node is a node in the alarm node set that is an access node of the first alarm node. Next, the server determines a second alarm node with the highest alarm priority among the remaining nodes in the alarm node set other than the first alarm node and marks the second alarm node within the alarm node set. The server executes a deletion operation on the alarm node set according to the access node list of the second alarm node, where the deleted node is the second access alarm node; the second access alarm node is a node in the alarm node set that is an access node of the second alarm node. The marking and deletion are repeated for the remaining nodes in the alarm node set other than the marked nodes until only marked alarm nodes exist in the alarm node set. The server then obtains the target node corresponding to the root cause alarm from the marked alarm nodes. In this embodiment, marking may consist of hiding and/or locking the alarm nodes in the alarm node set, to prevent a marked alarm node from being selected by priority again and/or from being deleted by mistake.
Two specific embodiments are given below for the above-described telecommunication system.
Fig. 11 is a first schematic alarm diagram of a telecommunication system according to an embodiment of the present invention. As shown in fig. 11, there are three alarms, the alarm set is { alarm 1, alarm 2, alarm 3}, and the set of alarm nodes generating the alarms is { FTP, MDN, STB }. According to Table 10, the sequence number of FTP is the smallest, so alarm 1 is marked. According to Table 11, the access node list of FTP is { CA, MDN_NVOD Server, MEM_UMS_IB, STB }, that is, { CA, MDN, NVOD Server, MEM, UMS, IB, STB }, so FTP, MDN and STB need to be deleted from the alarm node set, and at the same time alarm 1 (the marked alarm), alarm 2 (generated by MDN), and alarm 3 (generated by STB) need to be deleted from the alarm set. At this point, the remaining alarm node set and the remaining alarm set are both empty, and the alarm root cause analysis ends. The root cause alarm is finally determined to be alarm 1, namely the alarm generated by FTP.
Fig. 12 is a second schematic alarm diagram of the telecommunication system according to the embodiment of the present invention. As shown in fig. 12, there are four alarms, the alarm set is { alarm 1, alarm 2, alarm 3, alarm 4}, and the set of alarm nodes generating the alarms is { FCC, MDN, STB, CA }. According to Table 10, FCC has the smallest sequence number, so alarm 1 is marked. According to Table 11, the access node list of FCC is { MDN_NVOD Server, MEM_UMS_IB, STB }, that is, { MDN, NVOD Server, MEM, UMS, IB, STB }, so FCC, MDN and STB need to be deleted from the alarm node set, and at the same time alarm 1 (the marked alarm), alarm 2 (generated by MDN), and alarm 3 (generated by STB) need to be deleted from the alarm set. At this point, the remaining alarm node set is { CA }, and the remaining alarm set is { alarm 4 }. Alarm 4 is therefore marked and removed from the alarm set, and CA is removed from the alarm node set. The alarm root cause analysis then ends. The root cause alarms are finally determined to be alarm 1 and alarm 4, namely the alarms generated by FCC and CA.
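The second implementation described above, in which marked alarm nodes stay in the alarm node set and only their access alarm nodes are deleted, can be sketched and checked against the two examples as follows. The dictionaries restate the sequence numbers and the access node lists with virtual nodes expanded into their original nodes; the function name and the alphabetical tie-break between equal sequence numbers are assumptions.

```python
PRIORITY = {  # sequence numbers; smaller means higher priority
    "FCC": 1, "DSLAM": 1, "NTP": 1, "FTP": 1, "CA": 2,
    "MDN": 3, "NVOD Server": 3, "MEM": 4, "UMS": 4, "IB": 4, "STB": 5,
}

_LOOP_A = {"MEM", "UMS", "IB"}
_LOOP_B = {"MDN", "NVOD Server"}
ACCESS_NODES = {  # access node lists with virtual nodes expanded
    "STB": set(),
    "MEM": {"STB"}, "UMS": {"STB"}, "IB": {"STB"},
    "MDN": _LOOP_A | {"STB"}, "NVOD Server": _LOOP_A | {"STB"},
    "CA": _LOOP_B | _LOOP_A | {"STB"},
    "FTP": {"CA"} | _LOOP_B | _LOOP_A | {"STB"},
    "FCC": _LOOP_B | _LOOP_A | {"STB"},
    "DSLAM": {"STB"}, "NTP": {"STB"},
}


def root_cause_nodes_by_marking(alarm_nodes, priority=PRIORITY,
                                access_nodes=ACCESS_NODES):
    """Mark the highest-priority unmarked alarm node, delete its access
    alarm nodes, and repeat until only marked nodes remain in the set."""
    alarm_set = set(alarm_nodes)
    marked = set()
    while alarm_set - marked:
        # Assumed tie-break: alphabetical order among equal sequence numbers.
        target = min(alarm_set - marked, key=lambda n: (priority[n], n))
        marked.add(target)
        # Marked nodes are "locked" and are never deleted.
        alarm_set -= access_nodes[target] - marked
    return sorted(alarm_set)


print(root_cause_nodes_by_marking({"FTP", "MDN", "STB"}))        # ['FTP']
print(root_cause_nodes_by_marking({"FCC", "MDN", "STB", "CA"}))  # ['CA', 'FCC']
```

In this sketch, two alarming nodes on the same loop, such as MDN and NVOD Server, do not appear in each other's access node lists, so both remain in the result when neither is deleted by a higher-priority node.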
As can be understood by those skilled in the art, for a virtual node, if the virtual node is a target node corresponding to a root cause alarm, the alarms generated by all the nodes in the virtual node are root cause alarms. Therefore, when at least two nodes in a loop generate alarms, either all of these nodes are nodes corresponding to root cause alarms, or none of the alarms generated by these nodes is a root cause alarm. For example, if the MDN shown in fig. 4 generates alarm 1 and the NVOD Server generates alarm 2, either both alarms are determined to be root cause alarms, in which case the nodes generating root cause alarms are the MDN and the NVOD Server, or both alarms are determined to be non-root cause alarms. Other loops are similar, and the description is omitted here.
In summary, the embodiment of the present invention determines the root cause alarm directly from the topology decomposition result, without depending on manual analysis. The topology decomposition result can be obtained in advance from the communication topology and used directly in the alarm root cause analysis, so the analysis efficiency is high. The embodiment of the invention can also update the system communication topology in time according to reported changes in the call relationships, and can therefore cope with dynamic changes of the system. Meanwhile, the process of acquiring the access node list and the topology decomposition result and the process of alarm root cause analysis are simple, and only the call relationship extraction has a certain complexity, so the overall implementation difficulty is low. Furthermore, the resource overhead of the embodiment of the invention is relatively small, and only the call relationship reporting step consumes noticeable resources.
The above describes the scheme provided by the embodiment of the present invention with respect to the functions implemented by the server. It is understood that, in order to implement the above functions, the server includes hardware structures and/or software modules for performing the respective functions. The elements and algorithm steps of the various examples described in connection with the embodiments disclosed herein may be embodied in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends upon the particular application and design constraints of the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present teachings.
In the embodiment of the present invention, the server may be divided into the functional modules according to the above method example, for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing unit. The integrated unit can be realized in a form of hardware or a form of a software functional module. It should be noted that, the division of the modules in the embodiment of the present invention is schematic, and is only a logic function division, and there may be another division manner in actual implementation.
Fig. 13 is a schematic structural diagram of a server according to an embodiment of the present invention. As shown in fig. 13, the server 1300 includes: an alarm obtaining module 1301 and an alarm determining module 1302. Optionally, the server further includes a relationship obtaining module 1303, a relationship processing module 1304, and a generating module 1305.
An alarm obtaining module 1301, configured to obtain an alarm reported by at least one alarm node, where the alarm node is a node that generates an alarm among the multiple nodes;
an alarm determining module 1302, configured to determine, according to the topology decomposition result of the multiple nodes and the access node list of each alarm node, a target node corresponding to a root cause alarm in the at least one alarm node;
wherein the topology decomposition result comprises a priority ranking of the alarms generated by the plurality of nodes as root cause alarms, the priority ranking being determined according to the out-degree of each of the plurality of nodes.
Optionally, the alarm determining module 1302 is specifically configured to:
according to the priority ranking and the access node list of each alarm node, sequentially excluding a target alarm node and a target access alarm node from the alarm nodes, wherein the target alarm node is the alarm node with the highest alarm priority in the remaining alarm nodes, the target access alarm node is the access node of the target alarm node, and the target access alarm node is the alarm node generating the alarm;
and determining the target node corresponding to the root cause alarm according to the excluded target alarm node.
Optionally, the alarm determining module 1302 is specifically configured to:
establishing an alarm node set, wherein the alarm node set comprises at least one alarm node, and the target alarm node and the target access alarm node are alarm nodes in the alarm node set;
determining a first alarm node with the highest alarm priority in the alarm node set, and marking the first alarm node; executing deletion operation on the alarm node set according to the access node list of the first alarm node, wherein the first alarm node and the first access alarm node are deleted nodes, the first alarm node is a target alarm node in the alarm node set, and the first access alarm node is a target access alarm node in the alarm node set;
determining, among the remaining nodes in the alarm node set, a second alarm node with the highest alarm priority, and marking the second alarm node; executing a deletion operation on the alarm node set according to the access node list of the second alarm node, wherein the second alarm node and the second access alarm node are the deleted nodes, the second alarm node is a target alarm node in the alarm node set, and the second access alarm node is a target access alarm node in the alarm node set;
repeating the marking and deletion for the remaining nodes in the alarm node set until the alarm node set is empty;
and obtaining a target node corresponding to the root cause alarm according to the alarm node subjected to the marking processing.
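The mark-and-delete loop performed by the alarm determining module can be illustrated with a minimal Python sketch. The function name, the dictionary-based data layout, the convention that a smaller rank number means a higher priority, and the tie-break on the node identifier are all assumptions made for illustration; the embodiment does not prescribe them.

```python
def find_root_cause_nodes(alarm_nodes, priority, access_lists):
    """Return the marked alarm nodes (root cause candidates).

    alarm_nodes:  iterable of node identifiers that reported an alarm
    priority:     dict node -> rank, where 1 is the highest priority (assumed convention)
    access_lists: dict node -> set of nodes that access it, directly or indirectly
    """
    remaining = set(alarm_nodes)   # the alarm node set
    marked = []                    # alarm nodes that have been marked
    while remaining:
        # Pick the remaining alarm node with the highest priority;
        # ties are broken by node identifier only to keep the result deterministic.
        target = min(remaining, key=lambda n: (priority[n], n))
        marked.append(target)
        # Delete the target alarm node and every alarming node in its access node list.
        remaining.discard(target)
        remaining -= set(access_lists.get(target, ()))
    return marked


# Hypothetical topology: web -> app -> db, and other -> cache.
priority = {"db": 1, "cache": 1, "app": 2, "other": 2, "web": 3}
access_lists = {"db": {"app", "web"}, "cache": {"other"}, "app": {"web"}}
print(find_root_cause_nodes(["web", "app", "db", "other"], priority, access_lists))
# prints ['db', 'other']
```

In this example the alarms from "app" and "web" are attributed to "db" because both appear in its access node list, while "other" is reported separately because no alarming node that it accesses remains in the set.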
Optionally, the alarm obtaining module 1301 is specifically configured to:
and acquiring the alarm reported by the at least one alarm node within a preset time range.
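As a rough illustration of restricting the analysis to alarms reported within a preset time range, the following Python sketch filters alarms by timestamp. The tuple layout of an alarm, the function name, and the choice of a closed interval ending at a reference time are assumptions; the embodiment does not specify how the range is defined.

```python
from datetime import datetime, timedelta


def alarm_nodes_in_window(alarms, window_end, window_length):
    """Return the set of nodes whose alarms fall inside the preset time range.

    alarms: iterable of (node_id, reported_at) pairs (hypothetical layout)
    """
    window_start = window_end - window_length
    return {node for node, reported_at in alarms
            if window_start <= reported_at <= window_end}


now = datetime(2017, 1, 16, 12, 0, 0)
alarms = [
    ("db", now - timedelta(seconds=30)),
    ("app", now - timedelta(seconds=10)),
    ("web", now - timedelta(minutes=20)),   # falls outside the 5-minute window
]
print(sorted(alarm_nodes_in_window(alarms, now, timedelta(minutes=5))))
# prints ['app', 'db']
```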
Optionally, the relationship obtaining module 1303 is configured to obtain access relationship data reported by each node, where the access relationship data includes an identifier of an access node and an identifier of an accessed node;
the relationship processing module 1304 is configured to obtain a first communication topology according to the access relationship data;
the generating module 1305 is configured to obtain the topology decomposition result and the access node list according to the out-degree of each node in the first communication topology.
Optionally, the relationship processing module 1304 is specifically configured to:
establishing an initial communication topology according to the access relation data;
and compressing a plurality of nodes on the same loop in the initial communication topology into a virtual node to establish the first communication topology.
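One possible way to build the initial communication topology and compress loops into virtual nodes is sketched below in Python. It treats each piece of access relationship data as an (access node, accessed node) pair and uses Tarjan's strongly connected components algorithm to find nodes on the same loop. The choice of algorithm, the function names, and the naming scheme for virtual nodes are assumptions for illustration and are not taken from the embodiment.

```python
def build_topology(access_pairs):
    """Initial topology: access node -> set of nodes it directly accesses."""
    graph = {}
    for accessor, accessed in access_pairs:
        graph.setdefault(accessor, set()).add(accessed)
        graph.setdefault(accessed, set())
    return graph


def strongly_connected_components(graph):
    """Tarjan's algorithm (recursive, so only suitable for modest graph depth)."""
    index, low, on_stack, stack, components = {}, {}, set(), [], []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:          # v is the root of a component
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            components.append(component)

    for v in graph:
        if v not in index:
            visit(v)
    return components


def compress_loops(graph):
    """Return (compressed topology, members), where members maps each node of
    the compressed topology to the original nodes it stands for."""
    owner, members = {}, {}
    for component in strongly_connected_components(graph):
        if len(component) > 1:
            name = "virtual(" + ",".join(sorted(component)) + ")"  # hypothetical naming
        else:
            name = next(iter(component))
        members[name] = set(component)
        for node in component:
            owner[node] = name
    compressed = {name: set() for name in members}
    for v, targets in graph.items():
        for w in targets:
            if owner[v] != owner[w]:    # edges inside a loop are dropped
                compressed[owner[v]].add(owner[w])
    return compressed, members


pairs = [("web", "app"), ("app", "svc_a"), ("svc_a", "svc_b"),
         ("svc_b", "svc_a"), ("svc_b", "db")]
topology, members = compress_loops(build_topology(pairs))
print(topology["virtual(svc_a,svc_b)"])
# prints {'db'}
```

After compression the topology contains no loops, so at least one node always has an out-degree of 0 and the subsequent decomposition by out-degree can proceed.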
Optionally, the generating module 1305 is specifically configured to:
moving out the nodes whose out-degree is 0 in the first communication topology to obtain a second communication topology and first moved-out nodes;
moving out the nodes whose out-degree is 0 in the second communication topology to obtain a third communication topology and second moved-out nodes;
repeating the moving out of nodes whose out-degree is 0 until all nodes have been moved out;
and obtaining the topology decomposition result and the access node list according to the nodes moved out each time.
Optionally, the generating module 1305 is specifically configured to:
obtaining the topology decomposition result according to the order of the moving-out processing in which each node was moved out;
wherein nodes moved out in the same moving-out processing have the same priority, the priority of the nodes moved out in the Nth moving-out processing is higher than that of the nodes moved out in the (N+1)th moving-out processing, and N is an integer greater than 0.
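The repeated moving out of nodes whose out-degree is 0, and the priority ranking derived from it, can be sketched in Python as follows. The topology is assumed to be a dictionary mapping each node to the set of nodes it accesses, with loops already compressed; the function names and the use of rank 1 for the highest priority are illustrative assumptions.

```python
def decompose_by_out_degree(graph):
    """Return a list of layers; layer N holds the nodes moved out in round N+1.

    graph: dict node -> set of nodes it directly accesses (must be loop-free)
    """
    remaining = {node: set(targets) for node, targets in graph.items()}
    layers = []
    while remaining:
        # Nodes that no longer access any remaining node have out-degree 0.
        layer = sorted(node for node, targets in remaining.items() if not targets)
        if not layer:
            raise ValueError("topology contains a loop; compress loops first")
        layers.append(layer)
        for node in layer:
            del remaining[node]
        moved_out = set(layer)
        for targets in remaining.values():
            targets -= moved_out
    return layers


def priority_ranking(layers):
    """Nodes moved out in the same round share a rank; rank 1 is the highest."""
    return {node: rank
            for rank, layer in enumerate(layers, start=1)
            for node in layer}


# Hypothetical topology: web -> app -> {db, cache}, other -> cache.
graph = {"web": {"app"}, "app": {"db", "cache"}, "other": {"cache"},
         "db": set(), "cache": set()}
layers = decompose_by_out_degree(graph)
print(layers)
# prints [['cache', 'db'], ['app', 'other'], ['web']]
print(priority_ranking(layers)["app"])
# prints 2
```

Nodes that access nothing are moved out first and receive the highest priority, which matches the intuition that a fault in an accessed node tends to cause alarms in the nodes that access it.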
Optionally, the generating module 1305 is specifically configured to:
determining a second node corresponding to a first node every time the first node with zero out degree is moved out, wherein the second node is a direct access node of the first node;
storing the second node into an access node list corresponding to the first node;
judging whether the first node exists in the access node list of a third node, and if so, also storing the second node in the access node list of the third node;
traversing all the moved-out nodes in the order in which they were moved out to obtain the access node list;
if the access node list comprises the virtual nodes, the access node list comprises all original nodes corresponding to the virtual nodes.
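The construction of the access node lists can be sketched in Python as below. The sketch assumes the same dictionary-based topology as before, takes the move-out order as a flat list, and accepts an optional mapping from virtual nodes to their original nodes; these names and data layouts are hypothetical.

```python
def build_access_lists(graph, move_out_order, members=None):
    """Return dict node -> set of nodes that access it directly or indirectly.

    graph:          dict node -> set of nodes it directly accesses
    move_out_order: nodes in the order in which they were moved out
    members:        optional dict virtual node -> set of original nodes
    """
    # Invert the topology: accessed node -> its direct access nodes.
    direct = {node: set() for node in graph}
    for accessor, targets in graph.items():
        for accessed in targets:
            direct[accessed].add(accessor)

    access_lists = {}
    for first in move_out_order:
        second_nodes = direct[first]
        access_lists[first] = set(second_nodes)
        # Any earlier node (a "third node") whose list already contains the
        # first node is also accessed, indirectly, by the second nodes.
        for third, nodes in access_lists.items():
            if third != first and first in nodes:
                nodes |= second_nodes

    if members:
        # Replace each virtual node in a list by its original nodes.
        expanded = {}
        for node, nodes in access_lists.items():
            expanded[node] = set()
            for n in nodes:
                expanded[node] |= set(members.get(n, {n}))
        access_lists = expanded
    return access_lists


graph = {"web": {"app"}, "app": {"db"}, "other": {"cache"},
         "db": set(), "cache": set()}
order = ["cache", "db", "app", "other", "web"]
print(sorted(build_access_lists(graph, order)["db"]))
# prints ['app', 'web']
```

Because a node can be moved out only after every node it accesses has been moved out, each node's direct access nodes are still unprocessed when it is visited, so propagating them to earlier lists accumulates the indirect access nodes as well.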
The server provided in this embodiment may execute the method embodiments described above, and the implementation principle and the technical effect are similar, which are not described herein again.
In a specific implementation of the foregoing server, the alarm obtaining module and the relationship obtaining module may be implemented as communication interfaces, and the alarm determining module, the relationship processing module, and the generating module may be implemented as processors. The data and program code may be stored in a memory and executed by a processor according to corresponding program instructions.
Fig. 14 is a schematic diagram of a hardware structure of a server according to an embodiment of the present invention. As shown in fig. 14, the server 1400 provided in the present embodiment includes: at least one processor 1401, memory 1402, and a bus 1403. Optionally, a communication interface 1404 is also included. Memory 1402 stores computer-executable instructions; the at least one processor 1401 executes the computer-executable instructions stored by the memory 1402 to cause the server to perform the alarm analysis method described above with respect to fig. 1-12.
In addition, an embodiment of the present invention further provides a computer-readable storage medium, where computer-executable instructions are stored in the computer-readable storage medium, and when at least one processor of the server executes the computer-executable instructions, the server executes the alarm analysis method provided by the above various possible designs.
Also provided in an embodiment of the present invention is a computer program product including computer executable instructions stored in a computer readable storage medium. The computer executable instructions may be read by at least one processor of the server from a computer readable storage medium, and execution of the computer executable instructions by the at least one processor causes the server to implement the alarm analysis method provided by the various possible designs in the foregoing method embodiments.
Those of ordinary skill in the art will understand that all or some of the steps of the foregoing method embodiments may be implemented by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the foregoing method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
Finally, it should be noted that the foregoing embodiments are intended only to illustrate the technical solutions of the embodiments of the present invention, not to limit them. Although the embodiments of the present invention have been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (15)

1. An alarm analysis method, applied to a network management system, wherein the network management system comprises a server and a plurality of nodes, the method comprising:
the server obtains an alarm reported by at least one alarm node, wherein the alarm node is a node, among the plurality of nodes, that generates an alarm;
the server determines a target node corresponding to the root cause alarm in the at least one alarm node according to the topology decomposition results of the plurality of nodes and the access node list of each alarm node;
wherein the topology decomposition result comprises a priority ranking of the alarms generated by the plurality of nodes as root cause alarms, the priority ranking being determined according to the out-degree of each of the plurality of nodes;
wherein the determining, by the server, of a target node corresponding to the root cause alarm in the at least one alarm node according to the topology decomposition result of the plurality of nodes and the access node list of each alarm node comprises:
the server establishes an alarm node set, wherein the alarm node set comprises the at least one alarm node;
the server determines a first alarm node with the highest alarm priority in the alarm node set and carries out marking processing on the first alarm node; executing deletion operation on the alarm node set according to the access node list of the first alarm node, wherein the first alarm node and the first access alarm node are deleted nodes, the first alarm node is a target alarm node in the alarm node set, and the first access alarm node is a target access alarm node in the alarm node set; the target alarm node is an alarm node with the highest alarm priority in the current rest alarm nodes, the target access alarm node is an access node of the target alarm node, and the target access alarm node is an alarm node generating an alarm;
the server determines, among the remaining nodes in the alarm node set, a second alarm node with the highest alarm priority, and marks the second alarm node; the server executes a deletion operation on the alarm node set according to the access node list of the second alarm node, wherein the second alarm node and the second access alarm node are the deleted nodes, the second alarm node is a target alarm node in the alarm node set, and the second access alarm node is a target access alarm node in the alarm node set;
repeating the marking and deletion for the remaining nodes in the alarm node set until the alarm node set is empty;
and the server obtains a target node corresponding to the root cause alarm according to the alarm node which is marked and processed.
2. The method of claim 1, wherein the obtaining, by the server, the alarm reported by the at least one alarm node comprises:
and the server acquires the alarm reported by the at least one alarm node within a preset time range.
3. The method according to claim 1 or 2, wherein before the server determines a target node corresponding to the root cause alarm in the at least one alarm node according to the topology decomposition result of the plurality of nodes and the access node list of each alarm node, the method further comprises:
the server acquires access relation data reported by each node, wherein the access relation data comprise an identifier of an access node and an identifier of an accessed node, and the access node is a direct access node of the accessed node;
the server obtains a first communication topology according to the access relation data;
and the server obtains the topology decomposition result and the access node list according to the out-degree of each node in the first communication topology.
4. The method of claim 3, wherein the obtaining, by the server, of a first communication topology according to the access relationship data comprises:
the server establishes an initial communication topology according to the access relation data;
and the server compresses a plurality of nodes on the same loop in the initial communication topology into a virtual node to obtain the first communication topology.
5. The method of claim 4, wherein the obtaining, by the server, of the topology decomposition result and the access node list according to the out-degree of each node in the first communication topology comprises:
the server moves out the nodes whose out-degree is 0 in the first communication topology to obtain a second communication topology and first moved-out nodes;
the server moves out the nodes whose out-degree is 0 in the second communication topology to obtain a third communication topology and second moved-out nodes;
repeating the moving out of nodes whose out-degree is 0 until all nodes have been moved out;
and the server obtains the topology decomposition result and the access node list according to the nodes moved out each time.
6. The method of claim 5, wherein the obtaining, by the server, of the topology decomposition result according to the nodes moved out each time comprises:
the server obtains the topology decomposition result according to the order of the moving-out processing in which each node was moved out;
wherein nodes moved out in the same moving-out processing have the same priority, the priority of the nodes moved out in the Nth moving-out processing is higher than that of the nodes moved out in the (N+1)th moving-out processing, and N is an integer greater than 0.
7. The method of claim 5, wherein the obtaining, by the server, of the access node list according to the nodes moved out each time comprises:
when a first node with zero out degree is moved out, the server determines a second node corresponding to the first node, wherein the second node is a direct access node of the first node;
the server stores the second node into an access node list corresponding to the first node;
the server judges whether the first node exists in the access node list of a third node, and if so, also stores the second node in the access node list of the third node;
the server traverses all the moved-out nodes in the order in which they were moved out to obtain the access node list;
if the access node list comprises the virtual nodes, the access node list comprises all original nodes corresponding to the virtual nodes.
8. A server, applied to a network management system including the server and a plurality of nodes, the server comprising:
an alarm obtaining module, configured to obtain an alarm reported by at least one alarm node, wherein the alarm node is a node, among the plurality of nodes, that generates an alarm;
an alarm determining module, configured to determine a target node corresponding to the root cause alarm in the at least one alarm node according to the topology decomposition result of the plurality of nodes and the access node list of each alarm node;
wherein the topology decomposition result comprises a priority ranking of the alarms generated by the plurality of nodes as root cause alarms, the priority ranking being determined according to the out-degree of each of the plurality of nodes;
wherein the alarm determining module is specifically configured to: establishing an alarm node set, wherein the alarm node set comprises the at least one alarm node;
determining a first alarm node with the highest alarm priority in the alarm node set, and marking the first alarm node; executing deletion operation on the alarm node set according to the access node list of the first alarm node, wherein the first alarm node and the first access alarm node are deleted nodes, the first alarm node is a target alarm node in the alarm node set, and the first access alarm node is a target access alarm node in the alarm node set; the target alarm node is an alarm node with the highest alarm priority in the current rest alarm nodes, the target access alarm node is an access node of the target alarm node, and the target access alarm node is an alarm node generating an alarm;
determining, among the remaining nodes in the alarm node set, a second alarm node with the highest alarm priority, and marking the second alarm node; executing a deletion operation on the alarm node set according to the access node list of the second alarm node, wherein the second alarm node and the second access alarm node are the deleted nodes, the second alarm node is a target alarm node in the alarm node set, and the second access alarm node is a target access alarm node in the alarm node set;
repeating the marking and deletion for the remaining nodes in the alarm node set until the alarm node set is empty;
and obtaining a target node corresponding to the root cause alarm according to the alarm node subjected to the marking processing.
9. The server according to claim 8, wherein the alarm obtaining module is specifically configured to:
and acquiring the alarm reported by the at least one alarm node within a preset time range.
10. The server according to claim 8 or 9, further comprising a relationship obtaining module, a relationship processing module, and a generating module, wherein
the relationship obtaining module is configured to obtain access relationship data reported by each node, wherein the access relationship data comprises an identifier of an access node and an identifier of an accessed node;
the relationship processing module is used for obtaining a first communication topology according to the access relationship data;
the generating module is configured to obtain the topology decomposition result and the access node list according to the out-degree of each node in the first communication topology.
11. The server according to claim 10, wherein the relationship processing module is specifically configured to:
establishing an initial communication topology according to the access relation data;
and compressing a plurality of nodes on the same loop in the initial communication topology into a virtual node to establish the first communication topology.
12. The server according to claim 11, wherein the generating module is specifically configured to:
moving out the nodes whose out-degree is 0 in the first communication topology to obtain a second communication topology and first moved-out nodes;
moving out the nodes whose out-degree is 0 in the second communication topology to obtain a third communication topology and second moved-out nodes;
repeating the moving out of nodes whose out-degree is 0 until all nodes have been moved out;
and obtaining the topology decomposition result and the access node list according to the nodes moved out each time.
13. The server according to claim 12, wherein the generating module is specifically configured to:
obtaining the topology decomposition result according to the order of the moving-out processing in which each node was moved out;
wherein nodes moved out in the same moving-out processing have the same priority, the priority of the nodes moved out in the Nth moving-out processing is higher than that of the nodes moved out in the (N+1)th moving-out processing, and N is an integer greater than 0.
14. The server according to claim 12, wherein the generating module is specifically configured to:
determining a second node corresponding to a first node every time the first node with zero out degree is moved out, wherein the second node is a direct access node of the first node;
storing the second node into an access node list corresponding to the first node;
judging whether the first node exists in the access node list of a third node, and if so, also storing the second node in the access node list of the third node;
traversing all the moved-out nodes in the order in which they were moved out to obtain the access node list;
if the access node list comprises the virtual nodes, the access node list comprises all original nodes corresponding to the virtual nodes.
15. A server, comprising: at least one processor and memory;
the memory stores computer-executable instructions;
the at least one processor executes the computer-executable instructions stored in the memory, causing the server to perform the alarm analysis method of any one of claims 1 to 7.
CN201710033521.0A 2017-01-16 2017-01-16 Alarm analysis method and equipment Active CN108322318B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710033521.0A CN108322318B (en) 2017-01-16 2017-01-16 Alarm analysis method and equipment

Publications (2)

Publication Number Publication Date
CN108322318A CN108322318A (en) 2018-07-24
CN108322318B true CN108322318B (en) 2021-04-09

Family

ID=62892023

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710033521.0A Active CN108322318B (en) 2017-01-16 2017-01-16 Alarm analysis method and equipment

Country Status (1)

Country Link
CN (1) CN108322318B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109634819B (en) * 2018-10-26 2022-02-01 创新先进技术有限公司 Alarm root cause positioning method and device and electronic equipment
CN111669282B (en) * 2019-03-08 2023-10-24 华为技术有限公司 Method, device and computer storage medium for identifying suspected root cause alarm
CN110351118B (en) * 2019-05-28 2020-12-01 华为技术有限公司 Root cause alarm decision network construction method, device and storage medium
CN110535686B (en) * 2019-07-25 2021-12-31 深圳壹师城科技有限公司 Abnormal event processing method and device
CN110995482B (en) * 2019-11-27 2022-06-21 深圳市商汤科技有限公司 Alarm analysis method and device, computer equipment and computer readable storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104796273A (en) * 2014-01-20 2015-07-22 中国移动通信集团山西有限公司 Method and device for diagnosing root of network faults
CN105471659A (en) * 2015-12-25 2016-04-06 华为技术有限公司 Root fault cause analysis method and analysis device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8711711B2 (en) * 2008-10-31 2014-04-29 Howard University System and method of detecting and locating intermittent and other faults
US8819220B2 (en) * 2010-09-09 2014-08-26 Hitachi, Ltd. Management method of computer system and management system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant