CN115529219A - Alarm analysis method and device, computer readable storage medium and electronic equipment


Publication number
CN115529219A
Authority
CN
China
Prior art keywords
alarm information
alarm
target
maintenance
processed
Prior art date
Legal status
Pending
Application number
CN202211131508.6A
Other languages
Chinese (zh)
Inventor
金咏诗
印凌潼
何建慧
Current Assignee
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202211131508.6A
Publication of CN115529219A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0677 Localisation of faults

Abstract

The invention discloses an alarm analysis method, an alarm analysis device, a computer-readable storage medium and electronic equipment, and relates to the field of financial technology or other fields. The method comprises the following steps: acquiring a plurality of alarm information corresponding to a target network, and determining an alarm object corresponding to each alarm information; determining a plurality of operation and maintenance objects corresponding to each alarm information based on each alarm information and the alarm object corresponding to each alarm information; determining the degree of association between the alarm information to be processed corresponding to at least one target operation and maintenance object based on the weight corresponding to each target operation and maintenance object in the at least one target operation and maintenance object; and determining a root cause alarm object corresponding to each target alarm information set from the target operation and maintenance objects. The invention solves the technical problem in the prior art that locating the source of a network fault is inefficient.

Description

Alarm analysis method and device, computer readable storage medium and electronic equipment
Technical Field
The invention relates to the field of financial science and technology or other fields, in particular to an alarm analysis method and device, a computer-readable storage medium and electronic equipment.
Background
Owing to the characteristics of network technology, network alarms are weakly directional: an alarm does not by itself point to the faulty node. In actual operation and maintenance, a network is spatially distributed, so the fault of a single node often spreads from a point to a whole area, is accompanied by abnormal responses from multiple sources and of multiple types, and has a high probability of escalating into an alarm storm. Only by quickly locating the fault root node and handling its fault can the influence of the network fault on the related services be contained in time. However, in the prior art, the fault root cause node in a network is usually located by manual checking or inspection, which makes locating the source of a network fault inefficient.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the invention provides an alarm analysis method, an alarm analysis device, a computer readable storage medium and electronic equipment, which at least solve the technical problem of low efficiency of positioning a network fault source in the prior art.
According to an aspect of an embodiment of the present invention, an alarm analysis method is provided, including: acquiring a plurality of alarm information corresponding to a target network, and determining an alarm object corresponding to each alarm information, wherein the alarm object is an object in the target network, and the alarm information is generated when the target network is abnormal; determining a plurality of operation and maintenance objects corresponding to each alarm information based on each alarm information and the alarm object corresponding to each alarm information, wherein the operation and maintenance objects are objects in the target network that have an association relationship with the alarm object, and the probability of an operation and maintenance object giving an alarm when the current alarm information is generated is greater than a preset probability; determining the degree of association between the alarm information to be processed corresponding to at least one target operation and maintenance object based on the weight corresponding to each target operation and maintenance object in the at least one target operation and maintenance object, wherein the target operation and maintenance object is an operation and maintenance object for which the information quantity of the corresponding alarm information is greater than or equal to a preset value, the weight is used for representing the fault range corresponding to the operation and maintenance object, and the alarm information to be processed is information in the plurality of alarm information; and determining a root cause alarm object corresponding to each target alarm information set from the target operation and maintenance objects, wherein each target alarm information set consists of a plurality of target alarm information, the target alarm information is the alarm information to be processed whose association degree is greater than the preset association degree, and the root cause alarm object is the object causing the alarm phenomenon corresponding to the target alarm information set.
Further, the alarm analysis method further comprises: determining object types of a plurality of operation and maintenance objects corresponding to each alarm information from the operation and maintenance object list based on the alarm characteristics of each alarm information; and determining a plurality of operation and maintenance objects corresponding to each alarm information based on the network topology structure of the target network, the alarm object corresponding to each alarm information and the object type of the operation and maintenance object corresponding to each alarm information.
Further, the alarm analysis method further comprises: determining the generation time of each alarm information in a plurality of alarm information before determining the association degree between the alarm information to be processed corresponding to at least one target operation and maintenance object based on the weight corresponding to each target operation and maintenance object in at least one target operation and maintenance object; and determining the alarm information with the generation time within the same preset time range as the alarm information to be processed.
Further, the alarm analysis method further includes: determining the generation time of each alarm message in a plurality of alarm messages; and determining the alarm information with the generation time in the same preset time range as the alarm information to be processed.
Further, the alarm analysis method further comprises: determining the information quantity of the alarm information corresponding to the target operation and maintenance object corresponding to each alarm information set to be processed; and calculating based on the information quantity and weight corresponding to the target operation and maintenance object corresponding to each alarm information set to be processed to obtain a calculation result corresponding to each target operation and maintenance object, and determining the association degree between the alarm information to be processed in each alarm information set to be processed based on the calculation result.
Further, the alarm analysis method further includes: determining a set of to-be-processed alarm information with the association degree larger than the preset association degree as a target alarm information set, and determining the to-be-processed alarm information in the target alarm information set as target alarm information; and determining a root cause alarm object corresponding to each target alarm information set from the target operation and maintenance objects according to the calculation result based on a normal distribution algorithm.
Further, the alarm analysis method further comprises: acquiring network asset information of a target network; and determining an operation and maintenance object list corresponding to the target network based on the network asset information, wherein the operation and maintenance object list is used for recording the object type of the operation and maintenance object.
According to another aspect of the embodiments of the present invention, there is also provided an alarm analysis apparatus, including: an acquisition module, used for acquiring a plurality of alarm information corresponding to a target network and determining an alarm object corresponding to each alarm information, wherein the alarm object is an object in the target network, and the alarm information is generated when the target network is abnormal; a first determining module, used for determining a plurality of operation and maintenance objects corresponding to each alarm information based on each alarm information and the alarm object corresponding to each alarm information, wherein the operation and maintenance objects are objects in the target network that have an association relationship with the alarm object, and the probability of an operation and maintenance object giving an alarm when the current alarm information is generated is greater than a preset probability; a second determining module, used for determining the degree of association between the alarm information to be processed corresponding to at least one target operation and maintenance object based on the weight corresponding to each target operation and maintenance object in the at least one target operation and maintenance object, wherein the target operation and maintenance object is an operation and maintenance object for which the information quantity of the corresponding alarm information is greater than or equal to a preset value, the weight is used for representing the fault range corresponding to the operation and maintenance object, and the alarm information to be processed is information in the plurality of alarm information; and a third determining module, used for determining a root cause alarm object corresponding to each target alarm information set from the target operation and maintenance objects, wherein each target alarm information set consists of a plurality of target alarm information, the target alarm information is the alarm information to be processed whose association degree is greater than the preset association degree, and the root cause alarm object is the object causing the alarm phenomenon corresponding to the target alarm information set.
According to another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium, in which a computer program is stored, wherein the computer program is configured to execute the above alarm analysis method when running.
According to another aspect of the embodiments of the present invention, there is also provided an electronic device, including one or more processors and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the above-described alarm analysis method.
In the embodiment of the invention, the alarm information is first enriched and then compressed based on the enrichment result in order to determine the root cause alarm object. The operation and maintenance objects corresponding to each alarm information are determined based on the alarm information and the alarm object corresponding to it; the association degree between the alarm information to be processed corresponding to at least one target operation and maintenance object is then determined based on the weight corresponding to each target operation and maintenance object in the at least one target operation and maintenance object; and the root cause alarm object corresponding to each target alarm information set is finally determined from the target operation and maintenance objects. Here, the alarm object is an object in the target network, the alarm information is generated when the target network is abnormal, the target operation and maintenance object is an operation and maintenance object for which the information quantity of the corresponding alarm information is greater than or equal to a preset value, the weight is used for representing the fault range corresponding to the operation and maintenance object, the alarm information to be processed is information in the plurality of alarm information, each target alarm information set is composed of a plurality of target alarm information, the target alarm information is the alarm information to be processed whose association degree is greater than the preset association degree, and the root cause alarm object is the object causing the alarm phenomenon corresponding to the target alarm information set.
In this process, a single-node fault in a network often spreads from a point and a line to a plane. That is, when a certain node sends alarm information, other nodes having an association relationship with that node may also send alarm information, and all of this alarm information corresponds to the same fault node. Therefore, the objects that may also generate alarm information when each alarm information is generated are determined as the operation and maintenance objects corresponding to the alarm object, so that the associated information of the alarm information is fully mined and each alarm information is effectively enriched. Furthermore, the alarm information is aggregated based on the operation and maintenance objects obtained after enrichment, so that alarm information that may be associated is effectively grouped together. The association degree between the aggregated alarm information to be processed is then calculated, so that the associated alarm information is effectively determined, that is, the plurality of alarm information is compressed. Finally, the root cause alarm object corresponding to the compressed alarm information is determined, so that the network fault source is accurately determined.
Therefore, the scheme provided by the application achieves the purposes of enriching the alarm information and compressing the alarm information based on the enriched result to determine the root cause alarm object, thereby realizing the technical effect of improving the accuracy of positioning the network fault source and further solving the technical problem of low efficiency of positioning the network fault source in the prior art.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a system architecture diagram of an alternative alarm analysis system according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an alternative alarm analysis method according to an embodiment of the present invention;
FIG. 3 is a topological diagram of an alternative generic rich model in accordance with embodiments of the present invention;
FIG. 4 is a schematic diagram of the working principle of an alternative compression model according to an embodiment of the invention;
FIG. 5 is a schematic diagram of an alternative alert analysis apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an alternative electronic device according to an embodiment of the invention.
Detailed Description
In order to make those skilled in the art better understand the technical solutions of the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that the alarm analysis method, the alarm analysis device, the computer-readable storage medium, and the electronic device provided in the present disclosure may be used in the field of financial technology, and may also be used in any field other than the field of financial technology.
It should be noted that the relevant information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data for presentation, analyzed data, etc.) referred to in the present disclosure are information and data authorized by the user or sufficiently authorized by each party. For example, an interface is provided between the system and the relevant user or organization, before obtaining the relevant information, an obtaining request needs to be sent to the user or organization through the interface, and after receiving the consent information fed back by the user or organization, the relevant information is obtained.
Example 1
In accordance with an embodiment of the present invention, an embodiment of an alarm analysis method is provided. It should be noted that the steps illustrated in the flowchart of the accompanying drawings may be performed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different from the one given here.
In this embodiment, as shown in fig. 1, an optional alarm analysis system is used as an execution subject to execute the alarm analysis method, and a system architecture of the alarm analysis system may be divided into a data layer, a processing layer and a user layer, where the processing layer at least includes a generic rich model and a compression model, and the data layer may be divided into a forwarding layer, a storage layer and a base layer.
Fig. 2 is a schematic diagram of an alternative alarm analysis method according to an embodiment of the present invention, and as shown in fig. 2, the method includes the following steps:
step S201, obtaining a plurality of alarm information corresponding to the target network, and determining an alarm object corresponding to each alarm information, where the alarm object is an object in the target network, and the alarm information is alarm information generated when the target network is abnormal.
Optionally, the user layer in the alarm analysis system may include a network manager (NET-Manager, NET) and centralized monitoring. In step S201, the network manager may obtain a plurality of alarm information corresponding to the target network and send it to the centralized monitoring for recording and display. Meanwhile, as shown in fig. 3, the network manager may also send the alarm information to the first cluster, a Kafka cluster, in the storage layer shown in fig. 1, so that the alarm information is stored. The alarm information may be generated automatically when the target network is abnormal, or may be generated by the monitoring means of the alarm analysis system when it actively probes the target network in multiple dimensions.
Further, after the alarm information is stored in the Kafka cluster, the alarm object corresponding to each alarm information can be determined through the generic rich model. Optionally, as shown in fig. 3, the generic rich model may adopt a horizontally scalable distributed architecture, which may be composed of managers (i.e., manager A and manager B in fig. 3) and a plurality of working units (i.e., working unit A, working unit B, working unit C, and the like in fig. 3). Optionally, manager A and manager B may each update their own state information to the second cluster, a Redis cluster, in the storage layer shown in fig. 1 every 10 s and check the state of the other party. When manager A or manager B determines that the other party satisfies a preset abnormal condition, it may automatically promote itself to master manager and treat the other party as the standby manager. In fig. 3, the master manager is manager A and the standby manager is manager B.
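The master/standby hand-over described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the Redis cluster is replaced by an in-memory dict, and the "preset abnormal condition" is assumed to be a heartbeat older than three intervals (`STALE_AFTER_S`), which the source does not specify. The class and function names are hypothetical.

```python
from dataclasses import dataclass

HEARTBEAT_INTERVAL_S = 10                  # from the text: state is refreshed every 10 s
STALE_AFTER_S = 3 * HEARTBEAT_INTERVAL_S   # assumption: stands in for the preset abnormal condition


@dataclass
class Manager:
    name: str
    role: str = "standby"

    def heartbeat(self, store: dict, now: float) -> None:
        """Publish the time at which this manager was last alive."""
        store[self.name] = now

    def check_peer(self, peer: "Manager", store: dict, now: float) -> None:
        """Promote self to master if the peer's heartbeat is missing or stale."""
        last_seen = store.get(peer.name)
        if last_seen is None or now - last_seen > STALE_AFTER_S:
            self.role, peer.role = "master", "standby"


store = {}  # in-memory stand-in for the Redis cluster
a, b = Manager("manager A", role="master"), Manager("manager B")
a.heartbeat(store, now=0.0)
b.heartbeat(store, now=0.0)

# manager A stops reporting; manager B keeps running and checks at t = 40 s
b.heartbeat(store, now=40.0)
b.check_peer(a, store, now=40.0)
print(a.role, b.role)  # standby master
```

In a real deployment each manager would only change its own role and read the peer's role from the shared store; the sketch flips both objects locally to keep the example short.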
Still further, each working unit, when operating, also periodically updates its own state information to the Redis cluster of the storage layer shown in fig. 1, so that the manager serving as the master manager can obtain the working state of each working unit from the Redis cluster. Based on a preset task allocation rule, the master manager allocates the alarm information in the Kafka cluster (i.e., the original network alarms in fig. 3) to the working units that are operating normally, and these working units together form a working unit set that processes the alarm enrichment tasks with high concurrency. The master manager can also probe each working unit every 10 s and decide whether to delete an abnormal working unit from, or add a normal working unit to, the list of working units to which tasks are distributed. In addition, as shown in fig. 3, the master manager may also exchange information with each cluster to obtain the relevant database state information.
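As a rough illustration of the allocation step, the sketch below distributes alarms over the working units that currently report a healthy state. The source only says "a preset task allocation rule"; round-robin is an assumption chosen for simplicity, and the status values and the `allocate` helper are hypothetical.

```python
from itertools import cycle


def allocate(alarms, unit_status):
    """Round-robin alarms across working units whose reported status is 'ok'."""
    healthy = sorted(unit for unit, status in unit_status.items() if status == "ok")
    if not healthy:
        raise RuntimeError("no healthy working unit available")
    assignment = {unit: [] for unit in healthy}
    for alarm, unit in zip(alarms, cycle(healthy)):
        assignment[unit].append(alarm)
    return assignment


# Status as the master manager might read it from the shared store (hypothetical values).
unit_status = {"unit A": "ok", "unit B": "down", "unit C": "ok"}
plan = allocate(["alarm-1", "alarm-2", "alarm-3"], unit_status)
print(plan)  # {'unit A': ['alarm-1', 'alarm-3'], 'unit C': ['alarm-2']}
```

The abnormal unit B receives nothing, which mirrors removing it from the list of working units to which tasks are distributed.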
Optionally, when the working unit set executes an enrichment task, it may follow a standard information processing procedure defined in advance in the information processing function of the generic rich model in fig. 1. Specifically, the working unit set may first determine the alarm object corresponding to each alarm information, which implements the extraction function in the information processing function of the generic rich model in fig. 1. The alarm object is the object on which the alarm phenomenon described by the alarm information occurs. For example, when the alarm information is "port A closed (down)", the alarm object corresponding to the alarm information may be determined to be port A.
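A minimal sketch of the extraction function follows, assuming alarm text of the simple form "port A down" used in the example. The pattern, the set of object kinds and the returned field names are assumptions for illustration; real alarm formats would need their own parsing rules.

```python
import re

# Hypothetical pattern for alarms shaped like "<kind> <name> <state>".
ALARM_PATTERN = re.compile(
    r"^(?P<kind>port|slot|device)\s+(?P<name>\S+)\s+(?P<state>down|up)$",
    re.IGNORECASE,
)


def extract(alarm_text):
    """Return the alarm object and the alarm feature, or None if the text is not recognised."""
    match = ALARM_PATTERN.match(alarm_text.strip())
    if match is None:
        return None
    kind = match.group("kind").lower()
    return {
        "alarm_object": f"{kind} {match.group('name')}",
        "alarm_feature": f"{kind} {match.group('state').lower()}",
    }


print(extract("port A down"))
# {'alarm_object': 'port A', 'alarm_feature': 'port down'}
```

The alarm feature ("port down") is kept alongside the alarm object because the association step in S202 selects related object types from it.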
Step S202, based on each alarm information and the alarm object corresponding to each alarm information, determining a plurality of operation and maintenance objects corresponding to each alarm information, wherein the operation and maintenance objects are objects in the target network and have an association relationship with the alarm objects, and the probability of the operation and maintenance objects giving an alarm when the current alarm information is generated is greater than a preset probability.
In step S202, a fault of a single node in the network often spreads from a point and a line to a plane. That is, when a certain node sends alarm information, other nodes having an association relationship with that node may also send alarm information. Therefore, while executing an enrichment task, the working unit set may determine, based on the alarm characteristic of the current alarm information and the alarm object corresponding to it, and in combination with the network topology structure of the target network, the objects that may also generate alarm information when the current alarm information occurs (i.e., the aforementioned operation and maintenance objects). This completes the determination of the operation and maintenance objects corresponding to each alarm information and implements the association function in the information processing function of the generic rich model in fig. 1. For example, when the current alarm information is "port A down", the alarm characteristic may be determined to be "port down", and the object types of the operation and maintenance objects corresponding to "port down" may be determined to be the slot where port A is located, the network device where port A is located, the port connected to port A, the slot where the port connected to port A is located, and the like.
Further, the working unit set may determine the specific operation and maintenance objects corresponding to the current alarm information based on the network topology structure of the target network, which implements the integration function in the information processing function of the generic rich model in fig. 1. For example, it is determined that the slot in which port A is located is slot A, the network device in which port A is located is computer A, the port connected to port A is port B, the slot in which port B is located is slot B, and the like. When the current alarm information occurs, each determined operation and maintenance object may generate alarm information matching its own type; for example, the alarm information generated by computer A is not necessarily "computer A down". Optionally, after the working unit set completes the enrichment task, it may store the execution result to the third cluster, a MongoDB cluster, in the storage layer shown in fig. 1, that is, store the alarm information together with its corresponding alarm object and operation and maintenance objects to the MongoDB cluster. Optionally, the generic rich model further includes a process management function as shown in fig. 1, which can schedule and supervise the related enrichment tasks during information processing.
It should be noted that, by determining the alarm object corresponding to the alarm information, extraction of the alarm key information is achieved, and by determining, based on the extracted key information, that when each alarm information is generated, the object that may also generate the alarm information is the operation and maintenance object corresponding to the alarm object, determination of the association information of the alarm information is achieved, thereby achieving effective enrichment of each alarm information.
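The association and integration functions can be illustrated with the "port A down" example. The sketch assumes a tiny hand-written topology and a one-entry object-type list; both tables, the type labels and the `enrich` helper are hypothetical and only mirror the relationships named in the text.

```python
# Hypothetical operation and maintenance object list: alarm feature -> related object types.
OBJECT_TYPES = {"port down": ["own_slot", "own_device", "peer_port", "peer_slot"]}

# Hypothetical topology of the target network.
TOPOLOGY = {
    "port A": {"slot": "slot A", "device": "computer A", "peer": "port B"},
    "port B": {"slot": "slot B", "device": "computer B", "peer": "port A"},
}


def enrich(alarm_object, alarm_feature):
    """Resolve each related object type to a concrete object using the topology."""
    node = TOPOLOGY[alarm_object]
    peer = node["peer"]
    resolvers = {
        "own_slot": lambda: node["slot"],
        "own_device": lambda: node["device"],
        "peer_port": lambda: peer,
        "peer_slot": lambda: TOPOLOGY[peer]["slot"],
    }
    return [resolvers[kind]() for kind in OBJECT_TYPES.get(alarm_feature, [])]


print(enrich("port A", "port down"))
# ['slot A', 'computer A', 'port B', 'slot B']
```

An alarm feature that is missing from the object-type list yields an empty result, so unknown alarm kinds pass through without enrichment in this sketch.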
Step S203, determining a degree of association between the to-be-processed alarm information corresponding to the at least one target operation and maintenance object based on a weight corresponding to each target operation and maintenance object in the at least one target operation and maintenance object, where the target operation and maintenance object is an object whose information quantity of the alarm information corresponding to the operation and maintenance object is greater than or equal to a preset value, the weight is used to represent a fault range corresponding to the operation and maintenance object, and the to-be-processed alarm information is information in multiple alarm information.
In step S203, after the operation and maintenance object corresponding to each alarm information is determined, the compression model may perform a compression task on all alarm information as alarm information to be processed, or the compression model may select alarm information meeting the condition from all alarm information as alarm information to be processed based on time factors, network topology factors, or other factors.
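Selecting the alarm information to be processed by time can be sketched as bucketing alarms into fixed windows, so that only alarms generated within the same preset time range are compressed together. The 60-second window and the `(alarm_id, timestamp)` representation are assumptions; the source does not give a window length.

```python
from collections import defaultdict


def bucket_by_window(alarms, window_s=60):
    """Group (alarm_id, generation_time_in_seconds) pairs into fixed time windows."""
    buckets = defaultdict(list)
    for alarm_id, generated_at in alarms:
        buckets[int(generated_at // window_s)].append(alarm_id)
    return dict(buckets)


alarms = [("a1", 5), ("a2", 42), ("a3", 61), ("a4", 130)]
print(bucket_by_window(alarms))
# {0: ['a1', 'a2'], 1: ['a3'], 2: ['a4']}
```

Each bucket would then be handed to the compression model as one batch of alarm information to be processed.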
Further, the compression model may merge the alarm information to be processed that shares the same operation and maintenance object among its corresponding operation and maintenance objects, so as to obtain an alarm information set to be processed. Alternatively, it may merge only the alarm information to be processed for which the number of alarm information sharing that operation and maintenance object reaches a preset value, obtain the alarm information set to be processed, and calculate its association score. The preset value may be 2, or another value depending on the application scenario.
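The aggregation step can be sketched as an inverted index from operation and maintenance objects to the alarms that list them. The sketch uses a threshold of 2 as in the text and applies it as "greater than or equal to", following the definition of the target operation and maintenance object in the summary; the sample data and the `aggregate` helper are hypothetical.

```python
from collections import defaultdict

PRESET_VALUE = 2  # from the text: the preset value may be 2


def aggregate(enriched):
    """Map each shared operation and maintenance object to the alarms that reference it."""
    by_object = defaultdict(set)
    for alarm_id, objects in enriched.items():
        for obj in objects:
            by_object[obj].add(alarm_id)
    # Keep only target objects, i.e. objects referenced by enough alarms.
    return {obj: ids for obj, ids in by_object.items() if len(ids) >= PRESET_VALUE}


enriched = {
    "a1": ["slot A", "computer A", "port B"],
    "a2": ["slot A", "computer A"],
    "a3": ["computer B"],
}
targets = aggregate(enriched)
print(sorted(targets))  # ['computer A', 'slot A']
```

Here alarms a1 and a2 end up in the same alarm information set to be processed because they share slot A and computer A, while a3 stays on its own.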
Furthermore, the compression model may determine the association score corresponding to the alarm information set to be processed based on the weight of the operation and maintenance object (i.e., the target operation and maintenance object) that is overlapped in the operation and maintenance object corresponding to each alarm information to be processed in the alarm information set to be processed, that is, the determination of the association degree between the alarm information to be processed corresponding to the overlapped operation and maintenance object is achieved. Specifically, the method for determining the association score based on the weight may be directly adding the weights to obtain the association score, may also be performing weighted summation by combining the number of coincidence times corresponding to each target operation and maintenance object to obtain the association score, and may also be performing calculation based on other mathematical models to obtain the association score. The association score is used for representing the association degree between the alarm information to be processed in the alarm information set to be processed.
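Two of the scoring options named above, the plain sum of weights and the sum weighted by the number of coincidences, can be sketched as follows. The weight values are made up for illustration, and the function names are hypothetical.

```python
def score_plain(target_objects, weights):
    """Association score as the direct sum of the weights of the shared objects."""
    return sum(weights[obj] for obj in target_objects)


def score_weighted(target_counts, weights):
    """Association score as sum(weight * number of coincidences) over the shared objects."""
    return sum(weights[obj] * count for obj, count in target_counts.items())


# Hypothetical weights: a larger fault range gets a larger weight.
weights = {"slot A": 2.0, "computer A": 5.0}
# Number of alarms to be processed that coincide on each target object.
target_counts = {"slot A": 2, "computer A": 2}

print(score_plain(target_counts, weights))     # 7.0
print(score_weighted(target_counts, weights))  # 14.0
```

Either score would then be compared with the preset score to decide whether the alarm information set to be processed becomes a target alarm information set.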
It should be noted that, by aggregating each alarm information based on the operation and maintenance object corresponding to each alarm information, the effective aggregation of the alarm information that may have an association relationship is realized, and by calculating the association degree between the aggregated to-be-processed alarm information, the effective determination of the associated alarm information is realized.
Step S204, determining a root cause alarm object corresponding to each target alarm information set from the target operation and maintenance objects, wherein each target alarm information set is composed of a plurality of target alarm information, the target alarm information is the alarm information of which the association degree in the alarm information to be processed is greater than the preset association degree, and the root cause alarm object is the object causing the alarm phenomenon corresponding to the target alarm information set.
In step S204, the compression model may determine a set of to-be-processed alarm information corresponding to the association score greater than the preset score as a target alarm information set, and determine the to-be-processed alarm information in the set of to-be-processed alarm information as the target alarm information. Optionally, the compression model may determine the target operation and maintenance object with the largest number of target alarm information corresponding to the target alarm information set as the root cause alarm object, or calculate a correlation score by combining weights of the target operation and maintenance objects on the basis of the foregoing, and determine the target operation and maintenance object with the highest score as the root cause alarm object, or further analyze each correlation score by using the mathematical model after obtaining the correlation score, so as to determine the root cause alarm object based on the analyzed correlation score.
It should be noted that, when there is an association relationship between alarm information, the alarm information may be generated based on the same fault, and therefore, by determining a root cause alarm object corresponding to each target alarm information set based on a target operation and maintenance object, accurate determination of a network fault source is achieved.
Based on the schemes defined in the above steps S201 to S204, it can be known that, in the embodiment of the present invention, a manner of performing enrichment processing on alarm information, then compressing a plurality of alarm information based on an enrichment result to determine a root cause alarm object is adopted, a plurality of alarm information corresponding to a target network is obtained, an alarm object corresponding to each alarm information is determined, then a plurality of operation and maintenance objects corresponding to each alarm information are determined based on each alarm information and the alarm object corresponding to each alarm information, and then a degree of association between alarm information to be processed corresponding to at least one target operation and maintenance object is determined based on a weight corresponding to each target operation and maintenance object in at least one target operation and maintenance object, so that the root cause alarm object corresponding to each target alarm information set is determined from the target operation and maintenance objects. 
The alarm object is an object in the target network, and the alarm information is alarm information generated when the target network is abnormal. The target operation and maintenance object is an operation and maintenance object for which the information quantity of the corresponding alarm information is greater than or equal to a preset value, the weight is used for representing the fault range corresponding to the operation and maintenance object, and the alarm information to be processed is information among the plurality of alarm information. Each target alarm information set is composed of a plurality of target alarm information, the target alarm information is the alarm information to be processed whose association degree is greater than the preset association degree, and the root cause alarm object is the object causing the alarm phenomenon corresponding to the target alarm information set.
It is easy to note that, in the above process, because a single node fault in the network often radiates a large-scale influence from a point and a line to a plane, that is, when a certain node sends an alarm message, other nodes having an association relationship with the certain node may also send alarm messages, and the multiple alarm messages correspond to the same fault node. Therefore, by determining that the object which is also likely to generate the alarm information is the operation and maintenance object corresponding to the alarm object when each alarm information is generated, the full mining of the association information of the alarm information is realized, and the effective enrichment of each alarm information is realized. Furthermore, the alarm information is aggregated based on the operation and maintenance objects obtained after enrichment, so that the effective aggregation of the alarm information possibly having association relation is realized, the association degree between the aggregated alarm information to be processed is calculated, the effective determination of the associated alarm information is realized, namely the compression of a plurality of alarm information is realized, and the accurate determination of the network fault source is realized by determining the root cause alarm object corresponding to the compressed alarm information.
Therefore, the scheme provided by the application achieves the purposes of enriching the alarm information and compressing the alarm information based on the enriched result to determine the root cause alarm object, thereby realizing the technical effect of improving the accuracy of positioning the network fault source and further solving the technical problem of low efficiency of positioning the network fault source in the prior art.
In an alternative embodiment, before determining the plurality of operation and maintenance objects corresponding to each alarm information based on each alarm information and the alarm object corresponding to each alarm information, the alarm analysis system may predefine a standard operation procedure of information processing to implement a logical closed loop for information extraction, association, and integration for any alarm information. Optionally, the alarm analysis system may obtain network asset information of the target network, and then determine an operation and maintenance object list corresponding to the target network based on the network asset information, where the operation and maintenance object list is used to record an object type of the operation and maintenance object.
Specifically, the alarm analysis system may obtain data such as network asset information, network topology, network device configuration, and line configuration of the target network in advance through the auxiliary module of the processing layer shown in fig. 1, and store the data in the data center of the base layer shown in fig. 1, and the auxiliary module may implement timing acquisition and full/incremental update of the data according to characteristics of the data, and also include a verification identification model to verify validity of the data. Meanwhile, the basic layer can also be used for storing relevant data of network alarm, so that the business data related in the application can be stored. In addition, as shown in fig. 1, the base layer may further store a related log and configuration data of the alarm analysis system during the working process, where the configuration data may include related data of the operation and maintenance object, related data of the extraction scheme, and a predefined canonical network element hierarchy.
Further, the alarm analysis system may determine, based on data such as network asset information, an operation and maintenance object list corresponding to the target network, that is, determine which objects in the target network may be defined as operation and maintenance objects.
It should be noted that, by determining the operation and maintenance object list, effective screening of objects in the target network is achieved, so that efficiency and accuracy of the operation and maintenance object corresponding to the alarm object can be improved, and the problem of large processing capacity caused by determining an invalid object as the operation and maintenance object is avoided.
In an optional embodiment, in the process of determining the multiple operation and maintenance objects corresponding to each alarm information based on each alarm information and the alarm object corresponding to each alarm information, the alarm analysis system may determine, from the operation and maintenance object list, object types of the multiple operation and maintenance objects corresponding to each alarm information based on an alarm characteristic of each alarm information, and then determine the multiple operation and maintenance objects corresponding to each alarm information based on a network topology of the target network, the alarm object corresponding to each alarm information, and the object type of the operation and maintenance object corresponding to each alarm information.
Optionally, after the operation and maintenance object list corresponding to the target network is obtained, the general enrichment model may set the weight of each operation and maintenance object according to the fault range corresponding to that operation and maintenance object. For example, when the operation and maintenance object is a computer, the fault nodes that may cause the operation and maintenance object to fail include a slot, a port, a socket, and the like; the fault range corresponding to the operation and maintenance object is therefore large, and a lower weight may be set for it.
Further, based on historical alarm analysis data, the general enrichment model may determine, from the operation and maintenance object list, the object types of the multiple operation and maintenance objects corresponding to a certain alarm information according to the alarm feature and the alarm object of that alarm information, that is, determine the operation and maintenance objects possibly related to the alarm information. The object type of the operation and maintenance object corresponding to each alarm information can be determined in the same way. Then, the general enrichment model may determine the multiple specific operation and maintenance objects corresponding to the alarm information based on data such as the network topology structure, the network device configuration, and the network asset information of the target network, together with the object types of the operation and maintenance objects corresponding to the alarm information and the alarm object corresponding to the alarm information. The specific operation and maintenance objects corresponding to each alarm information can be determined in the same way.
For example, suppose the operation and maintenance object list includes a port, a slot, and a computer, and one piece of alarm information represents "port A down". According to the alarm feature "port down" and the alarm object "port A" of the alarm information, the object types of the multiple operation and maintenance objects corresponding to the alarm information may be determined from the operation and maintenance object list as: the physical port corresponding to port A, the slot where port A is located, the network device where port A is located, the port connected to port A, the slot where the port connected to port A is located, the network device where the port connected to port A is located, and the like. Further, if it is determined, based on the network topology structure of the target network, that the physical port corresponding to port A is A1, the slot in which port A is located is slot A, the network device in which port A is located is computer A, the port connected to port A is port B, the slot in which port B is located is slot B, the network device in which port B is located is computer B, and so on, the general enrichment model may determine, according to the object types described above, that the multiple operation and maintenance objects corresponding to the alarm information are physical port A1, slot A, computer A, logical port B, the physical port corresponding to logical port B, slot B, computer B, the connection between port A and port B, and the like.
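The "port A down" example can be illustrated with a minimal sketch. The sketch is not the general enrichment model itself: the topology dictionary, the feature-to-type mapping, the function name `enrich`, and the identifier "physical port B1" are all hypothetical and introduced only for illustration.

```python
# Hypothetical topology records (assumption): each logical port maps to its
# physical port, slot, network device, and the port it is connected to.
TOPOLOGY = {
    "port A": {"physical_port": "physical port A1", "slot": "slot A",
               "device": "computer A", "peer": "port B"},
    "port B": {"physical_port": "physical port B1", "slot": "slot B",
               "device": "computer B", "peer": "port A"},
}

# Assumed mapping from an alarm feature to the object types to enrich.
TYPES_BY_FEATURE = {
    "port down": ["physical_port", "slot", "device", "peer",
                  "peer_physical_port", "peer_slot", "peer_device", "link"],
}


def enrich(alarm_object, alarm_feature):
    """Return the operation and maintenance objects for one alarm."""
    local = TOPOLOGY[alarm_object]
    peer_name = local.get("peer")
    peer = TOPOLOGY.get(peer_name, {})
    # Sort the endpoints so both ends of a link yield the same object name.
    link = None
    if peer_name:
        link = "link " + " <-> ".join(sorted([alarm_object, peer_name]))
    candidates = {
        "physical_port": local.get("physical_port"),
        "slot": local.get("slot"),
        "device": local.get("device"),
        "peer": peer_name,
        "peer_physical_port": peer.get("physical_port"),
        "peer_slot": peer.get("slot"),
        "peer_device": peer.get("device"),
        "link": link,
    }
    wanted = TYPES_BY_FEATURE.get(alarm_feature, [])
    return [candidates[t] for t in wanted if candidates.get(t)]


objects = enrich("port A", "port down")
```

Because the link name is built from the sorted endpoints, an alarm on port B would be enriched with the same link object, which is what later allows the two alarms to be aggregated.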
It should be noted that, the general enrichment model can achieve about 60.9% enrichment of the operation and maintenance object of the alarm information, and the average time overhead required for enriching the operation and maintenance object corresponding to a single alarm information is within 1 second. In addition, because the general enrichment model is realized based on a distributed architecture, 2000 pieces of alarm information can be simultaneously enriched in parallel within 1 minute to meet the processing requirements under the large-scale alarm of the network, and the transverse expansion can be carried out according to the future service characteristics.
In an optional embodiment, before determining the association degree between the to-be-processed alarm information corresponding to the at least one target operation and maintenance object based on the weight corresponding to each target operation and maintenance object in the at least one target operation and maintenance object, as shown in fig. 1 and 4, the alarm analysis system may determine the generation time of each alarm information in the plurality of alarm information by using a time sequence association algorithm, and then determine the alarm information whose generation time is within the same preset time range as the to-be-processed alarm information. Specifically, in this embodiment, as shown in fig. 4, the preset time range may be 1 minute, and the compression model may be triggered every 10 seconds, each time taking the enriched alarm information generated within the 1 minute before the current time as the alarm information to be processed. For example, in fig. 4, each small window represents the alarm information generated within 10s; when the current time is T, the alarm information A1-Z1, A2-Z2, A3-Z3, A4-Z4 may be determined as the alarm information to be processed, and when the current time is T+10s, the alarm information A2-Z2, A3-Z3, A4-Z4, A5-Z5 may be determined as the alarm information to be processed.
It should be noted that, when a node fails, other nodes that may be radiated may also fail in a similar time, so that to-be-processed alarm information is screened out from alarm information based on a time factor, thereby implementing preliminary aggregation of alarm information that may be generated by the same failed node failure, and improving working efficiency.
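The time-based screening can be sketched as a sliding window. The following is a minimal illustration under stated assumptions: the function name `select_pending`, the tuple layout of an alarm, and the half-open window boundary are hypothetical choices, and only the 1 minute range and 10 second trigger interval come from this embodiment.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=1)             # preset time range in this embodiment
TRIGGER_INTERVAL = timedelta(seconds=10)  # the compression model runs this often


def select_pending(alarms, now, window=WINDOW):
    """Return the ids of alarms generated within `window` before `now`.

    `alarms` is a list of (alarm_id, generated_at) tuples (assumed layout).
    The window is taken as (now - window, now], which is an assumption.
    """
    return [alarm_id for alarm_id, generated_at in alarms
            if now - window < generated_at <= now]


t = datetime(2022, 9, 16, 12, 0, 0)
alarms = [
    ("old", t - timedelta(seconds=90)),
    ("a", t - timedelta(seconds=55)),
    ("b", t - timedelta(seconds=5)),
    ("c", t + timedelta(seconds=5)),
]
pending_at_t = select_pending(alarms, t)
pending_next = select_pending(alarms, t + TRIGGER_INTERVAL)
```

At time `t` the window holds alarms "a" and "b"; at the next trigger "a" has aged out and "c" has entered, which mirrors how the window in fig. 4 slides forward by one small window per trigger.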
In an optional embodiment, in the process of determining the association degree between the to-be-processed alarm information corresponding to the at least one target operation and maintenance object based on the weight corresponding to each target operation and maintenance object in the at least one target operation and maintenance object, the alarm analysis system may split the plurality of to-be-processed alarm information into at least one to-be-processed alarm information set, and determine the association degree between the to-be-processed alarm information in each to-be-processed alarm information set based on the weight corresponding to each target operation and maintenance object in the at least one target operation and maintenance object. And the same target operation and maintenance object exists in at least one operation and maintenance object corresponding to each alarm information to be processed in the alarm information set to be processed.
Optionally, the compression model may combine the to-be-processed alarm information whose corresponding operation and maintenance objects include the same operation and maintenance object, to obtain a to-be-processed alarm information set. For example, the to-be-processed alarm information A corresponds to the operation and maintenance object A and the operation and maintenance object B, the to-be-processed alarm information B corresponds to the operation and maintenance object B, the to-be-processed alarm information C corresponds to the operation and maintenance object A and the operation and maintenance object B, and the to-be-processed alarm information D corresponds to the operation and maintenance object E. It may then be determined that the operation and maintenance object B exists among the operation and maintenance objects corresponding to the to-be-processed alarm information A, the to-be-processed alarm information B, and the to-be-processed alarm information C, so these three pieces of to-be-processed alarm information are combined into one to-be-processed alarm information set. In this embodiment, the preset value is 2, that is, the operation and maintenance object A and the operation and maintenance object B may be determined as the target operation and maintenance objects.
Furthermore, the compression model may determine the association score corresponding to the alarm information set to be processed based on the weights of the operation and maintenance objects that coincide among the operation and maintenance objects corresponding to each alarm information to be processed in the set (i.e., the target operation and maintenance objects), thereby determining the association degree between the alarm information to be processed corresponding to the coinciding operation and maintenance objects. The association score is used for representing the association degree between the alarm information to be processed in the alarm information set to be processed. For example, for the to-be-processed alarm information set composed of the to-be-processed alarm information A, the to-be-processed alarm information B, and the to-be-processed alarm information C, the coinciding operation and maintenance objects include the operation and maintenance object B and the operation and maintenance object A. The compression model may determine the association score of the to-be-processed alarm information set based on the weight of the operation and maintenance object B and the weight of the operation and maintenance object A, thereby also determining the association degree between the to-be-processed alarm information corresponding to the operation and maintenance object B and the operation and maintenance object A.
It should be noted that, by splitting the plurality of pieces of alarm information to be processed into at least one set of alarm information to be processed based on the operation and maintenance object corresponding to each piece of alarm information to be processed, the preliminary clustering of each piece of alarm information to be processed is realized, thereby facilitating the improvement of the alarm analysis efficiency.
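The splitting of to-be-processed alarm information into sets can be sketched as follows. This is one possible reading rather than the method of the embodiment: the function name `group_alarms` is hypothetical, and merging every alarm that shares any target operation and maintenance object into a single set is an assumption chosen so that the example with alarm information A to D yields one set.

```python
from collections import defaultdict


def group_alarms(alarm_objects, preset_value=2):
    """Split alarms into sets that share target operation and maintenance objects.

    `alarm_objects` maps an alarm id to the set of its enriched objects.
    Returns a list of (alarm_set, target_object_set) tuples.
    """
    # Invert the mapping: object -> alarms that were enriched with it.
    by_object = defaultdict(set)
    for alarm, objects in alarm_objects.items():
        for obj in objects:
            by_object[obj].add(alarm)

    # Target objects are those with at least `preset_value` alarms.
    targets = {obj: alarms for obj, alarms in by_object.items()
               if len(alarms) >= preset_value}

    groups = []
    for obj, alarms in targets.items():
        merged_alarms, merged_objects = set(alarms), {obj}
        remaining = []
        for group_alarms_, group_objects in groups:
            if group_alarms_ & merged_alarms:  # shares an alarm: merge sets
                merged_alarms |= group_alarms_
                merged_objects |= group_objects
            else:
                remaining.append((group_alarms_, group_objects))
        groups = remaining + [(merged_alarms, merged_objects)]
    return groups


pending = {
    "A": {"object A", "object B"},
    "B": {"object B"},
    "C": {"object A", "object B"},
    "D": {"object E"},
}
sets = group_alarms(pending)
```

With the preset value of 2, object E is backed by only one alarm, so alarm D is left out, while alarms A, B, and C fall into one set whose target objects are object A and object B.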
In an optional embodiment, in the process of determining the association degree between the alarm information to be processed in each alarm information set to be processed based on the weight corresponding to each target operation and maintenance object in at least one target operation and maintenance object, the alarm analysis system may determine the information quantity of the alarm information corresponding to the target operation and maintenance object corresponding to each alarm information set to be processed, and then perform calculation based on the information quantity and weight corresponding to the target operation and maintenance object corresponding to each alarm information set to be processed to obtain a calculation result corresponding to each target operation and maintenance object, so that the association degree between the alarm information to be processed in each alarm information set to be processed is determined based on the calculation result.
Optionally, the determination of the association degree between the alarm information to be processed in each alarm information set to be processed is described by taking the set including the alarm information to be processed A, the alarm information to be processed B, and the alarm information to be processed C as an example. Specifically, the to-be-processed alarm information corresponding to the operation and maintenance object A is the to-be-processed alarm information A and the to-be-processed alarm information C, and the to-be-processed alarm information corresponding to the operation and maintenance object B is the to-be-processed alarm information A, the to-be-processed alarm information B, and the to-be-processed alarm information C, so the compression model can determine that the information quantity corresponding to the operation and maintenance object A is 2 and the information quantity corresponding to the operation and maintenance object B is 3.
Further, the compression model may multiply the information quantity corresponding to each target operation and maintenance object by the weight corresponding to that target operation and maintenance object to determine the association sub-score (i.e., the aforementioned calculation result) corresponding to the target operation and maintenance object. For example, if the weight of the operation and maintenance object A is 0.3 and the weight of the operation and maintenance object B is 0.5, the association sub-score of the operation and maintenance object A is 2 × 0.3 = 0.6 and the association sub-score of the operation and maintenance object B is 3 × 0.5 = 1.5.
Still further, the compression model may add the association sub-scores of all the target operation and maintenance objects corresponding to a certain alarm information set to be processed to obtain the association score, and may compare the association score with a preset score (the target threshold). When the association score is greater than or equal to the preset score, it is determined that the association degree of the alarm information to be processed in the set is greater than the preset association degree; conversely, when the association score is less than the preset score, it is determined that the association degree of the alarm information to be processed in the set is less than the preset association degree. The target threshold may be a preset value, or may be a threshold that is learned by a relevant learning model while the compression model processes historical data and is dynamically updated based on the learning result.
It should be noted that, by determining the information quantity and the weight corresponding to the target operation and maintenance object corresponding to each set of alarm information to be processed, the association degree between the alarm information to be processed in each set of alarm information to be processed can be determined more accurately.
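The calculation above can be written out as a short sketch. The function name `association_score` and the preset score of 2.0 are hypothetical; the information quantities and weights are those of the example.

```python
def association_score(info_counts, weights):
    """Return (sub_scores, total) for one alarm information set.

    Each association sub-score is information quantity x weight; the
    association score is the sum of the sub-scores.
    """
    sub_scores = {obj: info_counts[obj] * weights[obj] for obj in info_counts}
    return sub_scores, sum(sub_scores.values())


info_counts = {"object A": 2, "object B": 3}  # from the example above
weights = {"object A": 0.3, "object B": 0.5}  # from the example above
PRESET_SCORE = 2.0                            # assumed threshold, illustration only

sub_scores, score = association_score(info_counts, weights)
is_target_set = score >= PRESET_SCORE
```

Here the sub-scores are 0.6 and 1.5, so the association score is about 2.1 and, under the assumed threshold, the set would be treated as a target alarm information set.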
In an optional embodiment, in the process of determining the root cause alarm object corresponding to each target alarm information set from the target operation and maintenance object, the alarm analysis system may determine that the alarm information set to be processed, of which the association degree is greater than the preset association degree, is the target alarm information set, determine the alarm information to be processed in the target alarm information set as the target alarm information, and then determine the root cause alarm object corresponding to each target alarm information set from the target operation and maintenance object according to the calculation result based on a normal distribution algorithm.
Optionally, when there is a unique highest score in the association sub-scores of the target operation and maintenance objects corresponding to the target alarm information set, the target operation and maintenance object corresponding to the highest score may be directly used as the root cause alarm object corresponding to the target alarm information set. On the contrary, if there are at least two highest scores in the association sub-scores of the target operation and maintenance objects corresponding to the target alarm information set, the compression model may select a unique association sub-score from the association sub-scores based on a normal distribution algorithm, and use the target operation and maintenance object corresponding to the selected association sub-score as the root cause alarm object corresponding to the target alarm information set.
Specifically, the compression model may draw the association sub-scores of the target operation and maintenance objects corresponding to the target alarm information set in an x-y coordinate system in the form of a histogram, where the columns in the histogram correspond one to one to the target operation and maintenance objects, the height of each column represents the association sub-score of the corresponding target operation and maintenance object, the widths of the columns are assumed to be consistent, and the columns are arranged symmetrically from the middle to the two sides. For example, if there is a relevance sub-score of 1,2,3 at present, the arrangement order of the columns corresponding to the association sub-scores in the x-y coordinate system from left to right may be 1,2,3,2 or 2,3,2,1.
Further, after the above drawing is completed, the compression model may fit a corresponding normal distribution bell-shaped curve in the x-y coordinate system according to the columns corresponding to the target operation and maintenance objects. For example, the midpoints of the top edges of the columns are connected to determine the normal distribution bell-shaped curve. Then, based on the area formed between the normal distribution bell-shaped curve and the x-axis, the compression model may determine the horizontal-axis interval (μ - 3σ, μ + 3σ) in the x-y coordinate system by using the principle of the normal distribution algorithm, and determine, within the interval (μ - 3σ, μ + 3σ), the size of the region where the column corresponding to each highest association sub-score coincides with a target region, where the target region is the region enclosed between the normal distribution bell-shaped curve and the x-axis. The target operation and maintenance object corresponding to the column with the largest coinciding region is then determined as the root cause alarm object corresponding to the target alarm information set.
Furthermore, if a unique root cause alarm object cannot be selected based on the horizontal-axis interval (μ - 3σ, μ + 3σ), the compression model may reduce the horizontal-axis interval from (μ - 3σ, μ + 3σ) to (μ - 2σ, μ + 2σ) by using the principle of the normal distribution algorithm and repeat the determination according to the above method. If a unique root cause alarm object still cannot be selected, the horizontal-axis interval may be reduced from (μ - 2σ, μ + 2σ) to (μ - σ, μ + σ), and the determination repeated according to the above method.
Still further, if a unique root cause alarm object cannot be selected based on the horizontal-axis interval (μ - σ, μ + σ), the target operation and maintenance object with the highest weight among the target operation and maintenance objects tied for the highest association sub-score is taken as the root cause alarm object.
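The selection procedure can be sketched as follows, under several assumptions that are not stated in the embodiment: columns have unit width and sit at integer positions, the bell-shaped curve is a normal distribution whose μ and σ are the height-weighted mean and standard deviation of the column positions, and the coinciding region of a column is approximated by the area under that curve over the part of the column's x-range that lies inside the interval. All function names are hypothetical, and association sub-scores are assumed to be positive.

```python
import math
from statistics import NormalDist


def arrange_middle_out(sub_scores):
    """Order columns with the highest score in the middle, lower ones outward."""
    ordered = sorted(sub_scores.items(), key=lambda kv: kv[1], reverse=True)
    left, right = [], []
    for i, item in enumerate(ordered[1:]):
        (right if i % 2 == 0 else left).append(item)
    return left[::-1] + ordered[:1] + right


def unique_max(values):
    """Return the key with the strictly largest value, or None on a tie."""
    best = max(values.values())
    winners = [k for k, v in values.items()
               if math.isclose(v, best, abs_tol=1e-9)]
    return winners[0] if len(winners) == 1 else None


def select_root_cause(sub_scores, weights):
    winner = unique_max(sub_scores)
    if winner is not None:  # a unique highest sub-score decides directly
        return winner
    top = max(sub_scores.values())
    tied = [k for k, v in sub_scores.items()
            if math.isclose(v, top, abs_tol=1e-9)]

    layout = arrange_middle_out(sub_scores)
    position = {name: x for x, (name, _) in enumerate(layout)}
    total = sum(h for _, h in layout)
    mu = sum(x * h for x, (_, h) in enumerate(layout)) / total
    sigma = math.sqrt(
        sum(h * (x - mu) ** 2 for x, (_, h) in enumerate(layout)) / total)

    if sigma > 0:
        curve = NormalDist(mu, sigma)
        for k in (3, 2, 1):  # shrink the interval step by step
            lo, hi = mu - k * sigma, mu + k * sigma
            overlap = {}
            for name in tied:
                a = max(position[name] - 0.5, lo)
                b = min(position[name] + 0.5, hi)
                overlap[name] = curve.cdf(b) - curve.cdf(a) if b > a else 0.0
            winner = unique_max(overlap)
            if winner is not None:
                return winner
    # Final fallback: highest weight among the tied objects.
    return max(tied, key=lambda name: weights[name])


root = select_root_cause({"object A": 0.6, "object B": 1.5},
                         {"object A": 0.3, "object B": 0.5})
```

In this sketch two tied columns placed symmetrically about μ have equal overlap in every interval, so the decision falls through to the weight comparison, which corresponds to the final fallback described above.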
It should be noted that, by determining the root cause alarm object based on the normal distribution algorithm and each calculation result, the root cause alarm object can be determined more accurately, and in application, about 22.4% of alarm information can be matched to the associated alarms of the same result.
In an optional embodiment, as shown in fig. 1, the processing layer of the alarm analysis system further includes a monitoring model as an auxiliary module. While the alarm analysis system executes the method provided by the present application, the monitoring model may monitor each execution module. Specifically, the monitoring model may implement functions such as storm suppression and monitoring of the full alarm processing flow. The monitoring model may also be used to implement mutual detection and automatic switching between the master manager and the slave manager, detection of the working units, and detection of the data warehouse.
It should be noted that the present application proposes the network element concept of an "operation and maintenance object" based on a logical analysis of the existing network environment. By combining the existing network device configuration and various service data with the "operation and maintenance objects" of network alarms defined by specification, the relevant operation and maintenance information of an actual alarm is enriched, so that the "operation and maintenance objects" of the alarm are extracted and the study of production alarms is converted into the study of "operation and maintenance objects". The operation and maintenance objects are then aggregated by using a time sequence association algorithm, which compresses related alarms that occur close together in time and ultimately improves the directivity of network alarms, helping to locate the fault source accurately. In application, the average processing time of the whole process of the method provided by the present application (the closed loop from acquisition, enrichment, and association to uploading of the alarm information) is about 20 seconds, which effectively improves the processing efficiency of the alarm information. It should also be noted that the enrichment and compression of the alarm information run in parallel with the processing of the alarm information by the network manager, and the results are uploaded to centralized monitoring by the network manager in a unified manner; that is, the method provided by the present application does not delay the direct uploading of the alarm information to centralized monitoring.
Therefore, the scheme provided by the application achieves the purposes of enriching the alarm information and compressing a plurality of alarm information based on the enriched result to determine the root cause alarm object, thereby realizing the technical effect of improving the accuracy of positioning the network fault source and further solving the technical problem of low efficiency of positioning the network fault source in the prior art.
Example 2
According to an embodiment of the present invention, an embodiment of an alarm analysis device is provided, where fig. 5 is a schematic diagram of an alternative alarm analysis device according to an embodiment of the present invention, as shown in fig. 5, the device includes:
an obtaining module 501, configured to obtain multiple pieces of alarm information corresponding to a target network, and determine an alarm object corresponding to each piece of alarm information, where the alarm object is an object in the target network, and the alarm information is alarm information generated when the target network is abnormal;
a first determining module 502, configured to determine, based on each alarm information and an alarm object corresponding to each alarm information, a plurality of operation and maintenance objects corresponding to each alarm information, where an operation and maintenance object is an object in a target network and has an association relationship with the alarm object, and a probability that an alarm occurs in the operation and maintenance object when current alarm information is generated is greater than a preset probability;
a second determining module 503, configured to determine, based on a weight corresponding to each target operation and maintenance object in the at least one target operation and maintenance object, a degree of association between to-be-processed alarm information corresponding to the at least one target operation and maintenance object, where the target operation and maintenance object is an object whose information quantity of the alarm information corresponding to the operation and maintenance object is greater than or equal to a preset value, the weight is used to represent a fault range corresponding to the operation and maintenance object, and the to-be-processed alarm information is information in multiple alarm information;
a third determining module 504, configured to determine, from the target operation and maintenance object, a root cause alarm object corresponding to each target alarm information set, where each target alarm information set is composed of multiple target alarm information, the target alarm information is alarm information whose association degree in the to-be-processed alarm information is greater than a preset association degree, and the root cause alarm object is an object that causes an alarm phenomenon corresponding to the target alarm information set.
It should be noted that the obtaining module 501, the first determining module 502, the second determining module 503, and the third determining module 504 correspond to steps S201 to S204 in the foregoing embodiment. The examples and application scenarios implemented by the four modules are the same as those of the corresponding steps, but are not limited to what is disclosed in Embodiment 1.
Optionally, the first determining module includes: the first determining submodule is used for determining the object types of a plurality of operation and maintenance objects corresponding to each alarm information from the operation and maintenance object list based on the alarm characteristics of each alarm information; and the second determining submodule is used for determining a plurality of operation and maintenance objects corresponding to each alarm information based on the network topology structure of the target network, the alarm object corresponding to each alarm information and the object type of the operation and maintenance object corresponding to each alarm information.
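The two-step lookup described above (alarm characteristics select the object types, then the network topology resolves the concrete objects) might be sketched as follows. This is only an illustrative sketch: the rule table, the topology mapping, and the names `OBJECT_TYPE_RULES`, `TOPOLOGY` and `resolve_om_objects` are hypothetical and do not come from the application.

```python
from typing import Dict, List

# Hypothetical operation and maintenance object list:
# alarm feature -> object types that should be extracted for it.
OBJECT_TYPE_RULES: Dict[str, List[str]] = {
    "port_down": ["port", "link", "device"],
    "cpu_high": ["device"],
}

# Hypothetical topology: alarm object -> {object type: concrete object}.
TOPOLOGY: Dict[str, Dict[str, str]] = {
    "router-A:eth0": {
        "port": "router-A:eth0",
        "link": "link-1",
        "device": "router-A",
    },
}


def resolve_om_objects(alarm_feature: str, alarm_object: str) -> List[str]:
    """Return the concrete operation and maintenance objects for one alarm."""
    # Step 1: the alarm feature decides which object types are relevant.
    object_types = OBJECT_TYPE_RULES.get(alarm_feature, [])
    # Step 2: the topology maps the alarm object to objects of those types.
    related = TOPOLOGY.get(alarm_object, {})
    return [related[t] for t in object_types if t in related]
```

An alarm whose feature or alarm object is unknown simply resolves to an empty list in this sketch; a real system would need its own fallback.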
Optionally, the alarm analysis device further includes: a fourth determining module, configured to determine the generation time of each of the plurality of alarm information; and a fifth determining module, configured to determine the alarm information whose generation times fall within the same preset time range as the to-be-processed alarm information.
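As a rough illustration of selecting alarms whose generation times fall within the same preset time range, the sketch below buckets alarms into fixed-width windows. The fixed-window scheme, the 60-second default, and the names `Alarm` and `group_by_window` are assumptions made for the example; the application does not state how the time range is defined.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Alarm(NamedTuple):
    alarm_id: str
    generated_at: float  # generation time, in seconds since the epoch


def group_by_window(alarms: List[Alarm],
                    window_seconds: float = 60.0) -> Dict[int, List[Alarm]]:
    """Bucket alarms by fixed time window (assumed scheme, not from the source)."""
    buckets: Dict[int, List[Alarm]] = defaultdict(list)
    for alarm in alarms:
        # Alarms with the same window index share one preset time range.
        buckets[int(alarm.generated_at // window_seconds)].append(alarm)
    return dict(buckets)
```

Each bucket then holds the to-be-processed alarm information for one time range. A sliding window would avoid splitting alarms that straddle a bucket boundary, at the cost of more bookkeeping.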
Optionally, the second determining module further includes: a splitting module, configured to split a plurality of to-be-processed alarm information into at least one to-be-processed alarm information set, where the same target operation and maintenance object exists among the at least one operation and maintenance object corresponding to each to-be-processed alarm information in the set; and a third determining sub-module, configured to determine the degree of association between the to-be-processed alarm information in each to-be-processed alarm information set based on the weight corresponding to each target operation and maintenance object in the at least one target operation and maintenance object.
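One simple reading of this splitting step is to group the to-be-processed alarms by each operation and maintenance object they share, and to keep only objects whose alarm count reaches the preset value (the target operation and maintenance objects). The sketch below follows that reading; the function name `split_into_pending_sets`, the input shape, and the default threshold of 2 are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def split_into_pending_sets(alarm_objects: Dict[str, Set[str]],
                            min_count: int = 2) -> Dict[str, List[str]]:
    """Map each target O&M object to the alarms that share it.

    alarm_objects: alarm id -> O&M objects extracted for that alarm.
    min_count: assumed preset value an object must reach to be a target.
    """
    by_object: Dict[str, List[str]] = defaultdict(list)
    for alarm_id in sorted(alarm_objects):
        for obj in alarm_objects[alarm_id]:
            by_object[obj].append(alarm_id)
    # Keep only objects with enough alarms; each value is one pending set.
    return {obj: ids for obj, ids in by_object.items() if len(ids) >= min_count}
```

Under this reading an alarm may appear in more than one set when it shares several target objects with other alarms.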
Optionally, the third determining sub-module further includes: a fourth determining sub-module, configured to determine the information quantity of the alarm information corresponding to the target operation and maintenance object corresponding to each to-be-processed alarm information set; a fifth determining sub-module, configured to perform a calculation based on the information quantity and the weight corresponding to the target operation and maintenance object corresponding to each to-be-processed alarm information set, to obtain a calculation result corresponding to each target operation and maintenance object; and a sixth determining sub-module, configured to determine, based on the calculation result, the degree of association between the to-be-processed alarm information in each to-be-processed alarm information set.
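This passage does not give the formula that combines the information quantity with the weight. Purely as an assumption for illustration, the sketch below multiplies the alarm count of each target operation and maintenance object by its weight; the name `score_target_objects` and the default weight of 1.0 are likewise hypothetical. How the calculation result is mapped to a degree of association is not shown, since the text does not specify it.

```python
from typing import Dict


def score_target_objects(alarm_counts: Dict[str, int],
                         weights: Dict[str, float]) -> Dict[str, float]:
    """Combine alarm count and weight per target O&M object.

    The product count * weight is an assumed formula, not the
    application's; objects without a weight default to 1.0.
    """
    return {
        obj: count * weights.get(obj, 1.0)
        for obj, count in alarm_counts.items()
    }
```

Because the weight represents the fault range of an object, an object with a wider fault range contributes more per alarm under this assumed formula.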
Optionally, the third determining module further includes: the seventh determining submodule is used for determining the alarm information set to be processed with the association degree larger than the preset association degree as a target alarm information set and determining the alarm information to be processed in the target alarm information set as target alarm information; and the eighth determining sub-module is used for determining a root cause alarm object corresponding to each target alarm information set from the target operation and maintenance objects according to the calculation result based on the normal distribution algorithm.
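The "normal distribution algorithm" is not spelled out in this passage. One plausible interpretation, offered only as a sketch, is to standardize the calculation results and treat objects whose z-score exceeds a threshold as root cause candidates. The threshold of 1.0, the use of the population standard deviation, and the name `pick_root_causes` are all assumptions.

```python
from statistics import mean, pstdev
from typing import Dict, List


def pick_root_causes(scores: Dict[str, float],
                     z_threshold: float = 1.0) -> List[str]:
    """Return objects whose score is unusually high (assumed z-score rule)."""
    if not scores:
        return []
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All scores are equal, so no object stands out; keep every candidate.
        return sorted(scores)
    return sorted(
        obj for obj, score in scores.items()
        if (score - mu) / sigma > z_threshold
    )
```

For example, with scores of 12.0 for `router-A` and 2.0 for each of three other objects, only `router-A` lies more than one standard deviation above the mean and is returned.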
Optionally, the alarm analysis device further includes: the acquisition submodule is used for acquiring network asset information of a target network; and the eighth determining submodule is used for determining an operation and maintenance object list corresponding to the target network based on the network asset information, wherein the operation and maintenance object list is used for recording the object type of the operation and maintenance object.
Example 3
According to another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium, in which a computer program is stored, wherein the computer program is configured to execute the above-mentioned alarm analysis method when running.
Example 4
According to another aspect of the embodiments of the present invention, there is also provided an electronic device. Fig. 6 is a schematic diagram of an alternative electronic device according to the embodiments of the present invention. As shown in fig. 6, the electronic device includes one or more processors, and a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to perform the above-mentioned alarm analysis method.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis, and reference may be made to the related description of other embodiments for parts that are not described in detail in a certain embodiment.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, a division of a unit may be a division of a logic function, and an actual implementation may have another division, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or may not be executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and various other media capable of storing program code.
The above is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make a number of modifications and refinements without departing from the principle of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. An alarm analysis method, comprising:
acquiring a plurality of alarm information corresponding to a target network, and determining an alarm object corresponding to each alarm information, wherein the alarm object is an object in the target network, and the alarm information is generated when the target network is abnormal;
determining a plurality of operation and maintenance objects corresponding to each alarm information based on each alarm information and an alarm object corresponding to each alarm information, wherein the operation and maintenance objects are objects in the target network and have an association relationship with the alarm objects, and the probability of the operation and maintenance objects giving an alarm when the current alarm information is generated is greater than a preset probability;
determining the degree of association between to-be-processed alarm information corresponding to at least one target operation and maintenance object based on the weight corresponding to each target operation and maintenance object in the at least one target operation and maintenance object, wherein the target operation and maintenance object is an object of which the information quantity of the alarm information corresponding to the operation and maintenance object is greater than or equal to a preset numerical value, the weight is used for representing the fault range corresponding to the operation and maintenance object, and the to-be-processed alarm information is information in the alarm information;
and determining a root cause alarm object corresponding to each target alarm information set from the target operation and maintenance objects, wherein each target alarm information set is composed of a plurality of target alarm information, the target alarm information is alarm information of which the association degree in the alarm information to be processed is greater than a preset association degree, and the root cause alarm object is an object causing an alarm phenomenon corresponding to the target alarm information set.
2. The method of claim 1, wherein determining a plurality of operation and maintenance objects corresponding to each alarm information based on each alarm information and an alarm object corresponding to each alarm information comprises:
determining object types of a plurality of operation and maintenance objects corresponding to each alarm information from an operation and maintenance object list based on the alarm characteristics of each alarm information;
and determining a plurality of operation and maintenance objects corresponding to each alarm information based on the network topology structure of the target network, the alarm object corresponding to each alarm information and the object type of the operation and maintenance object corresponding to each alarm information.
3. The method according to claim 1, wherein before determining the degree of association between the to-be-processed alarm information corresponding to the at least one target operation and maintenance object based on the weight corresponding to each target operation and maintenance object in the at least one target operation and maintenance object, the method further comprises:
determining the generation time of each of the plurality of alarm information;
and determining the alarm information with the generation time within the same preset time range as the alarm information to be processed.
4. The method according to any one of claims 1 to 3, wherein determining the degree of association between the alarm information to be processed corresponding to at least one target operation and maintenance object based on the weight corresponding to each target operation and maintenance object in the at least one target operation and maintenance object comprises:
splitting a plurality of alarm information to be processed into at least one alarm information set to be processed, wherein the same target operation and maintenance object exists in at least one operation and maintenance object corresponding to each alarm information to be processed in the alarm information set to be processed;
and determining the degree of association between the alarm information to be processed in each alarm information set to be processed based on the weight corresponding to each target operation and maintenance object in the at least one target operation and maintenance object.
5. The method according to claim 4, wherein determining the degree of association between the alarm information to be processed in each set of alarm information to be processed based on the weight corresponding to each target operation and maintenance object in the at least one target operation and maintenance object comprises:
determining the information quantity of the alarm information corresponding to the target operation and maintenance object corresponding to each alarm information set to be processed;
calculating based on the information quantity and weight corresponding to the target operation and maintenance object corresponding to each alarm information set to be processed to obtain a calculation result corresponding to each target operation and maintenance object;
and determining the association degree between the alarm information to be processed in each alarm information set to be processed based on the calculation result.
6. The method of claim 5, wherein determining a root cause alarm object corresponding to each target alarm information set from the target operation and maintenance objects comprises:
determining a set of to-be-processed alarm information with the association degree larger than a preset association degree as the target alarm information set, and determining the to-be-processed alarm information in the target alarm information set as the target alarm information;
and determining a root cause alarm object corresponding to each target alarm information set from the target operation and maintenance objects according to the calculation result based on a normal distribution algorithm.
7. The method of claim 2, wherein before determining the plurality of operation and maintenance objects corresponding to each alarm information based on each alarm information and the alarm object corresponding to each alarm information, the method further comprises:
acquiring network asset information of the target network;
determining an operation and maintenance object list corresponding to the target network based on the network asset information, wherein the operation and maintenance object list is used for recording the object type of the operation and maintenance object.
8. An alarm analysis apparatus, comprising:
an obtaining module, configured to acquire a plurality of alarm information corresponding to a target network and determine an alarm object corresponding to each alarm information, wherein the alarm object is an object in the target network, and the alarm information is generated when the target network is abnormal;
the first determining module is configured to determine, based on each alarm information and an alarm object corresponding to each alarm information, a plurality of operation and maintenance objects corresponding to each alarm information, where the operation and maintenance objects are objects in the target network and have an association relationship with the alarm object, and a probability that an alarm occurs in the operation and maintenance object when current alarm information is generated is greater than a preset probability;
the second determining module is configured to determine a degree of association between to-be-processed alarm information corresponding to at least one target operation and maintenance object based on a weight corresponding to each target operation and maintenance object in the at least one target operation and maintenance object, where the target operation and maintenance object is an object whose information quantity of the alarm information corresponding to the operation and maintenance object is greater than or equal to a preset value, the weight is used to represent a fault range corresponding to the operation and maintenance object, and the to-be-processed alarm information is information in the multiple alarm information;
and a third determining module, configured to determine a root cause alarm object corresponding to each target alarm information set from the target operation and maintenance object, where each target alarm information set is composed of multiple target alarm information, the target alarm information is alarm information whose association degree in the to-be-processed alarm information is greater than a preset association degree, and the root cause alarm object is an object that causes an alarm phenomenon corresponding to the target alarm information set.
9. A computer-readable storage medium, in which a computer program is stored, wherein the computer program is arranged to execute the alarm analysis method of any of claims 1 to 7 when executed.
10. An electronic device, wherein the electronic device comprises one or more processors; and a memory for storing one or more programs that, when executed by the one or more processors, cause the one or more processors to perform the alarm analysis method of any of claims 1 to 7.
CN202211131508.6A 2022-09-16 2022-09-16 Alarm analysis method and device, computer readable storage medium and electronic equipment Pending CN115529219A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211131508.6A CN115529219A (en) 2022-09-16 2022-09-16 Alarm analysis method and device, computer readable storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211131508.6A CN115529219A (en) 2022-09-16 2022-09-16 Alarm analysis method and device, computer readable storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN115529219A true CN115529219A (en) 2022-12-27

Family

ID=84696798

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211131508.6A Pending CN115529219A (en) 2022-09-16 2022-09-16 Alarm analysis method and device, computer readable storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN115529219A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116089224A (en) * 2023-04-11 2023-05-09 宇动源(北京)信息技术有限公司 Alarm analysis method, alarm analysis device, calculation node and computer readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109634819A (en) * 2018-10-26 2019-04-16 阿里巴巴集团控股有限公司 Alarm root cause localization method and device, electronic equipment
CN109684181A (en) * 2018-11-20 2019-04-26 华为技术有限公司 Alarm root cause analysis method, device, equipment and storage medium
WO2021217865A1 (en) * 2020-04-29 2021-11-04 平安科技(深圳)有限公司 Method and apparatus for locating root cause of alarm, computer device, and storage medium
EP3975048A1 (en) * 2019-09-29 2022-03-30 ZTE Corporation Method for constructing cloud network alarm root cause relational tree model, device, and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
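Yuan Jing; Li Dawei; Lu Shaowen; Lei Peng: "Research on big data analysis algorithms for alarm correlation in an intelligent monitoring application platform", 电信工程技术与标准化 (Telecom Engineering Technics and Standardization), no. 05, 15 May 2019 (2019-05-15) *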

Similar Documents

Publication Publication Date Title
US11710131B2 (en) Method and apparatus of identifying a transaction risk
CN105095056A (en) Method for monitoring data in data warehouse
CN113360722B (en) Fault root cause positioning method and system based on multidimensional data map
CN111190955B (en) Management, distribution and dispatching through checking method based on knowledge graph
CN112769605B (en) Heterogeneous multi-cloud operation and maintenance management method and hybrid cloud platform
CN111352759A (en) Alarm root cause judgment method and device
CN115809183A (en) Method for discovering and disposing information-creating terminal fault based on knowledge graph
CN111767173A (en) Network equipment data processing method and device, computer equipment and storage medium
CN113641526A (en) Alarm root cause positioning method and device, electronic equipment and computer storage medium
CN112988509A (en) Alarm message filtering method and device, electronic equipment and storage medium
CN115529219A (en) Alarm analysis method and device, computer readable storage medium and electronic equipment
CN113949652B (en) User abnormal behavior detection method and device based on artificial intelligence and related equipment
CN114978877A (en) Exception handling method and device, electronic equipment and computer readable medium
US20180129963A1 (en) Apparatus and method of behavior forecasting in a computer infrastructure
CN114172785A (en) Alarm information processing method, device, equipment and storage medium
CN111414355A (en) Offshore wind farm data monitoring and storing system, method and device
CN115514627A (en) Fault root cause positioning method and device, electronic equipment and readable storage medium
CN116089446A (en) Optimization control method and device for structured query statement
CN115408236A (en) Log data auditing system, method, equipment and medium
CN115269519A (en) Log detection method and device and electronic equipment
CN112612679A (en) System running state monitoring method and device, computer equipment and storage medium
CN112966056A (en) Information processing method, device, equipment, system and readable storage medium
CN104883273A (en) Method and system for processing service influence model in virtualized service management platform
CN112905479B (en) Cloud platform-based method and system for determining optimal path of alarm accident root cause
CN115913894A (en) Batch operation analysis alarm method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination