CN111506478A - Method for realizing alarm management control based on artificial intelligence - Google Patents

Method for realizing alarm management control based on artificial intelligence

Info

Publication number
CN111506478A
Authority
CN
China
Prior art keywords
alarm
data
database
historical
service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010305223.4A
Other languages
Chinese (zh)
Inventor
何雄飞
朱广文
张建民
章俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Haofang Information Technology Co ltd
Original Assignee
Shanghai Haofang Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Haofang Information Technology Co ltd
Priority to CN202010305223.4A
Publication of CN111506478A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G06F11/3072 Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions

Abstract

The invention relates to a method for realizing alarm management control based on artificial intelligence, comprising the following steps: performing dynamic baseline alarming, in which the operating trend of the service is learned automatically from historical data, a safe region for service operation is established from the service's upward and downward fluctuation, and abnormal fluctuation of the service in different time periods is monitored; performing alarm rule mining, in which historical data are analyzed and mined and the rule database is incrementally updated and supplemented; performing alarm analysis processing, in which current alarms are processed online; and associating a fault knowledge base, in which solutions and experience are pushed through similarity-based association recommendation for problem events. The method overcomes the drawbacks of manually set fixed thresholds, greatly reduces the operation and maintenance workload, suppresses alarm storms effectively, brings alarm messages under unified management and control, reduces the interference of massive alarm volumes with operation and maintenance personnel, and improves problem-solving efficiency.

Description

Method for realizing alarm management control based on artificial intelligence
Technical Field
The invention relates to the field of artificial intelligence, in particular to the technical field of intelligent operation and maintenance, and specifically relates to a method for realizing alarm management control based on artificial intelligence.
Background
With the rapid spread of information technology, business systems are widely deployed across industries, particularly healthcare, large enterprises, finance and education, and users depend on them more and more heavily. Traditional alarm management therefore faces growing challenges. At present the main problems are as follows:
(1) Traditional alarm management generally uses fixed thresholds that operation and maintenance personnel must set manually. This approach involves a heavy workload and depends strongly on the personnel's experience, and an improperly set threshold can cause alarm storms or missed alarms. When the monitored environment changes, the original fixed threshold no longer meets the needs of alarm management.
(2) Monitoring tools generate massive amounts of alarm information, which may contain many redundant alarms or even form alarm storms. This interferes greatly with operation and maintenance personnel and lowers their working efficiency.
(3) In traditional operation and maintenance management, fault handling depends heavily on the experience of the personnel, but that experience cannot cover every kind of fault, and a lack of experience can lead to low efficiency or wrong decisions.
By means of artificial intelligence technology, the invention provides a one-stop solution covering the discovery, diagnosis and resolution of alarm events, helping operation and maintenance personnel handle important events faster and more efficiently, shortening fault time and service interruption time, and improving the reliability and performance of IT systems.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and to provide a method for realizing alarm management control based on artificial intelligence that offers good reliability, high performance and a wide range of application.
To achieve the above purpose, the method for realizing alarm management control based on artificial intelligence of the invention is as follows:
The method is mainly characterized by comprising the following steps:
(1) performing dynamic baseline alarming: automatically learning the operating trend of the service from historical data, establishing a safe region for service operation according to the service's upward and downward fluctuation, and monitoring abnormal fluctuation of the service in different time periods;
(2) performing alarm rule mining: analyzing and mining historical data, and incrementally updating and supplementing the rule database by mining rules periodically;
(3) performing alarm analysis processing: processing current alarms online and in real time;
(4) associating a fault knowledge base: pushing solutions and experience through similarity-based association recommendation for problem events.
Preferably, the step (1) specifically comprises the following steps:
(1.1) exporting a historical index data file from the business system;
(1.2) data preprocessing: reading the historical index data, checking the validity of all data, and filtering out invalid data;
(1.3) identifying the trend and periodic variation of the historical index data through the model's learning of the historical data, predicting how the index will change over a future period, and giving the upper and lower limits over that period according to the distribution of the historical data;
(1.4) judging whether the index under detection is above the upper limit of the baseline or below the lower limit; if so, judging that an abnormality has occurred; otherwise, judging that no abnormality has occurred.
Preferably, the step (2) specifically comprises the following steps:
(2.1) exporting a historical alarm data file from the business system;
(2.2) data preprocessing: reading the historical alarm data, checking the validity of all data, filtering out invalid data, encoding the alarm data, and importing the encoded alarm data into the alarm database;
(2.3) data clustering: extracting the data required for clustering from the alarm database, and partitioning the data by time and geographical location;
(2.4) rule mining: obtaining the clustering results, extracting alarm data from the alarm database, and performing association analysis on each cluster of alarm data to mine rules;
(2.5) importing the mined rules into the alarm database and screening for valid rules.
Preferably, the step (3) specifically includes the following steps:
(3.1) passing the data to the data interface through a background interface;
(3.2) reading the current alarm data through the data interface and, after corresponding processing, importing it into the alarm database;
(3.3) data clustering: extracting the key fields needed to analyze the alarm data from the alarm database, clustering the current alarm data, and partitioning the data by time and geographical location;
(3.4) alarm processing analysis: obtaining the clustering results, extracting alarm data from the alarm database, traversing all alarm rules in the rule database, and performing rule engine matching on each cluster of alarm data to obtain the root alarm and achieve alarm compression.
Preferably, the step (4) specifically includes the following steps:
according to the root alarm, searching the fault knowledge base for the solution with the highest matching degree using a text similarity algorithm.
The method for realizing alarm management control based on artificial intelligence overcomes the drawbacks of manually set fixed thresholds and greatly reduces the operation and maintenance workload, the missed-alarm rate and the false-alarm rate. It suppresses alarm storms effectively and brings alarm messages under unified management and control, reducing the interference of massive alarm volumes with operation and maintenance personnel and improving problem-solving efficiency. It enables knowledge sharing, reduces service processes, and lowers operation and maintenance costs. It improves operation and maintenance response speed and service quality. It also prevents knowledge loss and helps the enterprise retain talent.
Drawings
FIG. 1 is a flow chart of a method for implementing alarm management control based on artificial intelligence according to the present invention.
Detailed Description
In order to more clearly describe the technical contents of the present invention, the following further description is given in conjunction with specific embodiments.
The method for realizing alarm management control based on artificial intelligence of the invention comprises the following steps:
(1) performing dynamic baseline alarming: automatically learning the operating trend of the service from historical data, establishing a safe region for service operation according to the service's upward and downward fluctuation, and monitoring abnormal fluctuation of the service in different time periods;
(1.1) exporting a historical index data file from the business system;
(1.2) data preprocessing: reading the historical index data, checking the validity of all data, and filtering out invalid data;
(1.3) identifying the trend and periodic variation of the historical index data through the model's learning of the historical data, predicting how the index will change over a future period, and giving the upper and lower limits over that period according to the distribution of the historical data;
(1.4) judging whether the index under detection is above the upper limit of the baseline or below the lower limit; if so, judging that an abnormality has occurred; otherwise, judging that no abnormality has occurred;
(2) performing alarm rule mining: analyzing and mining historical data, and incrementally updating and supplementing the rule database by mining rules periodically;
(2.1) exporting a historical alarm data file from the business system;
(2.2) data preprocessing: reading the historical alarm data, checking the validity of all data, filtering out invalid data, encoding the alarm data, and importing the encoded alarm data into the alarm database;
(2.3) data clustering: extracting the data required for clustering from the alarm database, and partitioning the data by time and geographical location;
(2.4) rule mining: obtaining the clustering results, extracting alarm data from the alarm database, and performing association analysis on each cluster of alarm data to mine rules;
(2.5) importing the mined rules into the alarm database and screening for valid rules;
(3) performing alarm analysis processing: processing current alarms online and in real time;
(3.1) passing the data to the data interface through a background interface;
(3.2) reading the current alarm data through the data interface and, after corresponding processing, importing it into the alarm database;
(3.3) data clustering: extracting the key fields needed to analyze the alarm data from the alarm database, clustering the current alarm data, and partitioning the data by time and geographical location;
(3.4) alarm processing analysis: obtaining the clustering results, extracting alarm data from the alarm database, traversing all alarm rules in the rule database, and performing rule engine matching on each cluster of alarm data to obtain the root alarm and achieve alarm compression;
(4) associating a fault knowledge base: pushing solutions and experience through similarity-based association recommendation for problem events;
according to the root alarm, searching the fault knowledge base for the solution with the highest matching degree using a text similarity algorithm.
In a specific implementation of the invention, an alarm management method based on artificial intelligence is proposed to address the problems of traditional alarm management. It resolves the following: (1) It overcomes the drawbacks of manually set fixed thresholds by intelligently analyzing the trend of the data and its dynamic limits, so that alarms are judged intelligently. (2) It merges redundant alarms through similarity and correlation analysis and identifies the root alarm, giving operation and maintenance personnel effective alarm information and greatly reducing the difficulty of their work. (3) It accumulates and automatically reuses operation and maintenance knowledge and provides decision support, making operation more intelligent.
Alarm management comprises four parts: dynamic baseline alarming, alarm rule mining, alarm analysis processing, and fault knowledge base association recommendation. The dynamic baseline is built on historical data: after deep learning with an intelligent algorithm, the value at each time point over a future period is predicted and taken as the baseline, and the deviation (percentage difference) between the actual value and the baseline is used for monitoring and alarming. Clustering and association algorithms are used to analyze a large amount of historical alarm data and obtain association rules among alarms, forming a rule database. The current alarms are then analyzed against the rule database to identify the root alarms and derived alarms among them. Finally, solutions and experience are pushed to the user through association recommendation from the fault knowledge base, so that faults are resolved quickly.
The overall solution of the present invention is divided into four parts:
1) Dynamic baseline alarming:
Big data analysis is applied through automated AI learning: the operating trend of the service is learned automatically from historical data, and a safe region for service operation is then established according to the service's upward and downward fluctuation. Monitoring is achieved by detecting abnormal fluctuation of the service in different time periods, so that operation and maintenance personnel can find faults in time.
2) Alarm rule mining:
Association rules among alarms are obtained through big data analysis of historical alarm data and form a rule database. The alarm rule mining stage works offline: it analyzes and mines historical data and has no real-time requirement. During initial deployment, a large number of historical alarms are collected and rule mining is initialized to form the rule database; once deployed in the network, the rule database is incrementally updated and supplemented by mining rules periodically.
3) Alarm analysis processing:
The current alarms are analyzed using the association rules in the rule database to identify the root alarms and derived alarms among them. The alarm analysis processing stage works online on current alarms and has a real-time requirement. Once the software is deployed, alarms are processed in real time.
4) Fault knowledge base association recommendation:
A large operation and maintenance knowledge base is built from expert knowledge. Solutions and experience are pushed to the user through similarity-based association recommendation for problem events, providing knowledge support for resolving faults quickly.
The flows of the method for realizing alarm management control based on artificial intelligence are as follows:
A) Dynamic baseline alarm flow:
1) The business system exports the historical index data file.
2) Data preprocessing: after the historical index data are read, the validity of all data is checked and invalid data are filtered out.
3) Through the model's learning of the historical data, the trend and periodic variation of the historical index data are identified and the change of the index over a future period is predicted. At the same time, the upper and lower limits over that period are given according to the distribution of the historical data.
4) When the index under detection is above the upper limit of the baseline or below the lower limit, an abnormality is judged to have occurred.
B) Alarm rule mining flow:
1) The business system exports the historical alarm data file, which serves as the training data for rule mining.
2) Data preprocessing: after the historical alarm data are read, the validity of all data is checked, invalid data are filtered out, and the alarm data are encoded and imported into the alarm database.
3) Data clustering: the data required for clustering are extracted from the alarm database and partitioned by time and geographical location using the DBSCAN algorithm.
4) Rule mining: the clustering results are obtained, alarm data are extracted from the alarm database, and association analysis is performed on each cluster of alarm data using the improved FP-Growth algorithm to mine rules.
5) The mined rules are imported into the alarm database, and valid rules are screened through manual review by experts.
C) Alarm analysis processing flow:
1) The business system passes the data to the data interface through a background interface.
2) The data interface imports the current alarm data into the alarm database after corresponding processing.
3) Data clustering: the key fields needed to analyze the alarm data are extracted from the alarm database, the current alarm data are clustered, and the data are partitioned by time and geographical location.
4) Alarm processing analysis: the clustering results are obtained and alarm data are extracted from the alarm database. All alarm rules in the rule database are traversed, and rule engine matching is performed on each cluster of alarm data to obtain the root alarm and achieve alarm compression.
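As an illustration only, the rule engine matching in step 4) might be sketched as follows. The rule format (a root alarm type mapped to the alarm types it is assumed to cause), the function name `compress_cluster`, and the alarm type names are hypothetical; this description does not specify them.

```python
def compress_cluster(alarm_types, rules):
    """Split one cluster of alarms into root alarms and derived alarms.

    alarm_types: alarm type names observed in a single cluster.
    rules: hypothetical rule database, {root_type: set of derived types}.
    """
    present = set(alarm_types)
    derived = set()
    for root, children in rules.items():
        # A rule fires only when its root alarm is present in the cluster.
        if root in present:
            derived |= present & set(children)
    # Anything not explained by a fired rule is reported as a root alarm.
    roots = sorted(present - derived)
    return roots, sorted(derived)


# Hypothetical rule and cluster, for illustration only.
rules = {"LINK_DOWN": {"BGP_PEER_LOST", "SERVICE_TIMEOUT"}}
cluster = ["SERVICE_TIMEOUT", "LINK_DOWN", "BGP_PEER_LOST", "SERVICE_TIMEOUT"]
roots, derived = compress_cluster(cluster, rules)
```

In this sketch four raw alarms are compressed to one root alarm; alarms that no rule explains are kept as roots so that nothing is silently dropped.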
D) Fault knowledge base association recommendation flow:
According to the root alarm obtained in the alarm analysis processing flow, a text similarity algorithm is used to find the Top5 solutions with the highest matching degree in the fault knowledge base, and these are recommended to the user.
The key technology of the invention comprises four parts: a dynamic threshold algorithm, an alarm clustering algorithm, an alarm correlation analysis algorithm, and a text similarity algorithm:
(1) Dynamic threshold algorithm
The dynamic threshold uses statistical data analysis to solve the false-alarm and missed-alarm problems of static thresholds, saving manual maintenance cost and reducing monitoring risk to some extent. However, the dynamic threshold has limitations for faults that show only slight fluctuation and a slow, continuous decline, so a mean shift model is introduced to find the change point. A change point is a point at which, after a slight decline has persisted for some time and the accumulated change has reached a certain level, the mean of the monitored index over a period before the point differs from the mean over a period after it.
The main steps of the dynamic threshold algorithm are as follows:
A) Selecting samples: samples from roughly the past 90 days are generally recommended, depending on the user's needs.
B) Screening out abnormal samples: a Gaussian distribution function is used to filter out samples whose function value is less than 0.01, or whose deviation in units of standard deviation exceeds 1 in absolute value.
C) Truncating the samples: on the basis of B), a mean shift model is used to test the historical samples segment by segment along the time series. If the historical samples show periodic change or continuous monotonic change, the mean shift model is iterated repeatedly to find the mean shift points, and the sample sequence nearest to the current date (that is, the most stable sequence in the most recent period) is retained. One further point needs attention: holiday and workday samples are selected separately. The threshold for a workday is predicted from workday samples, and holidays are likewise handled separately. In other words, prediction samples are selected along three dimensions: date, weekend, and stationarity.
D) Predicting the reference value: after the screening in B) and the truncation in C), the remaining samples are essentially the most suitable ones. Keeping their date order, the reference value for the target date is predicted by exponential smoothing, and the upper and lower threshold limits are then calculated from the reference value using the sensitivity or threshold coefficient.
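Step D) can be sketched roughly as follows, assuming simple (single) exponential smoothing and a symmetric band around the reference value. The smoothing factor `alpha`, the coefficient value, and all function names are illustrative assumptions; this description does not fix them.

```python
def exponential_smoothing_forecast(samples, alpha=0.3):
    """One-step-ahead reference value from date-ordered samples.

    alpha is an assumed smoothing factor in (0, 1]; larger values
    give more weight to the most recent samples.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    level = samples[0]
    for x in samples[1:]:
        level = alpha * x + (1 - alpha) * level
    return level


def threshold_band(reference, coefficient=0.2):
    """Lower and upper limits as a fraction of the reference value (assumed form)."""
    margin = abs(reference) * coefficient
    return reference - margin, reference + margin


def is_anomalous(value, lower, upper):
    """An index is abnormal when it leaves the band on either side."""
    return value > upper or value < lower


# Illustrative data only: a flat history yields a reference value of about 100.
history = [100.0, 100.0, 100.0, 100.0]
reference = exponential_smoothing_forecast(history)
lower, upper = threshold_band(reference, coefficient=0.2)
```

A higher threshold coefficient widens the band and makes the alarm less sensitive; a real deployment would also keep separate histories for workdays and holidays, as step C) requires.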
Remark:
Principle of the mean shift model: a slow-decline trend that is hard to identify directly is converted into a CUSUM time series, whose trend is obvious. The CUSUM time series describes the accumulated deviation of each point of the monitored time series from the mean, and it changes monotonically on each side of the change point: it increases monotonically to the left of the change point and decreases monotonically to the right. A change point is a point at which, after a slight decline has persisted for some time and the accumulated change has reached a certain level, the mean of the monitored index over a period before the point differs from the mean over a period after it.
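The CUSUM idea in the remark can be illustrated with a minimal sketch. It assumes a single change point and takes the index where the cumulative sum of deviations from the overall mean is largest in absolute value; the function name and the sample data are hypothetical, and the repeated segment-wise iteration mentioned in step C) is not shown.

```python
from itertools import accumulate
from statistics import fmean


def cusum_change_point(series):
    """Return (index, cusum) for a single suspected mean shift.

    cusum[t] is the accumulated deviation of series[0..t] from the
    overall mean. For a downward shift it rises up to the change point
    and falls afterwards, so the extreme of |cusum| marks the last
    point before the shift.
    """
    mean = fmean(series)
    cusum = list(accumulate(x - mean for x in series))
    index = max(range(len(cusum)), key=lambda i: abs(cusum[i]))
    return index, cusum


# Illustrative series: the level drops slightly from 10.0 to 9.0 after index 5.
series = [10.0] * 6 + [9.0] * 6
index, cusum = cusum_change_point(series)
```

Here a decline of only 10 percent, which a band-based threshold might miss, produces a clear peak in the CUSUM series at the last point before the shift.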
(2) Alarm clustering algorithm
When traditional association rule analysis is applied to alarm information, the alarms are usually counted within a hard sliding time window. A hard window cannot make full use of the information: it may put too many alarms into one class, or cut alarms that belong to the same fault into different classes, so that different root alarms and their derived alarms are mixed together and the statistical result is not accurate enough. The alarm information is therefore clustered first: different root alarms and their derived alarms are separated according to the time and geographical attributes of the alarms, so that each class represents one root alarm and its derived alarms. Association rule analysis is then performed, which improves accuracy.
Clustering is based on location and time information: the alarm data are first "hard" partitioned by exact location information (network element), and are then clustered in the time dimension with the DBSCAN algorithm according to the start time and end time of each alarm.
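A minimal sketch of this two-stage clustering, written in plain Python rather than with a library so that it stays self-contained. It assumes each alarm is a dict with hypothetical keys `id`, `element`, `start` and `end` (times in seconds), uses Euclidean distance on the (start, end) pair, and picks `eps` and `min_pts` arbitrarily; none of these choices is specified in this description.

```python
from collections import defaultdict
from math import dist


def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point: 0, 1, ... or -1 for noise."""
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # j is a core point, keep expanding
                queue.extend(nbrs)
    return labels


def cluster_alarms(alarms, eps=60.0, min_pts=2):
    """Hard-partition by network element, then DBSCAN on (start, end)."""
    by_element = defaultdict(list)
    for alarm in alarms:
        by_element[alarm["element"]].append(alarm)
    clusters = []
    for element, group in by_element.items():
        labels = dbscan([(a["start"], a["end"]) for a in group], eps, min_pts)
        grouped = defaultdict(list)
        for alarm, label in zip(group, labels):
            grouped[label].append(alarm["id"])
        clusters.extend((element, label, ids) for label, ids in grouped.items())
    return clusters


# Illustrative alarms only.
alarms = [
    {"id": 1, "element": "NE1", "start": 0, "end": 10},
    {"id": 2, "element": "NE1", "start": 5, "end": 20},
    {"id": 3, "element": "NE1", "start": 1000, "end": 1010},
    {"id": 4, "element": "NE1", "start": 1005, "end": 1012},
    {"id": 5, "element": "NE2", "start": 3, "end": 8},
]
clusters = cluster_alarms(alarms)
```

Unlike a hard sliding window, the density-based grouping lets the two bursts on NE1 fall into separate clusters regardless of where a window boundary would have landed.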
(3) Alarm correlation analysis algorithm
The FP-Growth algorithm, an association rule mining algorithm, is used to mine alarm association rules, and it is improved in terms of execution efficiency and redundancy. The specific improvements are as follows:
A) FP-Growth algorithm execution efficiency improvement
In actual mining of alarm association rules, the data volume is huge, so the transactions formed from the alarms in a given time window contain too many items. A large number of conditional pattern bases and conditional pattern trees must then be processed, which seriously reduces the efficiency of the mining algorithm, to the point that no mining result is obtained for a long time.
To address this problem, the following approach is used: when the conditional pattern base of the lowest-frequency item on a path is computed, the frequent pattern bases of all items on the current path with a higher frequency than that item are computed as well, a "processed" mark is added to the nodes on the current path, and the other items are then handled in the order of the item header table. Before an item is handled, the path is checked for the "processed" mark. If the mark is absent, processing continues in sequence; if it is present, the item is skipped. This avoids the cost of backtracking along the same path repeatedly and can improve mining efficiency significantly.
B) FP-Growth algorithm redundancy improvement
After the conditional pattern base is generated, a subtree is built, the frequent pattern base is found, and it is combined with the current itemset into frequent itemsets. In this process the original program generates all subsets of the frequent pattern base and combines each with the current base itemset. When the frequent pattern base is long, for example 20 items, there are (2^20 - 1) non-empty subsets, so the number of combined frequent itemsets is very large. This follows from the fact that every subset of a frequent itemset is itself frequent; here the support of most frequent subsets is contributed by their higher-order frequent itemsets, so those subsets have no value in practice. This is the source of the large number of redundant frequent itemsets.
To address this problem, the following solution is adopted: mine the maximal frequent itemsets in the alarms, and compare support counts for all subsets of each maximal frequent itemset. Starting from a maximal frequent itemset of order n, if the support count of an (n-1)-order subset is greater than or equal to that of the maximal itemset but exceeds it by no more than a constant c (which can be tuned according to actual mining results), the subset's support is contributed by the maximal itemset and the subset does not occur frequently on its own, so it is removed. The frequent itemsets that pass this check are stored; the alarms in them are strongly correlated.
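The redundancy check can be sketched as below. This is one reading of the rule above, not an FP-Growth implementation: support is counted by brute force, every proper subset is compared directly with the maximal itemset, and subsets smaller than two items are ignored. The function names, the value of `c` and the transactions are assumptions for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)


def prune_subsets(maximal, transactions, c=1, min_size=2):
    """Keep the maximal itemset plus subsets with enough support of their own.

    A subset whose support exceeds that of the maximal itemset by at
    most c is treated as redundant: its support is assumed to come
    from the maximal itemset itself.
    """
    max_support = support(maximal, transactions)
    kept = {maximal: max_support}
    for size in range(len(maximal) - 1, min_size - 1, -1):
        for combo in combinations(sorted(maximal), size):
            subset = frozenset(combo)
            s = support(subset, transactions)
            if s - max_support > c:
                kept[subset] = s
    return kept


# Illustrative transactions: A and B also co-occur without C.
transactions = [frozenset("ABC")] * 3 + [frozenset("AB")] * 3 + [frozenset("C")]
kept = prune_subsets(frozenset("ABC"), transactions, c=1)
```

In this example {A, C} and {B, C} are dropped because they only ever appear as part of {A, B, C}, while {A, B} is kept because it occurs three more times on its own.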
In some scenarios the original FP-Growth algorithm suffers from poor execution efficiency and generates a large number of redundant rules, which seriously affects the mining result. The algorithm is therefore improved: the improved FP-Growth algorithm can mine alarm association rules in a short time, and redundancy removal eliminates 90% of the redundant results while keeping the valid ones. This greatly improves the mining efficiency of the algorithm and makes later review of the association rules easier.
(4) Text similarity algorithm
There are many methods for calculating text similarity. Here a cosine similarity method is adopted and improved by introducing word weights w_{i,j} into the cosine vector model. The improved cosine model is as follows:
The original cosine similarity algorithm is:
Figure BDA0002455539590000081
After w_{i,j} is introduced, it becomes:
Figure BDA0002455539590000082
The similarity of vectors T1 and T2 in three-dimensional space can be measured by the angle between them. When the cosine of the angle between T1 and T2 is 1, the similarity reaches its maximum value of 1: the vectors point in the same direction and the texts are most likely to be similar. When the cosine is 0, the similarity reaches its minimum value of 0: the vectors are orthogonal and the texts are least likely to be similar. The cosine value thus lies in [0, 1] and expresses the similarity between different texts.
Figure BDA0002455539590000091
According to the well-known "sugar water inequality" in mathematics, we have:
Figure BDA0002455539590000092
Therefore, the improved cosine similarity model can raise the accuracy of text similarity calculation.
Introducing word weights into the cosine vector algorithm greatly improves the accuracy of text similarity calculation.
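Since the formula images are not reproduced here, the following is only a sketch of one plausible weighted cosine similarity, together with the Top-k lookup used for recommendation. It assumes each term's frequency is multiplied by a per-term weight before the cosine is taken; that weighting scheme, the weight values, the knowledge base layout and all names are assumptions, not the patented formula.

```python
import math
from collections import Counter


def weighted_cosine(tokens_a, tokens_b, weights=None, default_weight=1.0):
    """Cosine similarity of two token lists with per-term weights (assumed form)."""
    weights = weights or {}
    va, vb = Counter(tokens_a), Counter(tokens_b)

    def w(term, count):
        return count * weights.get(term, default_weight)

    dot = sum(w(t, va[t]) * w(t, vb[t]) for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(w(t, c) ** 2 for t, c in va.items()))
    norm_b = math.sqrt(sum(w(t, c) ** 2 for t, c in vb.items()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_solutions(query_tokens, knowledge_base, weights=None, k=5):
    """Return up to k (score, solution) pairs, best match first."""
    scored = [
        (weighted_cosine(query_tokens, entry["tokens"], weights), entry["solution"])
        for entry in knowledge_base
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical knowledge base and weights, for illustration only.
knowledge_base = [
    {"tokens": ["disk", "full"], "solution": "clean up disk space"},
    {"tokens": ["on", "server"], "solution": "restart the server"},
]
weights = {"disk": 3.0, "full": 3.0}
query = ["disk", "full", "on", "server"]
ranked = top_k_solutions(query, knowledge_base, weights, k=5)
```

Without weights both entries score the same against this query; giving the informative terms a higher weight moves the disk-related solution to the top.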
The method for realizing alarm management control based on artificial intelligence overcomes the drawbacks of manually set fixed thresholds and greatly reduces the operation and maintenance workload, the missed-alarm rate, and the false-alarm rate. The system effectively suppresses alarm storms and provides unified management and control of alarm messages, which reduces the interference of massive numbers of alarms with operation and maintenance personnel and improves problem-solving efficiency. Knowledge sharing is achieved, service processes are shortened, and operation and maintenance costs are reduced. The operation and maintenance response speed and the service quality are improved. Knowledge loss is avoided, which helps the enterprise retain its expertise.
In this specification, the invention has been described with reference to specific embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Claims (5)

1. A method for realizing alarm management control based on artificial intelligence is characterized by comprising the following steps:
(1) performing dynamic baseline alarming: automatically learning the operating trend of the service from historical data, establishing a safe region for service operation according to the upward and downward fluctuation of the service, and monitoring abnormal fluctuation of the service in different time periods;
(2) performing alarm rule mining: analyzing and mining historical data, and incrementally updating and supplementing a rule database by mining rules periodically;
(3) performing alarm analysis processing: processing the current alarms online and in real time;
(4) associating a fault knowledge base: pushing solutions and experience through similarity-based associated recommendation of problem events.
2. The method for realizing alarm management control based on artificial intelligence according to claim 1, wherein the step (1) specifically comprises the following steps:
(1.1) exporting a historical index data file from a business system;
(1.2) preprocessing the data: reading the historical index data, checking the validity of all data, and filtering out invalid data;
(1.3) identifying the trend and periodic variation of the historical index data through the model's learning of the historical data, predicting the change of the index over a future period, and giving the upper and lower limits for that period according to the distribution of the historical data;
(1.4) judging whether the index to be detected is above the upper limit or below the lower limit of the baseline; if so, judging that an abnormality has occurred; otherwise, judging that no abnormality has occurred.
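Steps (1.2) to (1.4) can be sketched as follows. The claim does not name the forecasting model, so this sketch substitutes a deliberately simple stand-in: a per-hour-of-day mean with a band of k standard deviations. The function names, the choice k = 3, and the sample values are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, Iterable, Optional, Tuple

Band = Tuple[float, float]  # (lower limit, upper limit)


def learn_baseline(history: Iterable[Tuple[int, Optional[float]]],
                   k: float = 3.0) -> Dict[int, Band]:
    """Learn a (lower, upper) band for each hour of the day.

    history holds (hour_of_day, value) samples. Stand-in model (assumed):
    band = mean +/- k * population standard deviation of that hour.
    """
    by_hour = defaultdict(list)
    for hour, value in history:
        # Step (1.2): filter out invalid samples (missing, NaN, bad hour).
        if value is None or value != value or not 0 <= hour <= 23:
            continue
        by_hour[hour].append(value)
    baseline = {}
    for hour, values in by_hour.items():
        mu, sigma = mean(values), pstdev(values)
        baseline[hour] = (mu - k * sigma, mu + k * sigma)  # step (1.3)
    return baseline


def is_abnormal(baseline: Dict[int, Band], hour: int, value: float) -> bool:
    """Step (1.4): abnormal if outside the band learned for that hour.

    Raises KeyError if no baseline was learned for the given hour.
    """
    lower, upper = baseline[hour]
    return value > upper or value < lower


history = [(9, 100.0), (9, 110.0), (9, 90.0),
           (3, 10.0), (3, 12.0), (3, 8.0), (3, None)]
baseline = learn_baseline(history)
```

The same reading of 100 is normal at 09:00 but abnormal at 03:00, which is the behavior a fixed threshold cannot provide.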
3. The method for realizing alarm management control based on artificial intelligence according to claim 1, wherein the step (2) specifically comprises the following steps:
(2.1) exporting a historical alarm data file from the service system;
(2.2) preprocessing the data: reading the historical alarm data, checking the validity of all data, filtering out invalid data, encoding the alarm data, and importing the encoded alarm data into an alarm database;
(2.3) clustering the data: extracting the data required for clustering from the alarm database, and partitioning the data by time domain and geographical location;
(2.4) performing rule mining: obtaining the clustering results, extracting alarm data from the alarm database, and performing association analysis on each cluster of alarm data to mine rules;
and (2.5) importing the mined rules into an alarm database to screen effective rules.
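Steps (2.3) to (2.5) can be illustrated with a small sketch. It is not the improved FP-Growth algorithm of the description: for brevity it counts pairwise co-occurrences within each cluster and keeps pairs that pass support and confidence thresholds. The alarm tuple layout, the 300-second window, the thresholds, and the alarm codes are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations
from typing import Dict, List, Set, Tuple

Alarm = Tuple[float, str, str]  # (timestamp in seconds, site, alarm code)


def cluster_alarms(alarms: List[Alarm], window: float = 300.0) -> List[Set[str]]:
    """Step (2.3): group alarm codes by site and fixed time window."""
    clusters: Dict[Tuple[str, int], Set[str]] = defaultdict(set)
    for ts, site, code in alarms:
        clusters[(site, int(ts // window))].add(code)
    return list(clusters.values())


def mine_pair_rules(clusters: List[Set[str]], min_support: int = 2,
                    min_confidence: float = 0.8) -> Dict[Tuple[str, str], float]:
    """Steps (2.4)-(2.5): derive rules a -> b and keep only the effective ones.

    support(a -> b)    = number of clusters containing both a and b
    confidence(a -> b) = support(a -> b) / number of clusters containing a
    """
    item_count: Counter = Counter()
    pair_count: Counter = Counter()
    for codes in clusters:
        item_count.update(codes)
        pair_count.update(permutations(sorted(codes), 2))  # ordered pairs
    rules = {}
    for (a, b), n in pair_count.items():
        confidence = n / item_count[a]
        if n >= min_support and confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules


alarms = [
    (0.0, "sh", "LINK_DOWN"), (10.0, "sh", "BGP_DOWN"),
    (600.0, "sh", "LINK_DOWN"), (620.0, "sh", "BGP_DOWN"),
    (1300.0, "bj", "BGP_DOWN"), (1310.0, "bj", "CPU_HIGH"),
]
rules = mine_pair_rules(cluster_alarms(alarms))
```

Here only LINK_DOWN -> BGP_DOWN survives: BGP_DOWN also occurs without LINK_DOWN, so the reverse rule has a confidence of 2/3 and is screened out.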
4. The method for realizing alarm management control based on artificial intelligence according to claim 1, wherein the step (3) specifically comprises the following steps:
(3.1) transmitting the data into a data interface through a background interface;
(3.2) reading the current alarm data through the data interface, and importing it into an alarm database after corresponding processing;
(3.3) clustering the data: extracting from the alarm database the key fields needed to analyze the alarm data, clustering the current alarm data accordingly, and partitioning the data by time domain and geographical location;
and (3.4) performing alarm processing analysis to obtain a clustering result, extracting alarm data from an alarm database, traversing all alarm rules from a rule database, and performing rule engine matching analysis on each cluster of alarm data to obtain a root alarm and realize alarm compression.
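The rule matching in step (3.4) can be sketched as below. The claim does not describe the rule engine, so this sketch assumes the simplest possible one: rules are (cause, effect) pairs, and an alarm in a cluster is a root alarm if no other alarm in the same cluster explains it. The function name and alarm codes are hypothetical.

```python
from typing import List, Set, Tuple


def find_root_alarms(cluster: Set[str],
                     rules: List[Tuple[str, str]]) -> Set[str]:
    """Return the alarms in a cluster that are not explained by another alarm.

    rules is a list of (cause, effect) pairs from the rule database. An alarm
    counts as derived when some rule names it as the effect of a cause that
    is also present in the cluster.
    """
    derived = {effect for cause, effect in rules
               if cause in cluster and effect in cluster}
    roots = cluster - derived
    # If cyclic rules mark every alarm as derived, report the whole cluster
    # rather than hiding all alarms.
    return roots or set(cluster)


cluster = {"LINK_DOWN", "BGP_DOWN", "SERVICE_UNREACHABLE"}
rules = [("LINK_DOWN", "BGP_DOWN"), ("BGP_DOWN", "SERVICE_UNREACHABLE")]
roots = find_root_alarms(cluster, rules)
```

Three alarms are compressed to the single root alarm LINK_DOWN, which is the only one that needs to reach the operator.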
5. The method for realizing alarm management control based on artificial intelligence according to claim 1, wherein the step (4) specifically comprises the following steps:
searching the fault knowledge base, according to the root alarm, for the solution with the highest matching degree by means of a text similarity algorithm.
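A minimal sketch of this lookup follows. It uses the plain (unweighted) cosine similarity over whitespace tokens; the knowledge base layout as (problem description, solution) pairs, the `min_score` cutoff, and the sample entries are assumptions made for the example, not details from the claim.

```python
import math
from collections import Counter
from typing import List, Optional, Tuple


def cosine(a: str, b: str) -> float:
    """Plain cosine similarity over lowercase whitespace tokens."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def best_solution(root_alarm: str,
                  knowledge_base: List[Tuple[str, str]],
                  min_score: float = 0.3) -> Optional[str]:
    """Return the solution whose problem text best matches the root alarm.

    min_score is a hypothetical cutoff so that an unrelated alarm returns
    None instead of an arbitrary entry.
    """
    scored = [(cosine(root_alarm, problem), solution)
              for problem, solution in knowledge_base]
    if not scored:
        return None
    score, solution = max(scored, key=lambda pair: pair[0])
    return solution if score >= min_score else None


knowledge_base = [
    ("database connection timeout",
     "Check the connection pool and restart the listener"),
    ("disk usage above threshold",
     "Clean old logs or extend the volume"),
]
hit = best_solution("connection timeout on database server", knowledge_base)
miss = best_solution("fan speed alarm", knowledge_base)
```

The first query shares three words with the database entry and returns its solution; the second shares no words with any entry and returns None.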
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010305223.4A CN111506478A (en) 2020-04-17 2020-04-17 Method for realizing alarm management control based on artificial intelligence

Publications (1)

Publication Number Publication Date
CN111506478A true CN111506478A (en) 2020-08-07

Family

ID=71864101

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010305223.4A Pending CN111506478A (en) 2020-04-17 2020-04-17 Method for realizing alarm management control based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN111506478A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination