CN111338915B - Dynamic alarm grading method and device, electronic equipment and storage medium - Google Patents

Publication number
CN111338915B
Authority
CN
China
Prior art keywords
alarm
data
online
index
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010411127.8A
Other languages
Chinese (zh)
Other versions
CN111338915A (en
Inventor
赵能文
刘大鹏
隋楷心
张文池
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Bishi Technology Co ltd
Original Assignee
Beijing Bishi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Bishi Technology Co ltd filed Critical Beijing Bishi Technology Co ltd
Priority to CN202010411127.8A priority Critical patent/CN111338915B/en
Publication of CN111338915A publication Critical patent/CN111338915A/en
Application granted granted Critical
Publication of CN111338915B publication Critical patent/CN111338915B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/32Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324Display of status information
    • G06F11/327Alarm or error message display
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Alarm Systems (AREA)

Abstract

The invention discloses a dynamic alarm grading method and device, electronic equipment and a computer-readable storage medium. The method comprises the following steps: training a ranking model with the historical data of the alarm to obtain a trained model; and ranking the online data of the alarm with the trained model to obtain an alarm rating. The invention is the first to model the dynamic alarm grading problem as a machine-learning ranking problem. Based on the trained model, it rates alarm severity online and adaptively with high accuracy, so that an engineer can handle severe alarms first according to the alarm rating, which improves troubleshooting efficiency.

Description

Dynamic alarm grading method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of system alarms, and in particular, to a dynamic alarm ranking method, apparatus, electronic device, and computer-readable storage medium.
Background
A large online service system consists of many components to support a large number of concurrent users. In order to ensure the service quality and the user experience, various monitoring data such as indexes, logs, call chains and the like need to be collected from each component, and a plurality of alarm rules are manually set, so that an alarm is generated once the monitoring data violates the alarm rules (for example, the CPU utilization rate exceeds 80%, a fail keyword appears in a log file and the like), and is sent to an engineer for checking. If the alarm is severe, the engineer may create a work order for troubleshooting and diagnosis. The alarm data may contain a number of attributes such as alarm time, alarm content, alarm type, alarm source system, alarm source machine, alarm level, and alarm off time.
Due to the complexity and dynamics of online services, the system may generate a large number of concurrent alarms, beyond what engineers can process. In practice, therefore, grading rules are often defined manually to assign alarms to different priorities (e.g., P1-error, P2-warning, P3-info; CPU utilization above 90% is P1 and above 70% is P2). Engineers mainly attend to the highest-level alarms, i.e. critical alarms. Even so, the number of critical alarms triggered is still large. In addition, manually defined and maintained rules are hard to keep to a unified standard and consume manpower, and rule-based alarm grading has low accuracy and cannot adapt to dynamic changes of the system.
Disclosure of Invention
In order to solve the above technical problems, embodiments of the present invention provide an accurate and adaptive alarm ranking algorithm, which can automatically rank the severity of a large number of concurrent alarms, and preferentially recommend the severe alarms to an engineer, thereby helping the engineer to quickly find a potential fault and reduce fault repairing time.
One aspect of the present invention provides a dynamic alarm grading method, which includes the following steps:
training a ranking model with the historical data of the alarm to obtain a trained model; and
ranking the online data of the alarm with the trained model to obtain an alarm rating.
Optionally, the historical data includes work orders, alarm data and index data;
the step of training the ranking model with the historical data of the alarm to obtain the trained model comprises:
extracting labels from the work orders;
extracting alarm features from the alarm data, extracting index features from the index data, and combining the alarm features and the index features to obtain a feature vector; and
inputting the labels and the feature vectors into the ranking model, and training the ranking model to obtain the trained model.
Optionally, the online data includes online alarm data and online index data;
the step of ranking the online data of the alarm with the trained model to obtain the alarm rating comprises:
extracting alarm features from the online alarm data, extracting index features from the online index data, and combining the alarm features and the index features to obtain an online feature vector; and
inputting the online feature vector into the trained model to obtain the alarm rating.
Optionally, the alarm features include at least one of the following: text features, text entropy, and timing features; wherein the text features are obtained using a Biterm Topic Model (BTM); the text entropy is calculated using Inverse Document Frequency (IDF); the timing features include the alarm frequency, the alarm period, the number of alarms per unit time, or the alarm interval time;
the index features are obtained using a multi-time-series anomaly detection algorithm based on a Long Short-Term Memory (LSTM) network.
In another aspect of the present invention, there is provided a dynamic alarm rating device, including:
an offline training module, configured to train a ranking model with the historical data of the alarm to obtain a trained model; and
an online ranking module, configured to rank the online data of the alarm with the trained model to obtain an alarm rating.
Optionally, the historical data includes work orders, alarm data and index data;
the offline training module comprises:
a label extraction module, configured to extract labels from the work orders;
a feature vector extraction module, configured to extract alarm features from the alarm data, extract index features from the index data, and combine the alarm features and the index features to obtain a feature vector; and
a model training module, configured to input the labels and the feature vectors into the ranking model and train the ranking model to obtain the trained model.
Optionally, the online data includes online alarm data and online index data;
the online ranking module comprises:
an online feature vector extraction module, configured to extract alarm features from the online alarm data, extract index features from the online index data, and combine the alarm features and the index features to obtain an online feature vector; and
a rating module, configured to input the online feature vector into the trained model to obtain the alarm rating.
Optionally, the alarm features include at least one of the following: text features, text entropy, and timing features; wherein the text features are obtained using a Biterm Topic Model (BTM); the text entropy is calculated using Inverse Document Frequency (IDF); the timing features include the alarm frequency, the alarm period, the number of alarms per unit time, or the alarm interval time;
the index features are obtained using a multi-time-series anomaly detection algorithm based on a Long Short-Term Memory (LSTM) network.
Another aspect of the present invention is to provide an electronic device, including:
at least one processor; and
a memory coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to implement the method of the present invention.
Another aspect of the present invention is to provide a computer-readable storage medium, in which a computer program is stored, which, when executed, is capable of implementing the method of the present invention.
According to the method, the work orders and alarm handling records in the historical data are used to automatically label each historical alarm with a severity score, and, following the ideas of data fusion and feature fusion, a series of interpretable and physically meaningful features are extracted from the alarm data and the index data to characterize the severity of the alarm. The invention is the first to model the dynamic alarm grading problem as a machine-learning ranking problem. Based on the trained model, it rates alarm severity online and adaptively with high accuracy, so that an engineer can handle severe alarms first according to the alarm rating, which improves troubleshooting efficiency.
Drawings
FIG. 1 is a flow chart of a dynamic alarm ranking method in an embodiment of the invention;
FIG. 2 is a flow chart of a dynamic alarm ranking method in an embodiment of the invention;
FIG. 3 is a flowchart illustrating the step S1 of the dynamic alarm ranking method in accordance with an embodiment of the present invention;
FIG. 4 is a flowchart illustrating the step S2 of the dynamic alarm ranking method in accordance with an embodiment of the present invention;
FIG. 5a is a bar graph of the alarm severity score (Severity score) for each of 14 topics (Topic) in an embodiment of the present invention;
FIG. 5b is a line graph of the alarm severity score (Severity score) against text entropy (Entropy) in an embodiment of the present invention;
FIG. 5c is a line graph of the alarm severity score (Severity score) against the business system anomaly score (Multivariate error of business KPIs) and the machine anomaly score (Multivariate error of server KPIs) in an embodiment of the present invention;
FIG. 6 is a block diagram of a dynamic alarm rating device in an embodiment of the present invention;
FIG. 7 is a block diagram of an offline training module in an embodiment of the present invention;
FIG. 8 is a block diagram of an online ranking module in an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, specific embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
As shown in FIG. 1, the technical solution of the present invention is divided into two stages: offline training and online ranking. In the offline training stage, because a work order is generally created for a severe alarm and historical alarm handling records are stored, the severity score of each historical alarm can be obtained from the work orders in the historical data and used as a label. Following the ideas of data fusion and feature fusion, features are extracted separately from the alarm data and the index data of the historical data and combined into feature vectors, which characterize how abnormal the alarm is. The ranking model is then trained on the labels and feature vectors to obtain a trained model. In the online ranking stage, the feature vectors of alarm data arriving in real time are obtained with the same feature extraction method and input into the trained ranking model, which outputs a severity ranking of the large number of concurrent alarms at the current moment, so that engineers can be guided to handle the more severe alarms first.
According to one aspect of the invention, a dynamic alarm ranking method is provided.
As shown in FIG. 2, the method specifically includes the following steps:
S1: training a ranking model with the historical data of the alarm to obtain a trained model;
S2: ranking the online data of the alarm with the trained model to obtain an alarm rating.
In one embodiment, the historical data of the alarm may include work orders, alarm data, index data, and the like, and the online data of the alarm may include online alarm data, online index data, and the like.
As shown in FIG. 3, step S1 further includes:
S101: extracting labels from the work orders;
S102: extracting alarm features from the alarm data, extracting index features from the index data, and combining the alarm features and the index features to obtain a feature vector; and
S103: inputting the labels and the feature vectors into the ranking model, and training the ranking model to obtain the trained model.
As shown in FIG. 4, step S2 further includes:
S201: extracting alarm features from the online alarm data, extracting index features from the online index data, and combining the alarm features and the index features to obtain an online feature vector; and
S202: inputting the online feature vector into the trained model to obtain the alarm rating.
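The two stages of steps S1 and S2 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the text does not name a specific ranking model, so a pointwise linear scorer fitted by gradient descent stands in for it, and the function names (`combine`, `train_ranker`, `rank_alerts`) and the toy numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def combine(alarm_features: Sequence[float], index_features: Sequence[float]) -> List[float]:
    """S102 / S201: concatenate alarm features and index features into one vector."""
    return list(alarm_features) + list(index_features)


def predict(weights: Sequence[float], x: Sequence[float]) -> float:
    """Linear severity score; the last weight is the bias term."""
    return sum(w * v for w, v in zip(weights, x)) + weights[-1]


def train_ranker(vectors: List[List[float]], labels: List[float],
                 epochs: int = 2000, lr: float = 0.1) -> List[float]:
    """S103: fit the scorer on labeled historical alarms.

    Assumption: a pointwise least-squares fit is used only as a stand-in
    for the ranking model, which the text leaves unspecified.
    """
    dim = len(vectors[0])
    weights = [0.0] * (dim + 1)
    n = len(vectors)
    for _ in range(epochs):
        grad = [0.0] * (dim + 1)
        for x, y in zip(vectors, labels):
            err = predict(weights, x) - y
            for j in range(dim):
                grad[j] += err * x[j]
            grad[-1] += err
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


def rank_alerts(weights: Sequence[float],
                online: List[Tuple[str, List[float]]]) -> List[Tuple[str, float]]:
    """S202: score each online alarm and sort from most to least severe."""
    scored = [(alarm_id, predict(weights, x)) for alarm_id, x in online]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy history: one alarm feature and one index feature per alarm (made-up values).
history = [combine([0.1], [0.0]), combine([0.2], [0.1]),
           combine([0.8], [0.7]), combine([0.9], [1.0])]
labels = [0.0, 0.2, 0.8, 1.0]
model = train_ranker(history, labels)
ranking = rank_alerts(model, [("disk_warning", combine([0.1], [0.1])),
                              ("db_down", combine([0.9], [0.9]))])
```

The same `combine` function is used offline and online, which mirrors the requirement that online feature vectors be produced by the same feature extraction as the training vectors.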
How the labels, the alarm features and the index features are obtained is explained in detail below.
First, labels
The severity score of the historical alarm data is labeled from the work orders in the historical data. Generally, for a historical alarm, engineers review the alarm, contact the responsible administrator to handle it, and then write a handling record; for a severe alarm, a work order is created and then followed up and reviewed by the person in charge. Because the handling records of work orders are filled in manually, they reflect the severity of an alarm more truthfully and reliably. The alarm handling records of the work orders can be clustered with a text clustering method based on TF-IDF and k-means, and the alarms in each resulting cluster are given a uniform severity score, which serves as the label. Several categories of alarm handling records and their severity scores are listed below, where 1 is the highest severity and 0 is the lowest. For example, if a work order was created for an alarm, the engineer judged the alarm to be highly severe; if an alarm is on the white list or has no handling record, its severity is low.
First, None (0)
Second, alarm in white list (0.1)
Third, the alarm has been automatically restored (0.2)
Fourth, contact the application manager, confirm the impact on the business (0.4)
Fifth, for known reasons, the alarm is fixed (0.6)
Sixth, contact the application manager, having an impact on the business, is now repaired (0.8)
Seventh, create event ticket, continue follow-up (1)
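The labeling step can be sketched as a lookup from handling-record category to severity score. This is a minimal sketch under stated assumptions: the cluster assignments are taken as given (in the text they come from TF-IDF and k-means clustering, which is not reproduced here), the scores are the seven values listed above, and the category keys and the function name `label_alarms` are hypothetical.

```python
from typing import Dict

# Severity score per handling-record category, copied from the list above.
# The key names are hypothetical shorthand for the seven categories.
SEVERITY_BY_CATEGORY: Dict[str, float] = {
    "none": 0.0,
    "whitelisted": 0.1,
    "auto_recovered": 0.2,
    "manager_contacted_impact_checked": 0.4,
    "known_reason_fixed": 0.6,
    "manager_contacted_impact_repaired": 0.8,
    "ticket_created": 1.0,
}


def label_alarms(cluster_of_alarm: Dict[str, int],
                 category_of_cluster: Dict[int, str]) -> Dict[str, float]:
    """Give every alarm the uniform severity score of its cluster.

    cluster_of_alarm: alarm id -> cluster id (assumed output of text clustering).
    category_of_cluster: cluster id -> category key in SEVERITY_BY_CATEGORY.
    """
    return {
        alarm_id: SEVERITY_BY_CATEGORY[category_of_cluster[cluster_id]]
        for alarm_id, cluster_id in cluster_of_alarm.items()
    }


labels = label_alarms({"a1": 0, "a2": 1, "a3": 0},
                      {0: "ticket_created", 1: "whitelisted"})
```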
Second, alarm features
Before the alarm features are extracted, the alarm data needs some preprocessing. If the alarm data is text that mixes Chinese and English, the alarm content needs to be segmented with a Chinese word segmentation tool. In addition, alarm content is generally semi-structured text that contains many stop words, symbols and variables. Meaningless stop words and symbols are therefore removed, and the alarm data is then processed with a log template extraction method to filter out the variables and obtain the alarm template. The following alarm features may then be extracted:
First, topic features
The alarm template can be regarded as a short text in the operation and maintenance field. Different alarm contents carry different semantic information, and different semantics often correspond to different severities. Therefore, the invention applies topic models, which are popular in natural language processing, to alarm data for the first time in order to extract hidden semantic features from the alarm content. Because alarm content is short text, the conventional LDA (Latent Dirichlet Allocation) topic model is not ideal to use directly, so a Biterm Topic Model (BTM) designed for short text may be used. Preprocessing the alarm data overcomes some of its problems (such as mixed Chinese and English, stop words and the like), so that the BTM can mine interpretable semantic information. Given a topic number n, the BTM finds hidden topics and the keywords corresponding to each topic, and the optimal number of topics is selected according to the coherence score. The topics learned by the BTM from actual alarm data and the corresponding keywords (n = 14) are shown below.
T # 1: oracle, connection, database, space, pool, process, lock.
T # 2: syslog, alarm, error, stack, record, hardware, alarm.
T # 3: monitoring, environment, host, battery, humidity, machine room, voltage.
...
T # 13: system, transaction amount, response, threshold, time, traffic, value.
T # 14: switch, virtual, communication, connection, response, ping, network.
It can be seen that the keywords given by the BTM are physically meaningful and interpretable; for example, topic 1 (T # 1) can be presumed to relate to database alarms, and topic 2 (T # 2) to system log alarms. For a piece of alarm information, the BTM gives the probability that it belongs to each topic, and these per-topic probabilities form the topic feature of that alarm. For example, for the alarm "oracle table space usage reaches 78% and exceeds the threshold", the BTM gives a topic feature that is a vector of length 14: [0.78,0.05,0.02, …,0.19,0.04 ].
FIG. 5a shows the alarm severity scores (Severity score) for the 14 topics (Topic) described above. It can be seen that the topic features are useful for distinguishing the severity of an alarm.
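A BTM learns topics from biterms, that is, unordered pairs of words that co-occur in the same short text. The sketch below shows only this biterm extraction step on already tokenized alarm templates; it does not train a topic model, and the function names and sample tokens are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List, Tuple

Biterm = Tuple[str, str]


def extract_biterms(tokens: List[str]) -> List[Biterm]:
    """Return the unordered word pairs that co-occur in one alarm template."""
    return [
        tuple(sorted(pair))
        for pair in combinations(tokens, 2)
        if pair[0] != pair[1]  # skip pairs made of the same word twice
    ]


def biterm_counts(templates: Iterable[List[str]]) -> Counter:
    """Count biterms over a corpus of tokenized alarm templates."""
    counts: Counter = Counter()
    for tokens in templates:
        counts.update(extract_biterms(tokens))
    return counts


pairs = extract_biterms(["oracle", "table", "space"])
counts = biterm_counts([["oracle", "space"], ["space", "oracle"], ["ping", "network"]])
```

Sorting each pair makes ("space", "oracle") and ("oracle", "space") count as the same biterm, which is what makes the pairs unordered.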
Second, text entropy
The alarm content is often a combination of words, different words having different weights in identifying serious alarms, e.g. "break" may be more serious than "port". Therefore, the text entropy of the alarm data can be extracted to measure the severity degree of the alarm. In the text mining technology, Inverse Document Frequency (IDF) is a commonly used method for measuring the importance of words, and the method can reduce the weight of commonly used words and increase the weight of rare words. For the word ω, its word entropy is calculated:
$$\mathrm{IDF}(\omega) = \log \frac{N}{N_\omega}$$
where N is the number of all alarms and N_ω is the number of alarms containing the word ω. By this measure, words that often appear in alarms have a lower word entropy and thus indicate a lower severity. The text entropy of an alarm is calculated as the average of the word entropies of all the words it contains.
FIG. 5b illustrates the alarm severity score (Severity score) against text entropy (Entropy). It can be seen that as the value of text entropy increases, the corresponding alarm severity score also tends to increase.
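The text entropy computation can be sketched as follows, assuming the usual IDF form log(N / N_ω) with N the number of alarms and N_ω the number of alarms containing the word ω. The function names and the sample alarms are hypothetical, and skipping words that never appeared in the history is an assumption of this sketch.

```python
import math
from typing import Dict, List


def word_entropy(alarms: List[List[str]]) -> Dict[str, float]:
    """IDF per word: log(N / N_w) over a corpus of tokenized alarms."""
    n = len(alarms)
    doc_freq: Dict[str, int] = {}
    for tokens in alarms:
        for word in set(tokens):  # count each word once per alarm
            doc_freq[word] = doc_freq.get(word, 0) + 1
    return {word: math.log(n / df) for word, df in doc_freq.items()}


def text_entropy(tokens: List[str], idf: Dict[str, float]) -> float:
    """Average word entropy of one alarm; unseen words are skipped (assumption)."""
    values = [idf[word] for word in tokens if word in idf]
    return sum(values) / len(values) if values else 0.0


corpus = [["port", "down"], ["port", "up"], ["port", "break"], ["port", "up"]]
idf = word_entropy(corpus)
rare = text_entropy(["port", "break"], idf)
common = text_entropy(["port", "up"], idf)
```

In this toy corpus "port" appears in every alarm and gets entropy 0, so the alarm containing the rare word "break" ends up with the higher text entropy.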
Third, timing characteristics
The timing characteristics may include the frequency of alarm occurrences, the period, the number of alarms, the interval time, etc.
Frequency: generally, the more frequently an alarm has occurred historically (e.g., CPU usage exceeding a threshold), the less severe it is; conversely, if an alarm has historically been infrequent (a low-frequency alarm, such as a server going down), its severity may be relatively high and it requires attention from engineers.
Period: some alarms occur periodically; for example, batch-processing tasks that run every night cause high CPU utilization alarms, which are typically redundant. For an alarm a, we can obtain the time series C(a) = {c_1(a), c_2(a), …, c_k(a)}, where c_k(a) is the number of times the alarm occurred in the k-th time slice. Clearly, if the alarm a is periodic, the corresponding time series C(a) is also periodic.
The periodicity of the time series is characterized by the auto-correlation function (ACF). Given a time series x(i) of length N and a lag l, the following is calculated:
$$\mathrm{ACF}(l) = \frac{1}{N}\sum_{i=0}^{N-l-1} x(i)\,x(i+l)$$
If the time series is periodic with period T, ACF(l) shows sharp peaks at l = T, 2T, 3T, …. The maximum value of ACF(l) can therefore be used to characterize the periodicity of the alarm.
Alarm count: generally, a large number of alarms occurring in a short time means that a serious fault may have occurred, which requires the attention of an engineer.
Interval time: the interval time is the time between the current alarm and the previous alarm. An alarm that suddenly appears after a long period without alarms often requires the attention of an engineer.
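The timing features above can be sketched as follows. Several choices here are assumptions rather than details from the text: the autocorrelation is the common mean-centered form normalized by the lag-0 value, lags are searched up to half the series length, the interval is measured to the previous alarm of any type, and the 300-second window, the function names and the sample values are hypothetical.

```python
from typing import Dict, List, Optional


def acf(series: List[float], lag: int) -> float:
    """Normalized sample autocorrelation of a mean-centered series at one lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    denom = sum(v * v for v in centered)
    if denom == 0:
        return 0.0  # a constant series has no periodic structure
    return sum(centered[i] * centered[i + lag] for i in range(n - lag)) / denom


def periodicity(series: List[float]) -> float:
    """Maximum autocorrelation over lags 1..N/2, used as the period feature."""
    max_lag = len(series) // 2
    if max_lag < 1:
        return 0.0
    return max(acf(series, lag) for lag in range(1, max_lag + 1))


def timing_features(same_type_times: List[float], all_times: List[float],
                    now: float, window: float = 300.0) -> Dict[str, Optional[float]]:
    """Frequency, recent alarm count and interval for an alarm raised at `now`."""
    frequency = sum(1 for t in same_type_times if t < now)
    earlier = [t for t in all_times if t < now]
    recent_count = sum(1 for t in earlier if t >= now - window)
    interval = now - max(earlier) if earlier else None
    return {"frequency": frequency, "recent_count": recent_count, "interval": interval}


# Per-slice counts of an alarm that fires every third time slice.
counts = [0, 0, 5, 0, 0, 5, 0, 0, 5, 0, 0, 5]
period_score = periodicity(counts)
features = timing_features([10, 50], [10, 50, 900, 950], now=1000)
```

For the sample series the autocorrelation peaks at lag 3, the true period, which is the behavior the period feature relies on.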
Fourth, other features
In addition to the above-mentioned feature types, some key features may also be extracted from some other attributes of the alarm data itself, such as:
Rule-based severity: the severity assigned to the alarm data by rules has a certain reference value for alarm grading;
Alarm time: the time at which an alarm occurs also affects its severity; for example, alarms during peak business periods are relatively more important. A series of time features can be extracted from the time at which the alarm is raised, such as whether it falls on a workday or a holiday, in the day or at night, or within a peak business period;
Alarm type: generally, application-class alarms are important because they are strongly related to the quality of service.
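The time features of an alarm can be sketched as below. The text names the features (workday or holiday, day or night, peak business period) but not their boundaries, so the night hours, the peak hours, the holiday set and the function name `time_features` are all hypothetical.

```python
from datetime import date, datetime
from typing import Dict, FrozenSet, Iterable


def time_features(ts: datetime,
                  peak_hours: Iterable[int] = range(9, 18),
                  holidays: FrozenSet[date] = frozenset()) -> Dict[str, bool]:
    """Boolean time features for an alarm raised at `ts`.

    Assumptions: Saturday and Sunday count as non-working days, night is
    22:00 to 06:00, and the peak business period is 09:00 to 18:00.
    """
    is_day_off = ts.weekday() >= 5 or ts.date() in holidays
    return {
        "is_workday": not is_day_off,
        "is_night": ts.hour >= 22 or ts.hour < 6,
        "is_peak": ts.hour in peak_hours,
    }


friday_morning = time_features(datetime(2020, 5, 15, 10, 0))
saturday_night = time_features(datetime(2020, 5, 16, 23, 0))
holiday = time_features(datetime(2020, 5, 15, 10, 0),
                        holidays=frozenset({date(2020, 5, 15)}))
```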
Third, index features
Some key business indexes can directly reflect the health state of the system. When an alarm occurs, if the key service indexes (such as transaction amount, response time, success rate and the like) of the corresponding service system and the key indexes (such as CPU utilization rate, memory utilization rate and the like) of the machine are greatly abnormal, the alarm is a serious alarm with high probability. A Long Short Term Memory (LSTM) network-based multi-time sequence anomaly detection algorithm may be employed to capture the anomaly scores corresponding to each key business index and machine index and the anomaly scores of the overall business system as the index features.
FIG. 5c shows the alarm severity scores (Severity score) against the business system anomaly score (Multivariate error of business KPIs) and the machine anomaly score (Multivariate error of server KPIs). It can be seen that the higher the anomaly score of the system, the higher the severity of the alarm.
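The idea of turning key indexes into anomaly scores can be sketched with a prediction-error score. Note that this sketch deliberately swaps in a moving-average forecaster for the LSTM network described above, only to keep the example short and dependency-free; the overall score (root mean square of the per-index scores), the window size, the function names and the sample values are likewise assumptions.

```python
import math
from typing import Dict, List, Tuple


def anomaly_score(history: List[float], current: float, window: int = 5) -> float:
    """Prediction error of one index, scaled by the spread of recent values.

    Stand-in forecaster: the mean of the last `window` points (not an LSTM).
    """
    recent = history[-window:]
    forecast = sum(recent) / len(recent)
    variance = sum((v - forecast) ** 2 for v in recent) / len(recent)
    return abs(current - forecast) / (math.sqrt(variance) + 1e-9)


def index_features(histories: Dict[str, List[float]],
                   current: Dict[str, float],
                   window: int = 5) -> Tuple[Dict[str, float], float]:
    """Per-index anomaly scores plus one overall score for the system."""
    scores = {name: anomaly_score(values, current[name], window)
              for name, values in histories.items()}
    if not scores:
        return scores, 0.0
    overall = math.sqrt(sum(s * s for s in scores.values()) / len(scores))
    return scores, overall


histories = {"response_time": [10, 11, 9, 10, 10], "cpu_usage": [40, 42, 38, 40, 40]}
normal_scores, normal_overall = index_features(
    histories, {"response_time": 10, "cpu_usage": 40})
spike_scores, spike_overall = index_features(
    histories, {"response_time": 20, "cpu_usage": 40})
```

Whatever forecaster is used, the resulting per-index and overall scores are what get concatenated with the alarm features to form the feature vector.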
According to one aspect of the present invention, a dynamic alarm rating device is provided.
As shown in FIG. 6, the device includes:
an offline training module 10, configured to train a ranking model with the historical data of the alarm to obtain a trained model; and
an online ranking module 20, configured to rank the online data of the alarm with the trained model to obtain an alarm rating.
In one embodiment, the historical data includes work orders, alarm data, and index data.
As shown in FIG. 7, the offline training module 10 includes:
a label extraction module 101, configured to extract labels from the work orders;
a feature vector extraction module 102, configured to extract alarm features from the alarm data, extract index features from the index data, and combine the alarm features and the index features to obtain a feature vector; and
a model training module 103, configured to input the labels and the feature vectors into the ranking model, and train the ranking model to obtain the trained model.
In one embodiment, the online data includes online alarm data and online index data.
As shown in FIG. 8, the online ranking module 20 includes:
an online feature vector extraction module 201, configured to extract alarm features from the online alarm data, extract index features from the online index data, and combine the alarm features and the index features to obtain an online feature vector; and
a rating module 202, configured to input the online feature vector into the trained model to obtain the alarm rating.
According to another aspect of the present invention, there is provided an electronic apparatus, comprising:
at least one processor; and
a memory coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to implement the method of the present invention.
According to another aspect of the present invention, there is provided a computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, and when the computer program is executed, the method of the present invention can be implemented.
Those of ordinary skill in the art will appreciate that the various illustrative modules and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and devices may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and in actual implementation, there may be other divisions, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or modules, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by a person skilled in the art that the scope of the invention as referred to in the present application is not limited to the embodiments with a specific combination of the above-mentioned features, but also covers other embodiments with any combination of the above-mentioned features or their equivalents without departing from the inventive concept. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (6)

1. A dynamic alarm rating method, comprising the steps of:
training the sequencing model by using the historical data of the alarm to obtain a training model; and
sequencing the online data of the alarm by using the training model to obtain an alarm rating;
the online data comprises online alarm data and online index data;
the step of using the training model to sequence the online data of the alarm to obtain the alarm rating comprises the following steps:
extracting alarm characteristics of the online alarm data, extracting index characteristics of the online index data, and combining the alarm characteristics and the index characteristics to obtain an online characteristic vector; and
inputting the online characteristic vector into the training model to obtain the alarm rating;
the historical data comprises work orders, alarm data and index data;
the step of training the sequencing model by using the historical data of the alarm to obtain a training model comprises the following steps:
extracting the label of the work order;
extracting alarm characteristics of the alarm data, extracting index characteristics of the index data, and combining the alarm characteristics and the index characteristics to obtain a characteristic vector; and
inputting the labels and the characteristic vectors into the sequencing model, and training the sequencing model to obtain the training model.
2. The dynamic alarm rating method of claim 1,
the alarm characteristics comprise at least one of the following: text features, text entropy, and time sequence features; wherein the text features are obtained using a learning-based biterm topic model (BTM); the text entropy is calculated using inverse document frequency (IDF); and the time sequence features comprise the alarm frequency, the alarm period, the alarm quantity per unit time, or the alarm interval time;
the index characteristics are obtained using an anomaly detection algorithm for multiple time series based on a long short-term memory (LSTM) network.
3. A dynamic alarm rating device, the device comprising:
the offline training module is used for training the sequencing model by using the historical data of the alarm to obtain a training model; and
the online sequencing module is used for sequencing the online data of the alarm by using the training model to obtain an alarm rating;
the online data comprises online alarm data and online index data;
the online sequencing module comprises:
the online characteristic vector extraction module is used for extracting the alarm characteristics of the online alarm data, extracting the index characteristics of the online index data and combining the alarm characteristics and the index characteristics to obtain an online characteristic vector; and
the grading module is used for inputting the online characteristic vector into the training model to obtain the alarm rating;
the historical data comprises work orders, alarm data and index data;
the offline training module comprises:
the label extraction module is used for extracting the label of the work order;
the characteristic vector extraction module is used for extracting the alarm characteristics of the alarm data, extracting the index characteristics of the index data and combining the alarm characteristics and the index characteristics to obtain a characteristic vector; and
the model training module is used for inputting the labels and the characteristic vectors into the sequencing model and training the sequencing model to obtain the training model.
4. The dynamic alarm rating device of claim 3,
the alarm characteristics comprise at least one of the following: text features, text entropy, and time sequence features; wherein the text features are obtained using a learning-based biterm topic model (BTM); the text entropy is calculated using inverse document frequency (IDF); and the time sequence features comprise the alarm frequency, the alarm period, the alarm quantity per unit time, or the alarm interval time;
the index characteristics are obtained using an anomaly detection algorithm for multiple time series based on a long short-term memory (LSTM) network.
5. An electronic device, comprising:
at least one processor; and
a memory coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to implement the method of any one of claims 1-2.
6. A computer-readable storage medium, in which a computer program is stored which, when executed, is capable of carrying out the method of any one of claims 1-2.
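
Read together, claims 1 and 2 describe an offline step that trains the sequencing (ranking) model on labels and combined characteristic vectors, and an online step that scores new alarms with the trained model. The following is a minimal Python sketch of that flow under stated assumptions, not the claimed implementation: the function and class names are hypothetical, the text entropy is taken to be the mean IDF of an alarm's tokens, the only time sequence feature is the alarm quantity in the last hour, the index feature is a precomputed anomaly score standing in for the LSTM-based detector, a small logistic-regression scorer stands in for the sequencing model, and all data is made up.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def compute_idf(corpus: List[List[str]]) -> Dict[str, float]:
    """Inverse document frequency of every token across historical alarm texts."""
    doc_freq = Counter(tok for tokens in corpus for tok in set(tokens))
    return {tok: math.log(len(corpus) / df) for tok, df in doc_freq.items()}


def text_entropy(tokens: Sequence[str], idf: Dict[str, float]) -> float:
    """Mean IDF of an alarm's tokens (assumed aggregation; unseen tokens count as 0)."""
    if not tokens:
        return 0.0
    return sum(idf.get(tok, 0.0) for tok in tokens) / len(tokens)


def alarms_last_hour(timestamps: Sequence[float], now: float) -> float:
    """Alarm quantity per unit time: alarms in the hour before `now` (seconds)."""
    return float(sum(1 for t in timestamps if 0.0 <= now - t <= 3600.0))


def feature_vector(tokens, timestamps, now, index_score, idf) -> List[float]:
    """Combine alarm features with an index feature into one vector."""
    alarm_features = [text_entropy(tokens, idf), alarms_last_hour(timestamps, now)]
    index_features = [index_score]  # assumed: anomaly score from a metric detector
    return alarm_features + index_features


class PointwiseRanker:
    """Stand-in sequencing model: logistic regression fitted by stochastic gradient descent."""

    def __init__(self, n_features: int, lr: float = 0.1, epochs: int = 500) -> None:
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.lr = lr
        self.epochs = epochs

    def score(self, x: Sequence[float]) -> float:
        z = self.bias + sum(w * v for w, v in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, vectors, labels) -> "PointwiseRanker":
        for _ in range(self.epochs):
            for x, y in zip(vectors, labels):
                err = self.score(x) - y
                self.bias -= self.lr * err
                self.weights = [w - self.lr * err * v for w, v in zip(self.weights, x)]
        return self

    def rank(self, vectors) -> List[int]:
        """Indices of `vectors`, highest score (most severe) first."""
        return sorted(range(len(vectors)), key=lambda i: self.score(vectors[i]), reverse=True)


# Offline training on made-up history: (tokens, timestamps, index score, work-order label).
NOW = 7200.0
history = [
    (["db", "connection", "refused"], [3700.0, 5000.0, 6500.0, 7100.0], 0.9, 1),
    (["cpu", "usage", "high"], [7000.0], 0.1, 0),
    (["db", "replica", "lag"], [4000.0, 6000.0, 7000.0], 0.8, 1),
    (["cpu", "usage", "high"], [6900.0], 0.2, 0),
]
idf = compute_idf([tokens for tokens, _, _, _ in history])
train_x = [feature_vector(t, ts, NOW, s, idf) for t, ts, s, _ in history]
train_y = [label for _, _, _, label in history]
model = PointwiseRanker(n_features=3).fit(train_x, train_y)

# Online ranking of two incoming alarms.
online = [
    feature_vector(["cpu", "usage", "high"], [7150.0], NOW, 0.15, idf),
    feature_vector(["db", "connection", "refused"], [3800.0, 5200.0, 6600.0, 7150.0], NOW, 0.85, idf),
]
order = model.rank(online)
```

Here `order` lists the indices of `online` from most to least severe. With this made-up data the database alarm, which is rarer in wording, more frequent, and has the higher index score, is ranked ahead of the CPU alarm. The same feature extraction function is used offline and online so that the training model sees vectors with an identical layout in both steps.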

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010411127.8A CN111338915B (en) 2020-05-15 2020-05-15 Dynamic alarm grading method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010411127.8A CN111338915B (en) 2020-05-15 2020-05-15 Dynamic alarm grading method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111338915A (en) 2020-06-26
CN111338915B (en) 2020-09-01

Family

ID=71186592

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010411127.8A Active CN111338915B (en) 2020-05-15 2020-05-15 Dynamic alarm grading method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111338915B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113079047B (en) * 2021-03-29 2022-10-14 北京奇艺世纪科技有限公司 Alarm processing method and device
CN113485901B (en) * 2021-07-06 2022-11-22 中国工商银行股份有限公司 System evaluation method, device, equipment and medium based on log and index
CN114090393B (en) * 2022-01-14 2022-06-03 云智慧(北京)科技有限公司 Method, device and equipment for determining alarm level
CN115658444A (en) * 2022-10-31 2023-01-31 北京泰策科技有限公司 Alarm system for adaptive rule generation based on statistical learning optimization

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105528280A (en) * 2015-11-30 2016-04-27 中电科华云信息技术有限公司 Method and system capable of determining log alarm grades according to relationship between system logs and health monitoring
CN107066302A (en) * 2017-04-28 2017-08-18 北京邮电大学 Defect inspection method, device and service terminal
CN108206747A (en) * 2016-12-16 2018-06-26 中国移动通信集团山西有限公司 Method for generating alarm and system
CN108664374A (en) * 2018-05-17 2018-10-16 腾讯科技(深圳)有限公司 Fault warning model creation method, apparatus, fault alarming method and device

Also Published As

Publication number Publication date
CN111338915A (en) 2020-06-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant