CN114647558A - Method and device for detecting log abnormity - Google Patents

Method and device for detecting log abnormity

Info

Publication number
CN114647558A
Authority
CN
China
Prior art keywords
log
real
cluster
time
template
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210173675.0A
Other languages
Chinese (zh)
Inventor
张静
李泽州
张宪波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingdong Technology Information Technology Co Ltd
Original Assignee
Jingdong Technology Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingdong Technology Information Technology Co Ltd
Priority to CN202210173675.0A
Publication of CN114647558A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/186 Templates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files

Abstract

The present disclosure provides a method and an apparatus for log anomaly detection. The method comprises: performing clustering analysis on acquired real-time cluster logs to generate a corresponding label tree; matching the label tree against a log template library, determining the log template that matches the label tree and the corresponding log exception category, and storing the real-time cluster logs under the corresponding log template according to the log exception category, wherein the log template library comprises a plurality of log templates and each log template has a corresponding log exception category; and performing anomaly detection on the real-time cluster logs of the different log exception categories and determining a detection result. In this way, massive big-data real-time cluster logs can be aggregated and analyzed, anomaly detection can be performed on each category of real-time cluster logs, the workload of manual troubleshooting is reduced, and the troubleshooting process is simplified.

Description

Method and device for detecting log abnormity
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for detecting log anomalies, an electronic device, and a non-transitory computer-readable storage medium.
Background
Anomaly detection on cluster logs is a detection technique commonly used for computer clusters: the cluster logs are monitored so that problems can be found in time.
In the prior art, detection on cluster logs depends on rule scripts that operation and maintenance engineers write from experience. When facing massive big-data cluster logs (on the order of hundreds of millions of entries per day), this approach leaves gaps in coverage and has difficulty accounting for anomalies in which the number of logs surges continuously within a certain time period. As a result, prior-art log anomaly detection is slow at troubleshooting, and the troubleshooting process is complicated.
Disclosure of Invention
The present disclosure provides a log anomaly detection method and apparatus, an electronic device, and a non-transitory computer-readable storage medium, to solve the technical problems that prior-art log anomaly detection methods are slow at troubleshooting and involve a complicated troubleshooting process.
The present disclosure provides a method for log anomaly detection, including:
performing clustering analysis on the acquired real-time cluster logs to generate corresponding label trees;
matching the label tree with a log template library, determining a log template matched with the label tree and a corresponding log exception category, and storing the real-time cluster log to the corresponding log template according to the log exception category, wherein the log template library comprises a plurality of log templates, and each log template has a corresponding log exception category;
and carrying out anomaly detection based on the real-time cluster logs of different log anomaly types, and determining a detection result.
According to the method for detecting log anomalies provided by an embodiment of the present disclosure, the method for generating the log template library includes:
acquiring a historical cluster log;
generating an initial label tree based on the historical cluster log;
building an initial template tree, training the initial template tree based on the initial label tree to generate templates, and assembling the templates into the log template library;
and performing secondary clustering on the templates, and labeling each class of templates with the corresponding log anomaly category.
According to the method for detecting log abnormity provided by an embodiment of the disclosure, the method further comprises the following steps:
under the condition that the label tree is not matched with the log template library, performing similarity calculation based on the unmatched real-time cluster logs and the stored historical cluster logs corresponding to each log abnormal category to determine the log abnormal category corresponding to the unmatched real-time cluster logs;
and performing an incremental training task on the log template library based on the unmatched real-time cluster logs and the log abnormal categories corresponding to the unmatched real-time cluster logs to obtain an updated log template library.
According to the method for detecting log anomalies provided by an embodiment of the present disclosure, performing anomaly detection based on the real-time cluster logs of different log anomaly categories and determining a detection result includes:
converting the real-time cluster logs and the historical cluster logs corresponding to each log anomaly category into a time sequence index;
inputting the time sequence index into a baseline monitoring model, and outputting an anomaly prediction value corresponding to each log anomaly category;
wherein the anomaly prediction value comprises: a change in the mean value of the time sequence index, a change in jitter frequency, detected peaks and deep valleys, and a drop proportion value.
According to the method for detecting log anomalies provided by an embodiment of the present disclosure, performing anomaly detection based on the real-time cluster logs of different log anomaly categories and determining a detection result includes:
determining the proportion of real-time cluster logs of different log exception categories, and determining a first detection result according to the proportion.
According to the method for detecting log anomalies provided by an embodiment of the present disclosure, performing anomaly detection based on the real-time cluster logs of different log anomaly categories and determining a detection result includes:
inputting the real-time cluster logs of different log exception categories into a sequential detection model, and outputting a second detection result.
According to the method for detecting log anomalies provided by an embodiment of the present disclosure, after determining the detection result, the method further includes:
generating a log index time sequence curve and a total log index time sequence curve of each cluster according to time sequence indexes corresponding to different cluster logs;
comparing the variation trend of the log index time series curve of each cluster with the variation trend of the total log index time series curve;
if the change trends are consistent, determining the log abnormal category with a larger proportion as a main log abnormal category based on the proportion of the real-time cluster log of the cluster in different log abnormal categories, and carrying out root cause positioning based on the main log abnormal category to determine the machine identifier with the abnormality in the cluster.
The present disclosure provides a log anomaly detection apparatus, including:
the clustering module is used for carrying out clustering analysis on the obtained real-time cluster logs to generate a corresponding label tree;
the matching module is used for matching the label tree with a log template library, determining a log template matched with the label tree and a corresponding log exception category, and storing the real-time cluster log to the corresponding log template according to the log exception category, wherein the log template library comprises a plurality of log templates, and each log template has a corresponding log exception category;
and the detection module is used for carrying out abnormity detection on the real-time cluster logs of different log abnormity types and determining a detection result.
The present disclosure also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the program to implement the steps of the method for detecting log anomalies according to any one of the above methods.
The present disclosure also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method of log anomaly detection as described in any one of the above.
According to the log anomaly detection method and apparatus of the present disclosure, cluster logs are clustered to generate a label tree, the label tree is matched against the log template library, and the log template matching the label tree and the corresponding log anomaly category are determined. Problems in big data clusters are thus found proactively from the perspective of the cluster logs through online real-time matching, and massive big-data real-time cluster logs can be aggregated and analyzed. Anomaly detection is then performed on each category of real-time cluster logs and a detection result is determined, which reduces the workload of manual troubleshooting and simplifies the troubleshooting process.
Drawings
In order to more clearly illustrate the technical solutions of the present disclosure or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a first schematic flowchart of the log anomaly detection method provided by the present disclosure;
FIG. 2 is a schematic diagram of a label tree generated after clustering the acquired real-time cluster logs according to the present disclosure;
FIG. 3 is a schematic diagram of a log template library provided by the present disclosure;
FIG. 4 is a second schematic flowchart of the log anomaly detection method provided by the present disclosure;
FIG. 5 is a third schematic flowchart of the log anomaly detection method provided by the present disclosure;
FIG. 6 is a fourth schematic flowchart of the log anomaly detection method provided by the present disclosure;
FIG. 7 is a schematic structural diagram of an apparatus for log anomaly detection provided by the present disclosure;
FIG. 8 is a schematic structural diagram of an electronic device provided by the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are some, but not all, of the embodiments of the present disclosure. All other embodiments, which can be obtained by a person skilled in the art without making creative efforts based on the embodiments of the present disclosure, belong to the protection scope of the embodiments of the present disclosure.
The prior-art approach may also have other problems. For example, it is difficult to find abnormal events by rule matching on big-data logs. Some error types are normal service errors rather than system faults, so rules for them are often not configured; yet under specific conditions it is possible that even no normal service error is reported, which leaves hidden risks for the normal operation of services. Manually written rule scripts can hardly cover all of these scenarios.
In order to solve the technical defects in the prior art, an embodiment of the present disclosure discloses a method for detecting log anomalies, which, referring to fig. 1, includes:
Step 101, performing cluster analysis on the acquired real-time cluster logs to generate a corresponding label tree.
In this embodiment, the real-time cluster logs are not matched one by one against the log template library. Instead, a corresponding label tree is generated by cluster analysis, and the label tree is then matched against the log template library. The complicated troubleshooting of big-data cluster logs is thereby converted into a problem of matching a label tree against log templates, which improves troubleshooting efficiency and shortens troubleshooting time.
FIG. 2 shows a label tree generated after clustering the acquired real-time cluster logs.
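As a rough illustration of step 101, the sketch below builds a frequency-ordered prefix tree from tokenized log lines, loosely in the spirit of a frequent-token tree. The whitespace tokenizer, the `max_depth` cutoff, and all function names are assumptions made for illustration; the disclosure does not specify how the label tree is constructed.

```python
from collections import Counter


def build_label_tree(log_lines, max_depth=4):
    """Build a nested-dict prefix tree of frequent tokens (illustrative only)."""
    tokenized = [line.split() for line in log_lines]
    # Count in how many lines each token occurs: frequent tokens tend to be
    # template words, rare tokens tend to be variables (IDs, durations).
    freq = Counter(tok for toks in tokenized for tok in set(toks))
    tree = {}
    for toks in tokenized:
        # Order the line's distinct tokens by descending frequency,
        # breaking ties alphabetically, and keep only the top max_depth.
        ordered = sorted(set(toks), key=lambda t: (-freq[t], t))[:max_depth]
        node = tree
        for tok in ordered:
            node = node.setdefault(tok, {})
    return tree


def tree_paths(tree, prefix=()):
    """Yield every root-to-leaf path; each path acts as a candidate label."""
    if not tree:
        yield prefix
        return
    for tok, child in sorted(tree.items()):
        yield from tree_paths(child, prefix + (tok,))


# Hypothetical sample logs, not taken from the disclosure.
logs = [
    "flush of region r1 took 5012 ms",
    "flush of region r2 took 7340 ms",
    "session 0x1a timed out",
    "session 0x2b timed out",
]
label_tree = build_label_tree(logs, max_depth=3)
for path in tree_paths(label_tree):
    print(" > ".join(path))
```

In this toy input the variable parts (region names, durations, session IDs) are rare and fall below the depth cutoff, so the four lines collapse into two paths.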
Step 102, matching the label tree with a log template library, determining a log template matched with the label tree and a corresponding log exception category, and storing the real-time cluster log to the corresponding log template according to the log exception category.
The log template library comprises a plurality of log templates, and each log template has a corresponding log exception category.
The method for generating the log template library includes the following steps S21 to S24:
S21, acquiring historical cluster logs.
S22, generating an initial label tree based on the historical cluster logs.
S23, building an initial template tree, training the initial template tree based on the initial label tree to generate templates, and assembling the templates into a log template library.
S24, performing secondary clustering on the templates, and labeling each class of templates with the corresponding log anomaly category.
Through the above steps S21 to S24, a log template library is generated, against which the label tree generated from the cluster logs is matched. FIG. 3 shows the log template library of the present embodiment.
The log exception categories in the log template library comprise 6 categories: a delayed MemStore flush-to-disk operation, occurrence of GC (garbage collection) memory reclamation, heap memory usage exceeding the maximum quota, slow cluster processing of operations on a certain table, a data block size exceeding the quota and thereby causing cache failure, and a ZooKeeper server connection timeout. Once the log exception categories are determined, they generally do not change, and the real-time cluster logs only need to be matched into the respective log exception categories.
And after the matching is successful, storing the real-time cluster logs into each log exception category of the log template.
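The matching and storing in step 102 can be pictured with the following minimal sketch. It assumes, purely for illustration, that each template is a set of constant tokens labeled with one category and that a log matches a template when it contains all of the template's tokens; the library contents and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical template library: (constant tokens, labeled category).
TEMPLATE_LIBRARY = [
    (frozenset({"flush", "region", "took"}), "MemStore flush delayed"),
    (frozenset({"session", "timed", "out"}), "ZooKeeper connection timeout"),
]


def match_log(log_line, library=TEMPLATE_LIBRARY):
    """Return the category of the most specific matching template, or None."""
    tokens = set(log_line.split())
    best = None
    for template, category in library:
        # A template matches when all of its tokens occur in the log line;
        # among matches, prefer the template with the most tokens.
        if template <= tokens and (best is None or len(template) > len(best[0])):
            best = (template, category)
    return best[1] if best else None


def store_by_category(log_lines, library=TEMPLATE_LIBRARY):
    """Group logs by matched category and collect the unmatched ones."""
    matched, unmatched = defaultdict(list), []
    for line in log_lines:
        category = match_log(line, library)
        if category is None:
            unmatched.append(line)
        else:
            matched[category].append(line)
    return dict(matched), unmatched


matched, unmatched = store_by_category([
    "flush of region r1 took 5012 ms",
    "session 0x1a timed out",
    "heap usage at 97 percent",
])
```

Logs that end up in `unmatched` are the ones that would be passed on to the similarity calculation and incremental training described for the unmatched case.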
Step 103, performing anomaly detection on the real-time cluster logs of different log anomaly categories, and determining a detection result.
Anomaly detection can be performed along several dimensions: for example, anomaly detection and time sequence detection on the real-time cluster logs corresponding to each log anomaly category, anomaly detection on the proportion of real-time cluster logs in each log anomaly category, and root cause positioning for the logs generated by different clusters.
According to the log anomaly detection method, the cluster logs are clustered to generate a label tree, the label tree is matched against the log template library, and the log template matching the label tree and the corresponding log anomaly category are determined. Problems in big data clusters are thus found proactively from the perspective of the cluster logs through online real-time matching, and massive big-data real-time cluster logs can be aggregated and analyzed. Anomaly detection is performed on each category of real-time cluster logs and a detection result is determined, which reduces the workload of manual troubleshooting and simplifies the troubleshooting process.
Further, when the label tree does not match the log template library, the method can use the unmatched real-time cluster logs to perform incremental learning on the log template library so as to expand it.
Specifically, referring to fig. 4, the method of the embodiment of the present disclosure includes steps 401 to 402:
Step 401, performing similarity calculation based on the unmatched real-time cluster logs and the stored historical cluster logs corresponding to each log anomaly category, and determining the log anomaly category corresponding to the unmatched real-time cluster logs.
In this embodiment, text similarity calculation is performed between each unmatched real-time cluster log and the historical cluster logs to determine the log exception category corresponding to each unmatched real-time cluster log.
Step 402, performing an incremental training task on the log template library based on the unmatched real-time cluster logs and their corresponding log anomaly categories to obtain an updated log template library.
In this embodiment, not only can templates be trained on the historical cluster logs to generate a log template library with wide coverage, which is then pushed online for real-time matching by a label-tree-based algorithm, but incremental training can also be performed continuously on the unmatched real-time cluster logs, so that the updated log template library maintains wide coverage. The label-tree-based real-time matching algorithm is more timely than the regular matching algorithm and enables aggregated analysis of massive big-data cluster logs, reducing the workload of manual investigation. Moreover, once the real-time cluster logs are stored under the log templates, problems can be found in advance by monitoring the log templates, and losses can be stopped in time.
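The similarity step for unmatched logs can be sketched as follows. The disclosure only says "text similarity"; the Jaccard measure over whitespace tokens, the function names, and the sample data here are assumptions chosen to keep the example small.

```python
def jaccard(a, b):
    """Jaccard similarity between the token sets of two log lines."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def classify_unmatched(log_line, history_by_category):
    """Pick the category whose stored historical logs are most similar."""
    best_category, best_score = None, -1.0
    for category, history in history_by_category.items():
        # Score a category by its single most similar historical log.
        score = max((jaccard(log_line, h) for h in history), default=0.0)
        if score > best_score:
            best_category, best_score = category, score
    return best_category, best_score


# Hypothetical stored history, keyed by anomaly category.
history = {
    "GC memory reclamation": ["GC pause detected took 1200 ms"],
    "ZooKeeper connection timeout": ["session 0x1a timed out"],
}
category, score = classify_unmatched(
    "session 0x9f timed out after 30000 ms", history
)
```

The resulting (log, category) pair is what an incremental training task on the template library would consume; that training step is not shown here.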
Further, step 103 comprises:
converting the real-time cluster log and the historical cluster log corresponding to each log abnormal category into a time sequence index;
inputting the time sequence indexes into a baseline monitoring model, and outputting an abnormal predicted value corresponding to each log abnormal category;
wherein the anomaly prediction value comprises: a change in the mean value of the time sequence index, a change in jitter frequency, detected peaks and deep valleys, and a drop proportion value.
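One plausible reading of "converting logs into a time sequence index" is to count the logs of each category per fixed time bucket. The sketch below does exactly that; the bucket width, the `(timestamp, category)` event format, and the function name are assumptions, not details from the disclosure.

```python
from collections import Counter


def to_time_series(events, bucket_seconds=60):
    """Turn (unix_timestamp, category) events into per-category count series."""
    events = list(events)
    if not events:
        return {}
    buckets = [int(ts // bucket_seconds) for ts, _ in events]
    start, end = min(buckets), max(buckets)
    # Count events per (bucket offset, category) pair.
    counts = Counter((b - start, cat) for b, (_, cat) in zip(buckets, events))
    categories = sorted({cat for _, cat in events})
    # Missing (bucket, category) pairs read as 0 from the Counter.
    return {
        cat: [counts[(i, cat)] for i in range(end - start + 1)]
        for cat in categories
    }


# Hypothetical events: seconds since some origin, short category names.
events = [(0, "gc"), (10, "gc"), (65, "gc"), (70, "zk"), (130, "gc")]
series = to_time_series(events, bucket_seconds=60)
```

Each resulting list is a series that a baseline monitoring model could take as input, one series per log anomaly category.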
In this embodiment, the baseline monitoring model may be a DeepAR model comprising an encoder and a decoder. The baseline monitoring model yields predicted values for a future period, for example the next 10 minutes.
The model is based on the autoregressive principle: the true value at the previous moment is used as a feature of the current moment and input into the encoder network, and the predicted value at the previous moment is used as a feature of the current moment and input into the decoder network, to perform time series prediction. The DeepAR model is evaluated by calculating the root mean square error (RMSE): time series data of 100 time steps are predicted from 120 time steps (the time step can be chosen as 10 s, 30 s, or 1 min). The evaluation shows that the DeepAR model is suitable for intelligent baseline prediction on the quantized indexes of big data cluster logs, and that, with adaptive upper and lower limits, the big data cluster logs can be better identified.
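The evaluation and limit check described above can be illustrated without the forecasting model itself. The sketch below computes the RMSE between actual and predicted series and flags points that leave a band around the prediction. The fixed `half_width` band is an assumption standing in for the adaptive upper and lower limits, and the predicted values are made up rather than produced by DeepAR.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length series."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def out_of_band(actual, predicted, half_width):
    """Indices where the actual value leaves [pred - half_width, pred + half_width]."""
    return [
        i
        for i, (a, p) in enumerate(zip(actual, predicted))
        if abs(a - p) > half_width
    ]


# Hypothetical per-bucket log counts and baseline predictions.
actual = [10, 12, 11, 40]
predicted = [10, 11, 12, 12]
error = rmse(actual, predicted)
anomalies = out_of_band(actual, predicted, half_width=5)
```

Here only the last point, where the count jumps far above the predicted baseline, falls outside the band.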
Further, step 103 comprises: determining the proportion of real-time cluster logs of different log anomaly categories, and determining a first detection result according to the proportion, so that the distribution of logs across the log anomaly categories is taken into account as a factor in anomaly detection.
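A minimal sketch of proportion-based detection follows. The disclosure does not say how the first detection result is derived from the proportions; comparing the current shares against a baseline with a fixed `threshold` is an assumption, as are the category names and counts.

```python
def category_proportions(counts):
    """Convert per-category log counts into shares of the total."""
    total = sum(counts.values())
    if total == 0:
        return {c: 0.0 for c in counts}
    return {c: n / total for c, n in counts.items()}


def proportion_shift(current, baseline, threshold=0.2):
    """Categories whose share moved by more than `threshold` versus the baseline."""
    cur = category_proportions(current)
    base = category_proportions(baseline)
    return sorted(c for c in cur if abs(cur[c] - base.get(c, 0.0)) > threshold)


# Hypothetical counts per category for a baseline window and the current window.
baseline = {"gc": 40, "zk": 40, "heap": 20}
current = {"gc": 80, "zk": 10, "heap": 10}
shifted = proportion_shift(current, baseline, threshold=0.2)
```

In this example the shares of "gc" and "zk" move by roughly 0.4 and 0.3 respectively, while "heap" moves by only 0.1 and is not flagged.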
Further, the exception detection of the logical order of the log may also be implemented, and step 103 includes: and inputting the real-time cluster logs of different log exception categories into the sequential detection model, and outputting a second detection result.
In this embodiment, the sequential detection model may be a CNN + LSTM timing model, so that the abnormality detection problem of the log is converted into a multi-class problem by using the template sequence attribute of the log, and the abnormality detection on the log logical sequence is performed.
When the CNN + LSTM model is used, the input is time series data of the indexes of the 6 classes of log templates over a period of time.
When the CNN + LSTM model is trained, the data input to the LSTM has shape 128 × 6 × 5. Taking one sample as an example, the data input at each time step (5 time steps in total) is 30 × 1, and a 6 × 1 output is obtained; that is, the data structure output by the LSTM before the concat step is a 6 × 1 matrix. Here 128 is the batch size, i.e., the number of samples selected for one training step, and the full data set can be traversed by training multiple times.
Data input to the CNN: 1 × 5 × 6; convolution kernel: 3 × 6 matrix; number of feature maps: 128. The convolution kernel matrix is swept step by step over the input data, the values at corresponding positions are multiplied and then summed, and the input data are padded with zeros, yielding 128 feature maps of size 5 × 6.
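The shape arithmetic of the CNN branch can be checked with a small pure-Python sketch of a zero-padded ("same") convolution. It only demonstrates that a 3 × 6 kernel swept over a zero-padded 5 × 6 input yields a 5 × 6 feature map, and that 128 kernels yield 128 such maps. The all-ones input and kernels and the choice to put the extra padding on the bottom and right are assumptions; no trained weights or framework API from the disclosure are implied.

```python
def conv2d_same(image, kernel):
    """Slide `kernel` over `image` with zero padding so the output keeps the input size.

    As in most deep-learning layers, this is a cross-correlation:
    the kernel is not flipped.
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    # Padding before the first row/column; the remainder goes after the last.
    top, left = (kh - 1) // 2, (kw - 1) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    r, c = i + di - top, j + dj - left
                    # Positions outside the image contribute zero.
                    if 0 <= r < h and 0 <= c < w:
                        acc += image[r][c] * kernel[di][dj]
            out[i][j] = acc
    return out


# 5 time steps x 6 template classes, all ones for easy checking.
image = [[1.0] * 6 for _ in range(5)]
# 128 kernels of size 3 x 6 (all ones here; real kernels would be learned).
kernels = [[[1.0] * 6 for _ in range(3)] for _ in range(128)]
feature_maps = [conv2d_same(image, k) for k in kernels]
```

With all-ones data, each output value simply counts how many kernel cells overlap the unpadded input, which makes the effect of the zero padding at the borders easy to see.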
Optionally, after determining the detection result, referring to fig. 5, the method further includes the following steps 501-503:
Step 501, generating a log index time series curve of each cluster and a total log index time series curve according to the time series indexes corresponding to different cluster logs.
Step 502, comparing the variation trend of the log index time series curve of each cluster with the variation trend of the total log index time series curve.
Step 503, if the change trends are consistent, based on the ratio of the real-time cluster log of the cluster in different log exception categories, determining the log exception category with a larger ratio as a main log exception category, and performing root cause positioning based on the main log exception category to determine the machine identifier with an exception in the cluster.
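Steps 501 to 503 can be pictured with the sketch below. The disclosure does not define how "consistent" trends are judged; using a Pearson correlation against the total curve with a `min_corr` threshold is an assumption, and the cluster names, curves, and category counts are invented for illustration.

```python
def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def suspect_clusters(cluster_curves, category_counts, min_corr=0.9):
    """Map each cluster whose curve tracks the total to its dominant category."""
    length = len(next(iter(cluster_curves.values())))
    # The total curve is the point-wise sum of all cluster curves.
    total = [sum(c[i] for c in cluster_curves.values()) for i in range(length)]
    result = {}
    for name, curve in cluster_curves.items():
        if pearson(curve, total) >= min_corr:
            counts = category_counts[name]
            result[name] = max(counts, key=counts.get)
    return result


# Hypothetical per-cluster log counts per time bucket.
curves = {
    "cluster-a": [10, 11, 10, 60, 90],
    "cluster-b": [20, 19, 21, 20, 20],
}
# Hypothetical per-cluster counts per anomaly category.
categories = {
    "cluster-a": {"GC memory reclamation": 120, "ZooKeeper connection timeout": 30},
    "cluster-b": {"GC memory reclamation": 5, "ZooKeeper connection timeout": 8},
}
suspects = suspect_clusters(curves, categories)
```

Only "cluster-a" follows the surge in the total, so it is the one whose main category would be drilled into to find the affected machine identifiers; that drill-down is not shown.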
In this embodiment, a root cause positioning detection model is configured for the important log template indexes, so that a sudden rise in the total log volume can be detected quickly. Through multidimensional drill-down analysis, the cluster whose log index time series curve changes consistently with the total is located, and, combined with an analysis of the trend of that cluster's template proportions, the specific machine identifiers causing the big data cluster problem are located. Because the log volume of a big data cluster exhibits trend changes and there is a multidimensional drill-down relationship between logs and their matched templates, the combination of machine identifiers responsible for a problem in a given dimension of the big data cluster can be located quickly, which addresses the problems of slow troubleshooting and difficult root cause positioning.
Specifically, the root cause positioning detection model of this embodiment screens a set of root cause elements by constructing an evaluation index, determines a preliminary search space, searches with a reinforcement learning search method to obtain the set with the highest multi-dimensional root cause probability, and then corrects the final root cause. The root cause correction principle is: the greater the potential score, the more likely the attribute combination is a root cause; when two element sets have the same potential score, the set with fewer elements wins.
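The correction principle stated above (higher potential score first, fewer elements on a tie) reduces to a simple ordering rule, sketched below. How the potential score is computed and how the reinforcement learning search produces candidates are not shown; the candidate sets and scores are hypothetical.

```python
def pick_root_cause(candidates):
    """Pick from (element_set, potential_score) pairs.

    The highest potential score wins; on a tie, the set with fewer
    elements wins.
    """
    if not candidates:
        return None
    best = min(candidates, key=lambda c: (-c[1], len(c[0])))
    return best[0]


# Hypothetical candidate attribute combinations with made-up scores.
candidates = [
    (frozenset({"cluster=a", "host=h1"}), 0.92),
    (frozenset({"host=h1"}), 0.92),
    (frozenset({"cluster=b"}), 0.40),
]
root_cause = pick_root_cause(candidates)
```

The two top candidates share the same score, so the single-element set is preferred as the more concise explanation.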
An embodiment of the present disclosure further provides a method for detecting log anomalies, with reference to fig. 6, including:
Step 601, performing cluster analysis on the acquired real-time cluster logs to generate a corresponding label tree.
Step 602, matching the label tree with the log template library, judging whether the label tree is matched with the log template library, if so, executing step 603, and if not, executing step 604.
Step 603, determining a log template matched with the label tree and a corresponding log exception category, and saving the real-time cluster log to the corresponding log template according to the log exception category.
The log template library comprises a plurality of log templates, and each log template has a corresponding log exception category.
Step 604, performing similarity calculation based on the unmatched real-time cluster logs and the stored historical cluster logs corresponding to each log abnormal category, and determining the log abnormal category corresponding to the unmatched real-time cluster logs; and performing an incremental training task on the log template library based on the unmatched real-time cluster logs and the log exception categories corresponding to the unmatched real-time cluster logs to obtain an updated log template library, and returning to the step 602.
Step 605, performing anomaly detection on the real-time cluster logs of different log anomaly categories to determine a detection result.
The anomaly detection covers the following cases:
In a first case, converting the real-time cluster logs and the historical cluster logs corresponding to each log anomaly category into a time sequence index; and inputting the time sequence index into a baseline monitoring model, and outputting an anomaly prediction value corresponding to each log anomaly category.
In this embodiment, the anomaly prediction value includes: a change in the mean value of the time sequence index, a change in jitter frequency, detected peaks and deep valleys, and a drop proportion value.
In a second case, determining the proportion of the real-time cluster logs of different log anomaly categories, and determining a first detection result according to the proportion.
In a third case, inputting the real-time cluster logs of different log exception categories into the sequential detection model, and outputting a second detection result.
Step 606, generating a log index time series curve and a total log index time series curve of each cluster according to the time series indexes corresponding to different cluster logs.
Step 607, comparing the variation trend of the log index time series curve of each cluster with the variation trend of the total log index time series curve.
Step 608, if the change trends are consistent, determining the log anomaly category with a larger proportion as a main log anomaly category based on the proportion of the real-time cluster logs of the cluster in different log anomaly categories, and performing root cause positioning based on the main log anomaly category to determine the machine identifier with the anomaly in the cluster.
According to the method and apparatus of the present disclosure, the complicated troubleshooting of big-data cluster logs is converted into template comparison, which enables fast clustering of the logs and analysis from a global perspective. The log template library is generated by training with the FT-Tree method; through online real-time matching against the log templates, time sequence indexes of the log categories are generated and log anomaly points are found, so that problems are discovered in advance. This addresses the problems that troubleshooting a big data cluster is slow, that problems are only investigated passively, and that the troubleshooting process is complicated.
In addition, a baseline monitoring model is configured for the quantized indexes of the key monitored templates, and sustained abnormal surges in log counts within a period are found by comparison with historical data, so that faults can be caught in advance.
Furthermore, with the online incremental template-learning method, features can be extracted from the logs based on a sliding window, correlations among templates can be found, and anomalies in newly appearing combinations of log patterns can be detected. Using the time sequence attribute of the logs, the log anomaly detection problem is converted into a multi-class classification problem, and anomalies in the logical sequence of the logs are detected by a trained sequential detection model (CNN + LSTM). After an anomaly is found, the associated alarms are analyzed by drilling down into the fault details, assisting operation and maintenance personnel in analyzing the specific root cause of the fault.
The embodiment of the present disclosure further includes an apparatus for detecting log anomaly, see fig. 7, including:
a clustering module 701, configured to perform clustering analysis on the acquired real-time cluster logs to generate corresponding label trees;
a matching module 702, configured to match the label tree with a log template library, determine the log template matched with the label tree and the corresponding log anomaly category, and store the real-time cluster log under the corresponding log template according to the log anomaly category, wherein the log template library includes a plurality of log templates, and each log template has a corresponding log anomaly category;
a detection module 703, configured to perform anomaly detection based on the real-time cluster logs of different log anomaly categories, and determine a detection result.
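The matching step performed by the matching module can be sketched as follows. This is a simplified illustration under stated assumptions: a template is represented as a tuple of words, a log matches a template when it contains every template word, and the most specific (longest) matching template wins. The library contents and the names `TEMPLATE_LIBRARY`, `match_template` and `route_logs` are hypothetical and do not come from the disclosure.

```python
from collections import defaultdict

# Hypothetical template library: template id -> (template words, anomaly category).
TEMPLATE_LIBRARY = {
    "T1": (("datanode", "heartbeat", "timeout"), "node_unreachable"),
    "T2": (("disk", "full"), "storage"),
}


def match_template(log_line, library):
    """Return (template_id, category) of the longest matching template, or None."""
    words = set(log_line.lower().split())
    best = None
    for template_id, (template_words, category) in library.items():
        if all(w in words for w in template_words):
            if best is None or len(template_words) > len(library[best[0]][0]):
                best = (template_id, category)
    return best


def route_logs(log_lines, library):
    """Group real-time logs by anomaly category; collect unmatched logs separately."""
    by_category = defaultdict(list)
    unmatched = []
    for line in log_lines:
        hit = match_template(line, library)
        if hit is None:
            unmatched.append(line)
        else:
            by_category[hit[1]].append(line)
    return dict(by_category), unmatched
```

Unmatched logs are returned separately so that a later step can assign them a category and feed them into incremental training.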
Optionally, the apparatus further includes a historical template library generation module, configured to:
acquiring a historical cluster log;
generating an initial label tree based on the historical cluster log;
building an initial template tree, training the initial template tree based on the initial label tree, generating a template, and generating a log template library from the template;
and performing secondary clustering on the templates, and labeling the corresponding log abnormal type for each type of template.
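The template generation steps above can be sketched in simplified form. The sketch assumes a frequency-ordered prefix tree in the spirit of FT-Tree: the words of each log are sorted by descending global frequency and inserted into a tree, and a node with too many children is treated as the start of a variable part and cut. The function name `build_templates` and the `max_children` threshold are illustrative assumptions. Secondary clustering and category labeling are not shown.

```python
from collections import Counter


def build_templates(logs, max_children=2):
    """Extract word-tuple templates from historical logs (simplified sketch)."""
    tokenized = [sorted(set(line.lower().split())) for line in logs]
    freq = Counter(w for words in tokenized for w in words)

    # Insert each log into a prefix tree, most frequent words first.
    tree = {}
    for words in tokenized:
        node = tree
        for w in sorted(words, key=lambda w: (-freq[w], w)):
            node = node.setdefault(w, {})
    if not tree:
        return []

    def prune(node):
        # Many distinct children suggest a variable field: cut them off.
        if len(node) > max_children:
            node.clear()
            return
        for child in node.values():
            prune(child)

    for child in tree.values():
        prune(child)

    templates = []

    def collect(node, path):
        if not node:
            templates.append(tuple(path))
            return
        for w, child in sorted(node.items()):
            collect(child, path + [w])

    collect(tree, [])
    return templates
```

With three logs that differ only in a host name, the host names become siblings under one node, exceed the threshold, and are dropped, leaving the constant words as the template.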
Optionally, the apparatus further comprises:
a similarity calculation module, configured to, when the label tree matches no template in the log template library, perform similarity calculation between the unmatched real-time cluster logs and the stored historical cluster logs corresponding to each log anomaly category, and determine the log anomaly category corresponding to the unmatched real-time cluster logs;
and an updating module, configured to perform an incremental training task on the log template library based on the unmatched real-time cluster logs and their corresponding log anomaly categories, to obtain an updated log template library.
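The similarity step for unmatched logs can be illustrated with a short sketch. The disclosure does not name a similarity measure, so word-set Jaccard similarity is used here purely as an assumption. The names `jaccard` and `assign_category` are likewise hypothetical.

```python
def jaccard(a, b):
    """Word-set Jaccard similarity between two log lines (assumed measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def assign_category(unmatched_log, history_by_category):
    """Pick the category whose stored historical logs are most similar.

    history_by_category maps an anomaly category to a list of historical
    log lines. Returns (category, score); category is None when nothing
    overlaps at all.
    """
    best_category, best_score = None, 0.0
    for category, logs in history_by_category.items():
        score = max((jaccard(unmatched_log, h) for h in logs), default=0.0)
        if score > best_score:
            best_category, best_score = category, score
    return best_category, best_score
```

The returned pair of log and category is what an incremental training task would then consume to extend the template library.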
Optionally, the detection module 703 is specifically configured to:
converting the real-time cluster logs and the historical cluster logs corresponding to each log anomaly category into time-series indicators;
inputting the time-series indicators into a baseline monitoring model, and outputting an anomaly prediction value corresponding to each log anomaly category;
wherein the anomaly prediction value comprises: the change in the mean of the time-series indicator, the change in jitter frequency, detected peaks and deep valleys, and the drop ratio.
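The conversion to time-series indicators and a baseline comparison can be sketched as follows. The sketch assumes that the indicator is the number of logs per category per time bucket, and that a sudden increase means exceeding the historical mean by `k` standard deviations. Both choices, and the names `to_time_series` and `is_sudden_increase`, are illustrative assumptions and not the baseline monitoring model of the disclosure.

```python
from collections import Counter
from statistics import mean, pstdev


def to_time_series(events, bucket_seconds=60):
    """Count logs per category per time bucket.

    events: iterable of (timestamp_seconds, category).
    Returns {category: Counter({bucket_start: count})}.
    """
    series = {}
    for ts, category in events:
        bucket = int(ts) // bucket_seconds * bucket_seconds
        series.setdefault(category, Counter())[bucket] += 1
    return series


def is_sudden_increase(history_counts, current_count, k=3.0):
    """Flag a count well above the historical baseline (assumed rule)."""
    baseline = mean(history_counts)
    # Floor the spread at 1.0 so a perfectly flat history does not
    # turn every small fluctuation into an alarm.
    spread = max(pstdev(history_counts), 1.0)
    return current_count > baseline + k * spread
```

Counting how many consecutive buckets are flagged gives a simple measure of sustained abnormal increases within a period.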
Optionally, the detection module 703 is specifically configured to: determining the proportion of real-time cluster logs of different log exception categories, and determining a first detection result according to the proportion.
Optionally, the detection module 703 is specifically configured to: input the real-time cluster logs of different log anomaly categories into a sequence detection model, and output a second detection result.
Optionally, the apparatus further comprises:
a curve generation module, configured to generate, after the detection result is determined, a log indicator time-series curve for each cluster and a total log indicator time-series curve according to the time-series indicators corresponding to the different cluster logs;
a trend comparison module, configured to compare the change trend of the log indicator time-series curve of each cluster with the change trend of the total log indicator time-series curve;
and a root cause positioning module, configured to, if the change trends are consistent, determine the log anomaly category with the larger proportion as the main log anomaly category based on the proportions of the real-time cluster logs of the cluster in the different log anomaly categories, and perform root cause positioning based on the main log anomaly category to determine the identifier of the abnormal machine in the cluster.
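The trend comparison and root cause positioning can be sketched in a few lines. The sketch assumes that two curves have a consistent trend when the signs of their consecutive differences are identical, and that the abnormal machine is the one contributing the most logs of the main category. Both rules and the names `trend` and `locate_root_cause` are illustrative assumptions.

```python
from collections import Counter


def trend(series):
    """Signs of consecutive differences: 1 for up, -1 for down, 0 for flat."""
    return [(b > a) - (b < a) for a, b in zip(series, series[1:])]


def locate_root_cause(cluster_curve, total_curve, cluster_logs):
    """Return (main_category, machine_id), or None if trends differ.

    cluster_logs: list of (machine_id, category) for one cluster.
    """
    if not cluster_logs or trend(cluster_curve) != trend(total_curve):
        return None
    # The category with the largest share of the cluster's logs is the main one.
    category_counts = Counter(category for _, category in cluster_logs)
    main_category, _ = category_counts.most_common(1)[0]
    # The machine emitting most logs of that category is the suspect.
    machine_counts = Counter(m for m, c in cluster_logs if c == main_category)
    machine_id, _ = machine_counts.most_common(1)[0]
    return main_category, machine_id
```

Exact sign equality is a strict notion of consistency; a tolerance or a correlation threshold could be substituted without changing the rest of the flow.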
According to the log anomaly detection apparatus provided by the embodiments of the present disclosure, cluster logs are clustered to generate a label tree, the label tree is matched with the log template library, and the log template matched with the label tree and the corresponding log anomaly category are determined. Problems in a big data cluster are thus actively discovered from the perspective of the cluster logs by online real-time matching, and massive real-time cluster logs can be aggregated and analyzed. Anomaly detection is performed on each category of real-time cluster logs and a detection result is determined, which reduces the workload of manual investigation and simplifies the troubleshooting process.
Fig. 8 is a schematic diagram of the physical structure of an electronic device. As shown in Fig. 8, the electronic device may include a processor 801, a communication interface 802, a memory 803 and a communication bus 804, wherein the processor 801, the communication interface 802 and the memory 803 communicate with one another through the communication bus 804. The processor 801 may call logic instructions in the memory 803 to perform a method of log anomaly detection, comprising:
performing clustering analysis on the acquired real-time cluster logs to generate corresponding label trees;
matching the label tree with a log template library, determining a log template matched with the label tree and a corresponding log exception category, and storing the real-time cluster log to the corresponding log template according to the log exception category, wherein the log template library comprises a plurality of log templates, and each log template has a corresponding log exception category;
and carrying out anomaly detection based on the real-time cluster logs of different log anomaly types, and determining a detection result.
In addition, the logic instructions in the memory 803 may be implemented in the form of software functional units and, when sold or used as independent products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
In another aspect, the present disclosure also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the method of log anomaly detection provided above, including:
performing cluster analysis on the acquired real-time cluster logs to generate corresponding label trees;
matching the label tree with a log template library, determining a log template matched with the label tree and a corresponding log exception category, and storing the real-time cluster log to the corresponding log template according to the log exception category, wherein the log template library comprises a plurality of log templates, and each log template has a corresponding log exception category;
and carrying out anomaly detection based on the real-time cluster logs of different log anomaly types, and determining a detection result.
In yet another aspect, the present disclosure also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the method of log anomaly detection provided above, comprising:
performing clustering analysis on the acquired real-time cluster logs to generate corresponding label trees;
matching the label tree with a log template library, determining a log template matched with the label tree and a corresponding log exception category, and storing the real-time cluster log to the corresponding log template according to the log exception category, wherein the log template library comprises a plurality of log templates, and each log template has a corresponding log exception category;
and carrying out anomaly detection based on the real-time cluster logs of different log anomaly types, and determining a detection result.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment may be implemented by software plus a necessary general hardware platform, and may also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solutions of the present disclosure, not to limit them; although the present disclosure has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present disclosure.

Claims (10)

1. A method of log anomaly detection, comprising:
performing clustering analysis on the acquired real-time cluster logs to generate corresponding label trees;
matching the label tree with a log template library, determining a log template matched with the label tree and a corresponding log exception category, and storing the real-time cluster log to the corresponding log template according to the log exception category, wherein the log template library comprises a plurality of log templates, and each log template has a corresponding log exception category;
and carrying out anomaly detection based on the real-time cluster logs of different log anomaly types, and determining a detection result.
2. The method for detecting log anomaly according to claim 1, wherein the method for generating the log template library comprises:
acquiring a historical cluster log;
generating an initial label tree based on the historical cluster log;
building an initial template tree, training the initial template tree based on the initial label tree, generating a template, and generating a log template library from the template;
and performing secondary clustering on the templates, and labeling the corresponding log anomaly category for each type of template.
3. The method of log anomaly detection according to claim 2, further comprising:
when the label tree matches no template in the log template library, performing similarity calculation between the unmatched real-time cluster logs and the stored historical cluster logs corresponding to each log anomaly category to determine the log anomaly category corresponding to the unmatched real-time cluster logs;
and performing an incremental training task on the log template library based on the unmatched real-time cluster logs and their corresponding log anomaly categories to obtain an updated log template library.
4. The method for detecting log abnormality according to claim 1, wherein performing abnormality detection based on real-time cluster logs of different log abnormality categories, and determining a detection result comprises:
converting the real-time cluster log and the historical cluster log corresponding to each log abnormal category into a time sequence index;
inputting the time sequence index into a baseline monitoring model, and outputting an abnormal predicted value corresponding to each log abnormal category;
wherein the anomaly prediction value comprises: the change in the mean of the time-series indicator, the change in jitter frequency, detected peaks and deep valleys, and the drop ratio.
5. The method for detecting log anomaly according to claim 1, wherein anomaly detection is performed based on real-time cluster logs of different log anomaly categories, and the determination of detection results comprises:
determining the proportion of real-time cluster logs of different log exception categories, and determining a first detection result according to the proportion.
6. The method for detecting log abnormality according to claim 1, wherein performing abnormality detection based on real-time cluster logs of different log abnormality categories, and determining a detection result comprises:
inputting the real-time cluster logs of different log anomaly categories into a sequence detection model, and outputting a second detection result.
7. The method of log anomaly detection according to claim 1, wherein after determining a detection result, the method further comprises:
generating a log index time sequence curve and a total log index time sequence curve of each cluster according to time sequence indexes corresponding to different cluster logs;
comparing the variation trend of the log index time series curve of each cluster with the variation trend of the total log index time series curve;
if the change trends are consistent, determining the log abnormal category with a larger proportion as a main log abnormal category based on the proportion of the real-time cluster log of the cluster in different log abnormal categories, and carrying out root cause positioning based on the main log abnormal category to determine the machine identifier with the abnormality in the cluster.
8. An apparatus for log anomaly detection, comprising:
the clustering module is used for carrying out clustering analysis on the obtained real-time cluster logs to generate a corresponding label tree;
the matching module is used for matching the label tree with a log template library, determining a log template matched with the label tree and a corresponding log exception category, and storing the real-time cluster log to the corresponding log template according to the log exception category, wherein the log template library comprises a plurality of log templates, and each log template has a corresponding log exception category;
and the detection module is used for carrying out abnormity detection on the real-time cluster logs of different log abnormity types and determining a detection result.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of log anomaly detection according to any one of claims 1 to 7 when executing the program.
10. A non-transitory computer readable storage medium, having stored thereon a computer program, characterized in that the computer program, when being executed by a processor, is adapted to carry out the steps of the method of log anomaly detection according to any one of claims 1 to 7.
CN202210173675.0A 2022-02-24 2022-02-24 Method and device for detecting log abnormity Pending CN114647558A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210173675.0A CN114647558A (en) 2022-02-24 2022-02-24 Method and device for detecting log abnormity

Publications (1)

Publication Number Publication Date
CN114647558A true CN114647558A (en) 2022-06-21

Family

ID=81992811

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210173675.0A Pending CN114647558A (en) 2022-02-24 2022-02-24 Method and device for detecting log abnormity

Country Status (1)

Country Link
CN (1) CN114647558A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115865649A (en) * 2023-02-28 2023-03-28 网思科技股份有限公司 Intelligent operation and maintenance management control method, system and storage medium
CN115865649B (en) * 2023-02-28 2023-05-12 网思科技股份有限公司 Intelligent operation and maintenance management control method, system and storage medium
CN117215902A (en) * 2023-11-09 2023-12-12 北京集度科技有限公司 Log analysis method, device, equipment and storage medium
CN117215902B (en) * 2023-11-09 2024-03-08 北京集度科技有限公司 Log analysis method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination