CN112711665B - Log anomaly detection method based on density weighted integration rule - Google Patents

Log anomaly detection method based on density weighted integration rule

Info

Publication number
CN112711665B
CN112711665B (application CN202110063328.8A)
Authority
CN
China
Prior art keywords
classification
cluster
rule
csnum
word
Prior art date
Legal status
Active
Application number
CN202110063328.8A
Other languages
Chinese (zh)
Other versions
CN112711665A (en)
Inventor
应时
刘祥瑞
王冰明
黄浩
Current Assignee
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date
Filing date
Publication date
Application filed by Wuhan University (WHU)
Priority to CN202110063328.8A
Publication of CN112711665A
Application granted
Publication of CN112711665B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G06F 16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/205 Parsing
    • G06F 40/216 Parsing using statistical methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a log anomaly detection method based on density weighted integration rules. The method ingests a number of software logs and constructs a word frequency vector for each log. Using an improved spectral clustering method on the word frequency vectors, normal clusters and abnormal clusters are obtained, from which a normal log set and an abnormal log set are computed and a balanced log set is constructed. Base classifiers are trained on the balanced log set and assembled into a multi-base classifier; each base classifier produces a classification probability vector for the sample to be classified. From these classification probability vectors, five classification results are obtained through five new integration rules, and the classification result with the highest frequency is selected as the final classification result. The advantages of the invention are that sample balance is guaranteed while the distribution of the original data is preserved, the new integration rules take the relation between the sample to be classified and the historical data into account, and the accuracy of the classification result is improved.

Description

Log anomaly detection method based on density weighted integration rule
Technical Field
The invention belongs to the field of log anomaly detection, and particularly relates to a log anomaly detection method based on a density weighted integration rule.
Background
Modern systems evolve at large scale: they either scale horizontally into complex systems built on thousands of commodity machines (e.g., Spark), or scale vertically into supercomputers with thousands of processors (such as Blue Gene/L). These systems have become a core part of the IT industry, and the occurrence of faults and their impact on system performance and operating costs has become a very important research issue. Complex software and systems not only contain more bugs, but are also difficult to understand and analyze. In addition, the quality of these systems degrades as they age. These problems can lead to software crashes or system downtime.
Logs can be used to obtain software information for detecting and locating anomalies. Traditionally, system administrators examine the log data generated by a system to gain insight into its behavior. However, due to the increasing size and complexity of systems, a large number of logs are generated every day. When a problem occurs, it is very time consuming for an operator to find it by manually checking a large number of log messages. Therefore, the need for automated tools for log anomaly detection is increasing.
In log data, normal logs record the normal state of the system or software and abnormal logs record its abnormal state, and the number of logs describing the normal state is much larger than the number describing abnormal states, so an imbalanced data distribution is a characteristic of log data. Today's standard machine learning algorithms assume balanced data and usually perform poorly on imbalanced samples. Classifiers based on traditional machine learning algorithms typically ignore the minority classes, because these classification algorithms tend to maximize the overall classification accuracy; their accuracy is therefore not good enough for imbalanced classification problems. Ensemble learning can mitigate this problem by combining the classification results of multiple base classifiers. However, each individual result of a base classifier is still not accurate, because it is still trained on imbalanced data sampled proportionally from the original imbalanced data. Therefore, special sampling methods are combined with ensemble frameworks such as Bagging; such methods include UnderBagging, XGBoost and SMOTE-Bagging. These ensemble learning methods use sampling to turn the imbalanced data into balanced samples, train a base classifier on the balanced samples, and produce multiple classification results, which are then merged using specific integration rules. However, when ensemble learning based methods are used to detect anomalies in log data, two problems remain:
The imbalanced sample handling problem. In general, ensemble learning methods use Bootstrap, a sampling method with replacement, to randomly obtain a balanced data set. This changes the distribution of the raw data or overfits the classifier. Therefore, when base classifiers are trained on samples obtained by these sampling methods, the problem of low accuracy remains.
The integration rule problem. There are five traditional integration rules: the Max Rule, Min Rule, Product Rule, Majority Rule and Sum Rule. However, the sample to be classified is usually most relevant to the samples of a particular class in the historical data, and these traditional integration rules simply merge all classification results. The accuracy of ensemble learning based anomaly detection can be improved if the relationship between the sample to be classified and the historical data is taken into account.
Disclosure of Invention
In view of the above background and problems, the invention provides a log anomaly detection method based on density weighted integration rules.
Step 1: introducing a plurality of software logs, segmenting and analyzing each software log according to separators to obtain a software word data set, performing union processing on a plurality of software word data sets, further performing word de-duplication processing to obtain a word set, counting the frequency of each word in the word set in each log, and further constructing a software log word frequency vector;
step 2: according to the word frequency vector, extracting and determining an initial central point and an initial class number by using a multi-granularity master curve method based on improved complex distribution data spectral clustering, clustering the word frequency vectors and obtaining accurate clusters, and simultaneously obtaining the central point of each cluster, marking all samples in the cluster according to the state of the central point of each cluster, determining the states of all samples according to the state of the central point to obtain a normal cluster and an abnormal cluster, and counting the number of abnormal clusters, counting the number of samples of the abnormal clusters, calculating to obtain the number of samples of a new normal cluster, sampling the normal clusters to obtain new normal clusters, calculating the number of samples of the new abnormal clusters according to the number of samples of the new normal clusters, sampling the abnormal clusters to obtain new abnormal clusters, obtaining a normal log set and an abnormal log set, and constructing a balanced log set through the normal log set and the abnormal log set;
Step 3: the base classifiers take the balanced log set as a training set for optimization training, the trained base classifiers are used to construct a multi-base classifier, the multi-base classifier is used to classify the samples to be classified, and each base classifier of the multi-base classifier generates a classification probability vector;
Step 4: according to the classification probability vector generated by each base classifier of the multi-base classifier, classification results are obtained through the five integration rules MaxNW, MinNW, MajNW, ProdNW and SumNW; the five classification results are traversed and, whenever identical classification results are found, the frequency of that classification result is increased by one, yielding the frequency of each classification result; the classification result with the maximum frequency is selected as the final classification result, and if several classification results share the maximum frequency, one of them is randomly selected as the final classification result;
Preferably, each software log in step 1 is:
Logi
i∈[1,M]
where Logi is the i-th software log and M is the number of software logs;
the software word data set in step 1 is:
Datai={Wordi,1,Wordi,2,...,Wordi,Ni}
where Datai is the software word data set of the i-th software log, Wordi,j is the j-th software word in the software word data set of the i-th software log, Ni is the number of software words in the software word data set of the i-th software log, and j∈[1,Ni];
in step 1, the union of the software word data sets {Data1,Data2,...,DataM} is taken;
the word set obtained by word de-duplication in step 1 is:
WordSet={Word1,Word2,...,WordL}
where Wordk is the k-th word in the word set, L is the number of words in the word set, and k∈[1,L];
in step 1, the frequency of each word of the word set in each log is counted as:
Freqk={Fk,1,Fk,2,...,Fk,M}
where Freqk is the set of occurrence frequencies of the k-th word of the word set in the software word data sets, Fk,i is the frequency of the k-th word of the word set in the software word data set of the i-th software log, L is the number of words in the word set, and k∈[1,L];
the software log word frequency vector constructed in step 1 is:
Vectori={F1,i,F2,i,...,FL,i}
i∈[1,M]
where Vectori is the word frequency vector of the i-th software log and M is the number of software logs;
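As a non-limiting illustration of step 1, the following Python sketch builds word frequency vectors from raw log lines; splitting on whitespace and common punctuation separators and the function names are assumptions made for this example, not the exact parsing of the invention.

import re
from collections import Counter

def build_word_frequency_vectors(logs):
    """Split each log on separators, build the de-duplicated word set WordSet,
    and count how often each word of the set occurs in every log (Vectori)."""
    # Step 1a: split every log Log_i into its software word data set Data_i.
    datasets = []
    for log in logs:
        words = [w for w in re.split(r"[\s,;:=()\[\]]+", log.strip()) if w]
        datasets.append(words)
    # Step 1b: union of all word data sets, then de-duplication -> WordSet.
    word_set = sorted(set().union(*[set(d) for d in datasets]))
    # Step 1c: word frequency vector Vector_i = {F_1,i, ..., F_L,i}.
    vectors = []
    for words in datasets:
        counts = Counter(words)
        vectors.append([counts.get(word, 0) for word in word_set])
    return word_set, vectors

# Tiny hypothetical example with two log lines.
logs = ["INFO block received from node-1", "WARN block lost on node-2"]
word_set, vectors = build_word_frequency_vectors(logs)
print(word_set)
print(vectors)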
Preferably, the initial center point and the classes in step 2 are respectively:
CenterPoint0
Classes={Class1,Class2,...,ClassClaNum}
where CenterPoint0 is the initial center point, Classclanum is the clanum-th class, and clanum∈[1,ClaNum];
the precise clusters in step 2 are:
Clusters={Cluster1,Cluster2,...,ClusterCluNum}
where Clusterclunum is the clunum-th cluster and clunum∈[1,CluNum];
the center point of each cluster in step 2 is:
CenterPoints={CenterPoint0,CenterPoint1,...,CenterPointCluNum}
where CenterPointclunum is the center point of the clunum-th cluster; in particular, CenterPoint0 is the initial center point;
the state of the center point of each cluster in step 2 is:
CenterPointStates={CPState1,CPState2,...,CPStateCluNum}
where CPStateclunum is the state of the center point of the clunum-th cluster, CPStateclunum∈[0,CluNum-1]; CPStateclunum=0 indicates that the center point of the clunum-th cluster is normal, and CPStateclunum≠0 indicates that the center point of the clunum-th cluster is abnormal; there is only 1 normal state and there are CluNum-1 abnormal states;
the samples in a cluster in step 2 are:
Clusterclunum={Sampleclunum,1,Sampleclunum,2,...,Sampleclunum,SamNumclunum}
where Clusterclunum is the clunum-th cluster, Sampleclunum,samnum is the samnum-th sample of the clunum-th cluster, SamNumclunum is the number of samples of the clunum-th cluster, and samnum∈[1,SamNumclunum];
in step 2, the states of all samples are determined from the states of the center points as:
SamStatesclunum={SamStateclunum,1,SamStateclunum,2,...,SamStateclunum,SamNumclunum}
where SamStatesclunum is the sample state set of the clunum-th cluster and SamStateclunum,samnum is the state of the samnum-th sample of the clunum-th cluster; the sample states are related to CPStateclunum as follows:
if CPStateclunum=0
then SamStateclunum,samnum=0 for all samnum∈[1,SamNumclunum]
that is, if the state of the center point of the clunum-th cluster is normal, the states of all samples in the clunum-th cluster are normal and the clunum-th cluster is a normal cluster;
if CPStateclunum=x (x∈[1,CluNum-1])
then SamStateclunum,samnum=x for all samnum∈[1,SamNumclunum]
that is, if the state of the center point of the clunum-th cluster is abnormal, the states of all samples in the clunum-th cluster are abnormal and the clunum-th cluster is an abnormal cluster;
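For illustration, labeling all samples of a cluster with the state of its center point can be written as a short loop; the container layout (one list of samples per cluster, one state per cluster) is hypothetical.

def label_samples_by_center(clusters, center_states):
    """Assign every sample the state of its cluster's center point:
    state 0 -> normal cluster, any other state -> abnormal cluster."""
    normal_clusters, abnormal_clusters = [], []
    for cluster, state in zip(clusters, center_states):
        labeled = [(sample, state) for sample in cluster]
        if state == 0:
            normal_clusters.append(labeled)
        else:
            abnormal_clusters.append(labeled)
    return normal_clusters, abnormal_clusters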
The normal cluster set in step 2 is:
NorClusters0={NorCluster0,1}
where NorClusters0 denotes the normal cluster set and NorCluster0,1 is the 1st normal cluster of the normal cluster set; there is only 1 normal cluster;
the abnormal cluster set in step 2 is:
AbnorClusters0={AbnorCluster0,1,AbnorCluster0,2,...,AbnorCluster0,Nabnor}
where AbnorClusters0 denotes the abnormal cluster set, AbnorCluster0,nabnor is the nabnor-th abnormal cluster of the abnormal cluster set, nabnor∈[1,Nabnor], and Nabnor is the number of abnormal clusters;
the numbers of samples of the abnormal clusters in step 2 are:
Figure BDA0002903523190000053
where
Figure BDA0002903523190000054
is the number of samples of the nabnor-th abnormal cluster;
the number of samples of the new normal cluster in the step 2 is as follows:
Figure BDA0002903523190000055
the number of samples of the new abnormal cluster in the step 2 is as follows:
Figure BDA0002903523190000061
Nnor samples are extracted from each normal cluster, resulting in new normal clusters:
NorClusters1={NorCluster1,1}
where NorClusters1 is the normal cluster set after 1 round of sampling and NorCluster1,1 is the 1st normal cluster after 1 round of sampling;
the computed number of samples is extracted from each abnormal cluster, obtaining new abnormal clusters:
AbnorClusters1={AbnorCluster1,1,AbnorCluster1,2,...,AbnorCluster1,Nabnor}
where AbnorClusters1 is the abnormal cluster set after 1 round of sampling, AbnorCluster1,nabnor is the nabnor-th abnormal cluster after 1 round of sampling, nabnor∈[1,Nabnor], and Nabnor is the number of abnormal clusters; this process is repeated N times;
the normal log sample set in step 2 is:
NorSet={NorClusters1,NorClusters2,...,NorClustersN}
where NorClustersn is the normal cluster set after the n-th round of sampling, n∈[1,N];
the abnormal log sample set in step 2 is:
AbnorSet={AbnorClusters1,AbnorClusters2,...,AbnorClustersN}
where AbnorClustersn is the abnormal cluster set after the n-th round of sampling;
the balanced log set in step 2 is:
BalanceSet={BS1,BS2,...,BSN}
where BSn is the balanced log set of the n-th round of sampling, BSn={AbnorClustersn,NorClustersn}.
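The sampling of step 2 can be sketched as follows. Because the exact sample-count formulas appear only as figures in the text, this sketch assumes that the new normal cluster receives as many samples as all abnormal clusters together and that each new abnormal cluster contributes an equal share; sampling without replacement is also an assumption of the example.

import random

def build_balanced_log_sets(normal_cluster, abnormal_clusters, rounds):
    """Build BS_1..BS_N: each round draws a new normal cluster and new
    abnormal clusters so that normal and abnormal samples are balanced."""
    # Assumed balance: the total abnormal sample count defines the normal draw,
    # split evenly over the abnormal clusters for the abnormal draw.
    n_nor = sum(len(c) for c in abnormal_clusters)
    n_per_abnormal = max(1, n_nor // len(abnormal_clusters))

    balance_set = []
    for _ in range(rounds):
        new_normal = random.sample(normal_cluster, min(n_nor, len(normal_cluster)))
        new_abnormal = [random.sample(c, min(n_per_abnormal, len(c)))
                        for c in abnormal_clusters]
        balance_set.append({"NorClusters": new_normal, "AbnorClusters": new_abnormal})
    return balance_set

# Hypothetical usage: one normal cluster of 100 vectors, 3 abnormal clusters.
normal = [[i, 0] for i in range(100)]
abnormal = [[[i, 1] for i in range(5)], [[i, 2] for i in range(4)], [[i, 3] for i in range(6)]]
balance_sets = build_balanced_log_sets(normal, abnormal, rounds=10)
print(len(balance_sets), len(balance_sets[0]["NorClusters"]))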
Preferably, the multi-base classifier in step 3 is:
MulClassifier={CS1,CS2,...,CSCSNum}
where MulClassifier is the multi-base classifier, CScsnum is the csnum-th base classifier, and csnum∈[1,CSNum];
the sample to be classified in step 3 is:
S
the classification probability vector generated by each base classifier of the multi-base classifier in step 3 is:
CScsnum={probcsnum,1*|MKNN(S,1)|,probcsnum,2*|MKNN(S,2)|,...,probcsnum,ClaNum*|MKNN(S,ClaNum)|}
where probcsnum,clanum is the probability that the csnum-th base classifier classifies the sample S to be classified into the clanum-th class, MKNN(S,clanum) (clanum∈[1,ClaNum]) denotes the mutual nearest neighbor samples of the sample S to be classified within the sample cluster of the clanum-th class, and |MKNN(S,clanum)| denotes the number of mutual nearest neighbor samples of the sample S to be classified within the sample cluster of the clanum-th class;
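A minimal sketch of the density weighted classification probability vector of step 3 is given below. Each class probability of a base classifier is multiplied by |MKNN(S, class)|, the number of mutual nearest neighbors that the sample S has inside the historical sample cluster of that class. The neighborhood size k and the use of Euclidean distance are assumptions of this example, not values fixed by the invention.

import numpy as np

def knn_indices(query, points, k):
    """Indices of the k points of `points` closest to `query` (Euclidean)."""
    dists = np.linalg.norm(points - query, axis=1)
    return set(np.argsort(dists)[:k].tolist())

def mknn_count(s, class_cluster, k=5):
    """|MKNN(S, class)|: points of the class cluster that are k-nearest
    neighbors of S and for which S is in turn a k-nearest neighbor."""
    cluster = np.asarray(class_cluster, dtype=float)
    s = np.asarray(s, dtype=float)
    near_s = knn_indices(s, cluster, k)
    extended = np.vstack([cluster, s])        # the cluster plus S itself
    s_index = len(extended) - 1
    count = 0
    for idx in near_s:
        # k+1 because every point is trivially its own nearest neighbor here.
        if s_index in knn_indices(cluster[idx], extended, k + 1):
            count += 1
    return count

def weighted_probability_vector(prob_vector, s, class_clusters, k=5):
    """CS_csnum = {prob_csnum,c * |MKNN(S, c)| for c = 1..ClaNum}."""
    return [p * mknn_count(s, class_clusters[c], k)
            for c, p in enumerate(prob_vector)]

In the setting of the invention, the class clusters would be the normal and abnormal clusters obtained in step 2, which serve as the historical data of each class.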
preferably, the MaxNW rule in step 4 is:
obtaining a classification probability vector generated by each base classifier of the multi-base classifiers, obtaining the respective maximum classification probability of each base classifier from the generated classification probability vectors, splicing the maximum classification probabilities into a maximum classification probability set, selecting the maximum classification probability from the maximum classification probability set, and searching to obtain a class corresponding to the maximum classification probability, namely a classification result under a MaxNW rule;
the MaxNW rule describes that the respective maximum classification probability is:
max(CScsnum)
where CScsnum is the csnum-th base classifier and csnum∈[1,CSNum];
The maximum classification probability set of the MaxNW rule is:
MaxPro={max(CS1),max(CS2),...,max(CSCSNum)}
where max(CScsnum) is the maximum classification probability of the csnum-th base classifier;
the maximum classification probability in the maximum classification probability set of the MaxNW rule is:
max(MaxPro)
the maximum classification probability obtained by the search according to the MaxNW rule is:
max(MaxPro)=probcsnum,a1*|MKNN(S,a1)|
where probcsnum,a1 is the probability that the csnum-th base classifier classifies the sample S to be classified into the a1-th class, a1∈[1,ClaNum];
the classification result under the MaxNW rule is:
Result1=Classa1
step 4, the MinNW rule is:
obtaining a classification probability set generated by each base classifier of the multiple base classifiers, obtaining the respective minimum classification probability of each base classifier from the generated classification probability vectors, splicing the minimum classification probabilities into a minimum classification probability set, selecting the minimum classification probability from the minimum classification probability set, and searching to obtain a class corresponding to the minimum classification probability, namely a classification result under a MinNW rule;
the MinNW rule states that the respective minimum classification probabilities are:
min(CScsnum)
where CScsnum is the csnum-th base classifier and csnum∈[1,CSNum];
The MinNW rule the minimum set of classification probabilities is:
MinPro={min(CS1),min(CS2),...,min(CSCSNum)}
where min(CScsnum) is the minimum classification probability of the csnum-th base classifier;
the MinNW rule specifies the minimum classification probability in the minimum classification probability set as:
min(MinPro)
the minimum classification probability obtained by the search according to the MinNW rule is:
min(MinPro)=probcsnum,a2*|MKNN(S,a2)|
where probcsnum,a2 is the probability that the csnum-th base classifier classifies the sample S to be classified into the a2-th class, a2∈[1,ClaNum];
the classification result under the MinNW rule is:
Result2=Classa2
step 4, the MajNW rule is:
obtaining a classification probability vector generated by each base classifier of the multiple base classifiers, obtaining the respective maximum classification probability of each base classifier from the generated classification probability vectors, splicing the maximum classification probabilities into a maximum classification probability set, counting the frequency of occurrence of the class corresponding to each maximum classification probability in the maximum classification probability set to obtain a frequency set, and searching the frequency set again to obtain the class corresponding to the maximum frequency, namely the classification result under the MajNW rule;
the frequency set of the MajNW rule is:
Count={count1,count2,...,countClaNum}
where countclanum is the frequency of occurrence of the clanum-th class in the maximum classification probability set, clanum∈[1,ClaNum];
MajNW rule the maximum frequency is:
max(Count)
the maximum frequency obtained by the search according to the MajNW rule is:
max(Count)=counta3
where a3∈[1,ClaNum] denotes the class with the maximum frequency;
the classification result under the MajNW rule is:
Result3=Classa3
step 4, the ProdNW rule is as follows:
obtaining a classification probability vector generated by each base classifier of the multi-base classifiers, obtaining a classification probability set of each class according to the classification probability vector, obtaining a whole classification probability set according to the classification probability set, sequentially multiplying the classification probabilities of the classification probability sets of each class to obtain a product of the classification probability of each class, obtaining a set of products of the classification probabilities according to the product, obtaining a maximum product of the classification probability from the set, and searching the class corresponding to the maximum product of the classification probability, namely a classification result under ProdNW rules;
ProdNW rule the entire set of classification probabilities:
ClassProbability={CP1,CP2,...,CPClaNum}
where CPclanum is the classification probability set of the clanum-th class and clanum∈[1,ClaNum];
The ProdNW rule sets the classification probability of each class as:
CPclanum={prob1,clanum,prob2,clanum,...,probCSNum,clanum}
where probcsnum,clanum is the probability that the csnum-th base classifier classifies the sample S to be classified into the clanum-th class;
the product of the classification probabilities of each class under the ProdNW rule is:
produceclanum=prob1,clanum*prob2,clanum*...*probCSNum,clanum
the set of products of the ProdNW rule classification probabilities is:
ProdPro={produce1,produce2,...,produceClaNum}
the maximum product of classification probabilities under the ProdNW rule is:
max(ProdPro)
the maximum product of classification probabilities obtained by the search according to the ProdNW rule is:
max(ProdPro)=producea4
where a4∈[1,ClaNum] denotes the class with the maximum product of classification probabilities;
the classification result under the ProdNW rule is:
Result4=Classa4
step 4 the SumNW rule is:
obtaining a classification probability vector generated by each base classifier of a multi-base classifier, obtaining a classification probability set of each class according to the classification probability vector, obtaining a whole classification probability set according to the classification probability set, sequentially adding the classification probabilities of the classification probability sets of each class to obtain a sum of the classification probabilities of each class, obtaining a set of the sums of the classification probabilities according to the sum of the classification probabilities, obtaining the maximum sum of the classification probabilities from the set of the classification probabilities, and searching the class corresponding to the maximum sum of the classification probabilities, wherein the class is a classification result under the SumNW rule;
the sum of the classification probabilities of each class under the SumNW rule is:
sumclanum=prob1,clanum+prob2,clanum+...+probCSNum,clanum
the set of sums of classification probabilities under the SumNW rule is:
SumPro={sum1,sum2,...,sumClaNum}
the maximum sum of classification probabilities under the SumNW rule is:
max(SumPro)
the maximum sum of classification probabilities obtained by the search according to the SumNW rule is:
max(SumPro)=suma5
where a5∈[1,ClaNum] denotes the class with the maximum sum of classification probabilities;
the classification result under the SumNW rule is:
Result5=Classa5
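The five integration rules of step 4 can be summarized compactly; the input is the list of classification probability vectors CS1..CSCSNum produced by the base classifiers, one row per classifier. This is a sketch of the rules exactly as described above, using 0-based class indices; it is not the patented implementation itself.

import numpy as np
from collections import Counter

def integrate(cs_vectors):
    """Return the class chosen by MaxNW, MinNW, MajNW, ProdNW and SumNW."""
    cs = np.asarray(cs_vectors, dtype=float)           # shape (CSNum, ClaNum)

    # MaxNW: class of the largest entry over all classifiers.
    max_nw = int(np.unravel_index(np.argmax(cs), cs.shape)[1])
    # MinNW: class of the smallest entry over all classifiers.
    min_nw = int(np.unravel_index(np.argmin(cs), cs.shape)[1])
    # MajNW: most frequent per-classifier maximum-probability class.
    votes = Counter(int(np.argmax(row)) for row in cs)
    maj_nw = votes.most_common(1)[0][0]
    # ProdNW: class whose probabilities have the largest product.
    prod_nw = int(np.argmax(np.prod(cs, axis=0)))
    # SumNW: class whose probabilities have the largest sum.
    sum_nw = int(np.argmax(np.sum(cs, axis=0)))
    return [max_nw, min_nw, maj_nw, prod_nw, sum_nw]

# Hypothetical example with 3 base classifiers and 2 classes.
print(integrate([[0.2, 0.8], [0.6, 0.4], [0.1, 0.9]]))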
In step 4, the classification results generated by the five integration rules are:
Results={Result1,Result2,...,Result5}
where Results is the classification result set, Resultr is the classification result obtained by the r-th integration rule, r∈[1,5], and the integration rules are ordered as: MaxNW, MinNW, MajNW, ProdNW and SumNW;
the frequencies of the classification results in step 4 are:
ResNums={ResNum1,ResNum2,...,ResNumClaNum}
where ResNums is the frequency set of the classification results generated by the integration rules and ResNumclanum indicates how many times the clanum-th class appears in the classification result set Results;
the classification results with the maximum frequency in step 4 are:
MaxResults={MaxResult1,MaxResult2,...,MaxResultMR}
where MaxResults is the set of classification results with the maximum frequency, MaxResultmr represents the mr-th classification result with the maximum frequency, mr∈[1,MR], and 1≤MR≤5; if MR=1, the final classification result is MaxResult1; if MR≠1, one classification result is randomly selected from the set as the final classification result;
the final classification result in step 4 is:
random=rand(1,MR)
FinalResult=MaxResultrandom
where random=rand(1,MR) randomly selects an integer from the interval [1,MR], and FinalResult=MaxResultrandom means that the randomly selected classification result with the maximum frequency is taken as the final classification result.
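The final decision therefore keeps the class that occurs most often among the five rule results, breaking ties at random; a minimal sketch:

import random
from collections import Counter

def final_result(rule_results):
    """Most frequent class among the five rule results; ties broken at random."""
    counts = Counter(rule_results)
    best = max(counts.values())
    candidates = [cls for cls, num in counts.items() if num == best]
    return random.choice(candidates)

# Example: MaxNW, MinNW, MajNW, ProdNW and SumNW voted classes 2, 0, 2, 2, 1.
print(final_result([2, 0, 2, 2, 1]))   # prints 2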
The invention has the advantages that sample balance is guaranteed while the distribution of the original data is taken into account, the new integration rules also consider the relation between the sample to be classified and the historical data, and the accuracy of the classification result is improved.
Drawings
FIG. 1 is a flow chart of step 1 of the invention;
FIG. 2 is a flow chart of step 2 of the invention;
FIG. 3 is a flow chart of step 3 of the invention;
FIG. 4 is a flow chart of step 4 of the invention;
FIG. 5 is a flow chart of the method of the invention.
Detailed Description
In order to facilitate the understanding and implementation of the present invention by those of ordinary skill in the art, the present invention is further described in detail below with reference to the accompanying drawings and examples. It is to be understood that the embodiments described herein are merely illustrative and explanatory of the present invention and are not restrictive thereof.
The following describes an embodiment of the present invention with reference to fig. 1 to 5, which is a log anomaly detection method based on density weighted integration rules, and specifically includes the following steps:
step 1: introducing a plurality of software logs, segmenting and analyzing each software log according to separators to obtain a software word data set, performing union processing on a plurality of software word data sets, further performing word de-duplication processing to obtain a word set, counting the frequency of each word in the word set in each log, and further constructing a software log word frequency vector, as shown in fig. 1.
In step 1, each software log is:
Logi
i∈[1,M]
where Logi is the i-th software log and M is the number of software logs; M=1024;
the software word data set in step 1 is:
Datai={Wordi,1,Wordi,2,...,Wordi,Ni}
where Datai is the software word data set of the i-th software log, Wordi,j is the j-th software word in the software word data set of the i-th software log, Ni is the number of software words in the software word data set of the i-th software log, and j∈[1,Ni]; Ni=50;
in step 1, the union of the software word data sets {Data1,Data2,...,DataM} is taken;
the word set obtained by word de-duplication in step 1 is:
WordSet={Word1,Word2,...,WordL}
where Wordk is the k-th word in the word set, L is the number of words in the word set, and k∈[1,L]; L=1024;
in step 1, the frequency of each word of the word set in each log is counted as:
Freqk={Fk,1,Fk,2,...,Fk,M}
where Freqk is the set of occurrence frequencies of the k-th word of the word set in the software word data sets, Fk,i is the frequency of the k-th word of the word set in the software word data set of the i-th software log, L is the number of words in the word set, and k∈[1,L];
the software log word frequency vector constructed in step 1 is:
Vectori={F1,i,F2,i,...,FL,i}
i∈[1,M]
where Vectori is the word frequency vector of the i-th software log and M is the number of software logs;
step 2: after the word frequency vectors obtained in the step 1 are obtained, an initial central point and an initial class number are extracted and determined by using a multi-granularity master curve method based on improved complex distribution data spectral clustering, the word frequency vectors are clustered to obtain accurate clusters, the central point of each cluster is obtained at the same time, all samples in the clusters are marked according to the state of the central point of each cluster, the states of all samples are determined according to the state of the central point to obtain normal clusters and abnormal clusters, the number of the abnormal clusters is counted, the number of the samples of the new normal clusters is calculated, the new normal clusters are obtained by sampling the normal clusters, the number of the samples of the new abnormal clusters is calculated through the number of the samples of the new normal clusters, the new abnormal clusters are obtained by sampling the abnormal clusters, a normal log set and an abnormal log set are obtained, a balanced log set is constructed through the normal log set and the abnormal log set, as shown in fig. 2.
In step 2, the initial center point and the classes are respectively:
CenterPoint0
Classes={Class1,Class2,...,ClassClaNum}
where CenterPoint0 is the initial center point, Classclanum is the clanum-th class, and clanum∈[1,ClaNum]; ClaNum=15;
the precise clusters in step 2 are:
Clusters={Cluster1,Cluster2,...,ClusterCluNum}
where Clusterclunum is the clunum-th cluster and clunum∈[1,CluNum]; CluNum=15;
the center point of each cluster in step 2 is:
CenterPoints={CenterPoint0,CenterPoint1,...,CenterPointCluNum}
where CenterPointclunum is the center point of the clunum-th cluster; in particular, CenterPoint0 is the initial center point;
the state of the center point of each cluster in step 2 is:
CenterPointStates={CPState1,CPState2,...,CPStateCluNum}
where CPStateclunum is the state of the center point of the clunum-th cluster, CPStateclunum∈[0,CluNum-1]; CPStateclunum=0 indicates that the center point of the clunum-th cluster is normal, and CPStateclunum≠0 indicates that the center point of the clunum-th cluster is abnormal; there is only 1 normal state and there are CluNum-1 abnormal states;
the samples in a cluster in step 2 are:
Clusterclunum={Sampleclunum,1,Sampleclunum,2,...,Sampleclunum,SamNumclunum}
where Clusterclunum is the clunum-th cluster, Sampleclunum,samnum is the samnum-th sample of the clunum-th cluster, SamNumclunum is the number of samples of the clunum-th cluster, and samnum∈[1,SamNumclunum];
in step 2, the states of all samples are determined from the states of the center points as:
SamStatesclunum={SamStateclunum,1,SamStateclunum,2,...,SamStateclunum,SamNumclunum}
where SamStatesclunum is the sample state set of the clunum-th cluster and SamStateclunum,samnum is the state of the samnum-th sample of the clunum-th cluster; the sample states are related to CPStateclunum as follows:
if CPStateclunum=0
then SamStateclunum,samnum=0 for all samnum∈[1,SamNumclunum]
that is, if the state of the center point of the clunum-th cluster is normal, the states of all samples in the clunum-th cluster are normal and the clunum-th cluster is a normal cluster;
if CPStateclunum=x (x∈[1,CluNum-1])
then SamStateclunum,samnum=x for all samnum∈[1,SamNumclunum]
that is, if the state of the center point of the clunum-th cluster is abnormal, the states of all samples in the clunum-th cluster are abnormal and the clunum-th cluster is an abnormal cluster;
the normal cluster set in step 2 is:
NorClusters0={NorCluster0,1}
where NorClusters0 denotes the normal cluster set and NorCluster0,1 is the 1st normal cluster of the normal cluster set; there is only 1 normal cluster;
the abnormal cluster set in step 2 is:
AbnorClusters0={AbnorCluster0,1,AbnorCluster0,2,...,AbnorCluster0,Nabnor}
where AbnorClusters0 denotes the abnormal cluster set, AbnorCluster0,nabnor is the nabnor-th abnormal cluster of the abnormal cluster set, nabnor∈[1,Nabnor], and Nabnor is the number of abnormal clusters; Nabnor=14;
the numbers of samples of the abnormal clusters in step 2 are:
Figure BDA0002903523190000145
where
Figure BDA0002903523190000146
is the number of samples of the nabnor-th abnormal cluster;
the number of samples of the new normal cluster in the step 2 is as follows:
Figure BDA0002903523190000151
the number of samples of the new abnormal cluster in the step 2 is as follows:
Figure BDA0002903523190000152
Nnor samples are extracted from each normal cluster, resulting in new normal clusters:
NorClusters1={NorCluster1,1}
where NorClusters1 is the normal cluster set after 1 round of sampling and NorCluster1,1 is the 1st normal cluster after 1 round of sampling;
the computed number of samples is extracted from each abnormal cluster, obtaining new abnormal clusters:
AbnorClusters1={AbnorCluster1,1,AbnorCluster1,2,...,AbnorCluster1,Nabnor}
where AbnorClusters1 is the abnormal cluster set after 1 round of sampling, AbnorCluster1,nabnor is the nabnor-th abnormal cluster after 1 round of sampling, nabnor∈[1,Nabnor], and Nabnor is the number of abnormal clusters; this process is repeated N times;
the normal log sample set in step 2 is:
NorSet={NorClusters1,NorClusters2,...,NorClustersN}
where NorClustersn is the normal cluster set after the n-th round of sampling, n∈[1,N]; N=10;
the abnormal log sample set in step 2 is:
AbnorSet={AbnorClusters1,AbnorClusters2,...,AbnorClustersN}
where AbnorClustersn is the abnormal cluster set after the n-th round of sampling;
the balanced log set in step 2 is:
BalanceSet={BS1,BS2,...,BSN}
where BSn is the balanced log set of the n-th round of sampling, BSn={AbnorClustersn,NorClustersn}.
Step 3: the base classifiers take the balanced log set as a training set for optimization training, the trained base classifiers are used to construct a multi-base classifier, the multi-base classifier is used to classify the samples to be classified, and each base classifier of the multi-base classifier generates a classification probability vector, as shown in FIG. 3.
In step 3, the multi-base classifier is:
MulClassifier={CS1,CS2,...,CSCSNum}
where MulClassifier is the multi-base classifier, CScsnum is the csnum-th base classifier, and csnum∈[1,CSNum]; CSNum=6;
the sample to be classified in step 3 is:
S
the classification probability vector generated by each base classifier of the multi-base classifier in step 3 is:
CScsnum={probcsnum,1*|MKNN(S,1)|,probcsnum,2*|MKNN(S,2)|,...,probcsnum,ClaNum*|MKNN(S,ClaNum)|}
where probcsnum,clanum is the probability that the csnum-th base classifier classifies the sample S to be classified into the clanum-th class, MKNN(S,clanum) (clanum∈[1,ClaNum]) denotes the mutual nearest neighbor samples of the sample S to be classified within the sample cluster of the clanum-th class, and |MKNN(S,clanum)| denotes the number of mutual nearest neighbor samples of the sample S to be classified within the sample cluster of the clanum-th class;
and step 3: according to the classification probability vector generated by each base classifier of the multi-base classifier, classification results generated by five integration rules are respectively obtained through five integration rules of MaxNW, MinNW, MajNW, ProdNW and SumNW, the five classification results are traversed, if the same classification results exist, the frequency of the classification results is increased by one, the frequency of the classification results is obtained, the classification result with the maximum frequency is selected as the final classification result, and if a plurality of classification results with the maximum frequency exist, one classification result is randomly selected as the final classification result;
step 4, the MaxNW rule is: obtaining a classification probability vector generated by each base classifier of the multi-base classifiers, obtaining the respective maximum classification probability of each base classifier from the generated classification probability vectors, splicing the maximum classification probabilities into a maximum classification probability set, selecting the maximum classification probability from the maximum classification probability set, and searching to obtain a class corresponding to the maximum classification probability, namely a classification result under a MaxNW rule;
the MaxNW rule describes that the respective maximum classification probability is:
max(CScsnum)
where CScsnum is the csnum-th base classifier and csnum∈[1,CSNum];
The maximum classification probability set of the MaxNW rule is:
MaxPro={max(CS1),max(CS2),...,max(CSCSNum)}
where max(CScsnum) is the maximum classification probability of the csnum-th base classifier;
the maximum classification probability in the maximum classification probability set of the MaxNW rule is:
max(MaxPro)
the maximum classification probability obtained by the search according to the MaxNW rule is:
max(MaxPro)=probcsnum,a1*|MKNN(S,a1)|
where probcsnum,a1 is the probability that the csnum-th base classifier classifies the sample S to be classified into the a1-th class, a1∈[1,ClaNum];
the classification result under the MaxNW rule is:
Result1=Classa1
step 4, the MinNW rule is: obtaining a classification probability set generated by each base classifier of the multiple base classifiers, obtaining the respective minimum classification probability by each base classifier from the generated classification probability vectors, splicing the minimum classification probabilities into a minimum classification probability set, selecting the minimum classification probability from the minimum classification probability set, and searching to obtain a class corresponding to the minimum classification probability, namely a classification result under a MinNW rule, as shown in FIG. 4.
The MinNW rule states that the respective minimum classification probabilities are:
min(CScsnum)
where CScsnum is the csnum-th base classifier and csnum∈[1,CSNum];
The MinNW rule the minimum set of classification probabilities is:
MinPro={min(CS1),min(CS2),...,min(CSCSNum)}
where min(CScsnum) is the minimum classification probability of the csnum-th base classifier;
the MinNW rule specifies the minimum classification probability in the minimum classification probability set as:
min(MinPro)
the minimum classification probability obtained by the search according to the MinNW rule is:
min(MinPro)=probcsnum,a2*|MKNN(S,a2)|
where probcsnum,a2 is the probability that the csnum-th base classifier classifies the sample S to be classified into the a2-th class, a2∈[1,ClaNum];
the classification result under the MinNW rule is:
Result2=Classa2
step 4, the MajNW rule is: obtaining a classification probability vector generated by each base classifier of the multiple base classifiers, obtaining the respective maximum classification probability of each base classifier from the generated classification probability vectors, splicing the maximum classification probabilities into a maximum classification probability set, counting the frequency of occurrence of the class corresponding to each maximum classification probability in the maximum classification probability set to obtain a frequency set, and searching the frequency set again to obtain the class corresponding to the maximum frequency, namely the classification result under the MajNW rule;
the frequency set of the MajNW rule is:
Count={count1,count2,...,countClaNum}
where countclanum is the frequency of occurrence of the clanum-th class in the maximum classification probability set, clanum∈[1,ClaNum];
MajNW rule the maximum frequency is:
max(Count)
the maximum frequency obtained by the search according to the MajNW rule is:
max(Count)=counta3
where a3∈[1,ClaNum] denotes the class with the maximum frequency;
the classification result under the MajNW rule is:
Result3=Classa3
step 4, the ProdNW rule is as follows: obtaining a classification probability vector generated by each base classifier of the multi-base classifiers, obtaining a classification probability set of each class according to the classification probability vector, obtaining a whole classification probability set according to the classification probability set, sequentially multiplying the classification probabilities of the classification probability sets of each class to obtain a product of the classification probability of each class, obtaining a set of products of the classification probabilities according to the product, obtaining a maximum product of the classification probability from the set, and searching the class corresponding to the maximum product of the classification probability, namely a classification result under ProdNW rules;
ProdNW rule the entire set of classification probabilities:
ClassProbability={CP1,CP2,...,CPClaNum}
where CPclanum is the classification probability set of the clanum-th class and clanum∈[1,ClaNum];
The ProdNW rule sets the classification probability of each class as:
CPclanum={prob1,clanum,prob2,clanum,...,probCSNum,clanum}
where probcsnum,clanum is the probability that the csnum-th base classifier classifies the sample S to be classified into the clanum-th class;
the product of the classification probabilities of each class under the ProdNW rule is:
produceclanum=prob1,clanum*prob2,clanum*...*probCSNum,clanum
the set of products of the ProdNW rule classification probabilities is:
ProdPro={produce1,produce2,...,produceClaNum}
the maximum product of classification probabilities under the ProdNW rule is:
max(ProdPro)
the maximum product of classification probabilities obtained by the search according to the ProdNW rule is:
max(ProdPro)=producea4
where a4∈[1,ClaNum] denotes the class with the maximum product of classification probabilities;
the classification result under the ProdNW rule is:
Result4=Classa4
step 4 the SumNW rule is: obtaining a classification probability vector generated by each base classifier of a multi-base classifier, obtaining a classification probability set of each class according to the classification probability vector, obtaining a whole classification probability set according to the classification probability set, sequentially adding the classification probabilities of the classification probability sets of each class to obtain a sum of the classification probabilities of each class, obtaining a set of the sums of the classification probabilities according to the sum of the classification probabilities, obtaining the maximum sum of the classification probabilities from the set of the classification probabilities, and searching the class corresponding to the maximum sum of the classification probabilities, wherein the class is a classification result under the SumNW rule;
the sum of the classification probabilities of each class under the SumNW rule is:
sumclanum=prob1,clanum+prob2,clanum+...+probCSNum,clanum
the set of sums of classification probabilities under the SumNW rule is:
SumPro={sum1,sum2,...,sumClaNum}
the maximum sum of classification probabilities under the SumNW rule is:
max(SumPro)
the maximum sum of classification probabilities obtained by the search according to the SumNW rule is:
max(SumPro)=suma5
where a5∈[1,ClaNum] denotes the class with the maximum sum of classification probabilities;
the classification result under the SumNW rule is:
Result5=Classa5
In step 4, the classification results generated by the five integration rules are:
Results={Result1,Result2,...,Result5}
where Results is the classification result set, Resultr is the classification result obtained by the r-th integration rule, r∈[1,5], and the integration rules are ordered as: MaxNW, MinNW, MajNW, ProdNW and SumNW;
the frequencies of the classification results in step 4 are:
ResNums={ResNum1,ResNum2,...,ResNumClaNum}
where ResNums is the frequency set of the classification results generated by the integration rules and ResNumclanum indicates how many times the clanum-th class appears in the classification result set Results;
the classification results with the maximum frequency in step 4 are:
MaxResults={MaxResult1,MaxResult2,...,MaxResultMR}
where MaxResults is the set of classification results with the maximum frequency, MaxResultmr represents the mr-th classification result with the maximum frequency, mr∈[1,MR], and 1≤MR≤5; if MR=1, the final classification result is MaxResult1; if MR≠1, one classification result is randomly selected from the set as the final classification result;
the final classification result in step 4 is:
random=rand(1,MR)
FinalResult=MaxResultrandom
where random=rand(1,MR) randomly selects an integer from the interval [1,MR], and FinalResult=MaxResultrandom means that the randomly selected classification result with the maximum frequency is taken as the final classification result.
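To make the embodiment concrete, the following self-contained toy run strings the steps together on synthetic word frequency vectors. scikit-learn's SpectralClustering and DecisionTreeClassifier are used here only as stand-ins for the improved spectral clustering and the base classifiers of the invention, the MKNN density weights are omitted for brevity, and all data, parameters and names are hypothetical.

import numpy as np
from collections import Counter
from sklearn.cluster import SpectralClustering      # stand-in clustering
from sklearn.tree import DecisionTreeClassifier     # assumed base learner

rng = np.random.default_rng(0)

# Toy word frequency vectors: 200 normal logs and 20 abnormal logs.
normal_vecs = rng.poisson(3.0, size=(200, 10)).astype(float)
abnormal_vecs = rng.poisson(3.0, size=(20, 10)).astype(float)
abnormal_vecs[:, -1] += 8.0                          # abnormal logs use one extra word heavily
vectors = np.vstack([normal_vecs, abnormal_vecs])

# Step 2 (stand-in): cluster the vectors; the largest cluster is taken as normal.
labels = SpectralClustering(n_clusters=3, affinity="nearest_neighbors",
                            random_state=0).fit_predict(vectors)
normal_label = Counter(labels.tolist()).most_common(1)[0][0]
nor = vectors[labels == normal_label]
abnor = vectors[labels != normal_label]

# Step 3: six balanced training sets (CSNum=6 in the embodiment), one tree each.
classifiers = []
for _ in range(6):
    idx = rng.choice(len(nor), size=min(len(nor), len(abnor)), replace=False)
    X = np.vstack([nor[idx], abnor])
    y = np.array([0] * len(idx) + [1] * len(abnor))
    classifiers.append(DecisionTreeClassifier(random_state=0).fit(X, y))

# Step 4 (simplified, without the MKNN weights): five rules plus the final vote.
s = abnormal_vecs[0]
cs = np.array([clf.predict_proba([s])[0] for clf in classifiers])
results = [int(np.unravel_index(np.argmax(cs), cs.shape)[1]),
           int(np.unravel_index(np.argmin(cs), cs.shape)[1]),
           Counter(int(np.argmax(row)) for row in cs).most_common(1)[0][0],
           int(np.argmax(cs.prod(axis=0))),
           int(np.argmax(cs.sum(axis=0)))]
print("rule results:", results, "final:", Counter(results).most_common(1)[0][0])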
The results show that the method provided by the invention achieves a better anomaly detection effect.
It should be understood that parts of the specification not set forth in detail are well within the prior art.
It should be understood that the above description of the preferred embodiments is given for clarity and not for any purpose of limitation, and that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (2)

1. A log anomaly detection method based on density weighted integration rules is characterized by comprising the following steps:
step 1: introducing a plurality of software logs, segmenting and analyzing each software log according to separators to obtain a software word data set, performing union processing on a plurality of software word data sets, further performing word de-duplication processing to obtain a word set, counting the frequency of each word in the word set in each log, and further constructing a software log word frequency vector;
step 2: according to the word frequency vector, extracting and determining an initial central point and an initial class number by using a multi-granularity master curve method based on improved complex distribution data spectral clustering, clustering the word frequency vectors and obtaining accurate clusters, and simultaneously obtaining the central point of each cluster, marking all samples in the cluster according to the state of the central point of each cluster, determining the states of all samples according to the state of the central point to obtain a normal cluster and an abnormal cluster, and counting the number of abnormal clusters, counting the number of samples of the abnormal clusters, calculating to obtain the number of samples of a new normal cluster, sampling the normal clusters to obtain new normal clusters, calculating the number of samples of the new abnormal clusters according to the number of samples of the new normal clusters, sampling the abnormal clusters to obtain new abnormal clusters, obtaining a normal log set and an abnormal log set, and constructing a balanced log set through the normal log set and the abnormal log set;
Step 3: the base classifiers take the balanced log set as a training set for optimization training, a multi-base classifier is constructed from the trained base classifiers, the samples to be classified are classified by the multi-base classifier, and each base classifier of the multi-base classifier generates a classification probability vector;
Step 4: according to the classification probability vector generated by each base classifier of the multi-base classifier, classification results are obtained through the five integration rules MaxNW, MinNW, MajNW, ProdNW and SumNW; the five classification results are traversed and, whenever identical classification results are found, the frequency of that classification result is increased by one, yielding the frequency of each classification result; the classification result with the maximum frequency is selected as the final classification result, and if several classification results share the maximum frequency, one of them is randomly selected as the final classification result;
In step 2, the initial center point and the classes are respectively:
CenterPoint0
Classes={Class1,Class2,...,ClassClaNum}
where CenterPoint0 is the initial center point, Classclanum is the clanum-th class, and clanum∈[1,ClaNum];
the precise clusters in step 2 are:
Clusters={Cluster1,Cluster2,...,ClusterCluNum}
where Clusterclunum is the clunum-th cluster and clunum∈[1,CluNum];
the center point of each cluster in step 2 is:
CenterPoints={CenterPoint0,CenterPoint1,...,CenterPointCluNum}
where CenterPointclunum is the center point of the clunum-th cluster; in particular, CenterPoint0 is the initial center point;
the state of the center point of each cluster in step 2 is:
CenterPointStates={CPState1,CPState2,...,CPStateCluNum}
where CPStateclunum is the state of the center point of the clunum-th cluster, CPStateclunum∈[0,CluNum-1]; CPStateclunum=0 indicates that the center point of the clunum-th cluster is normal, and CPStateclunum≠0 indicates that the center point of the clunum-th cluster is abnormal; there is only 1 normal state and there are CluNum-1 abnormal states;
the samples in a cluster in step 2 are:
Clusterclunum={Sampleclunum,1,Sampleclunum,2,...,Sampleclunum,SamNumclunum}
where Clusterclunum is the clunum-th cluster, Sampleclunum,samnum is the samnum-th sample of the clunum-th cluster, SamNumclunum is the number of samples of the clunum-th cluster, and samnum∈[1,SamNumclunum];
in step 2, the states of all samples are determined from the states of the center points as:
SamStatesclunum={SamStateclunum,1,SamStateclunum,2,...,SamStateclunum,SamNumclunum}
where SamStatesclunum is the sample state set of the clunum-th cluster and SamStateclunum,samnum is the state of the samnum-th sample of the clunum-th cluster; the sample states are related to CPStateclunum as follows:
if CPStateclunum=0
then SamStateclunum,samnum=0 for all samnum∈[1,SamNumclunum]
that is, if the state of the center point of the clunum-th cluster is normal, the states of all samples in the clunum-th cluster are normal and the clunum-th cluster is a normal cluster;
if CPStateclunum=x (x∈[1,CluNum-1])
then SamStateclunum,samnum=x for all samnum∈[1,SamNumclunum]
that is, if the state of the center point of the clunum-th cluster is abnormal, the states of all samples in the clunum-th cluster are abnormal and the clunum-th cluster is an abnormal cluster;
the normal cluster set is:
NorClusters0={NorCluster0,1}
where NorClusters0 denotes the normal cluster set and NorCluster0,1 is the 1st normal cluster of the normal cluster set; there is only 1 normal cluster;
the abnormal cluster set is:
AbnorClusters0={AbnorCluster0,1,AbnorCluster0,2,...,AbnorCluster0,Nabnor}
where AbnorClusters0 denotes the abnormal cluster set, AbnorCluster0,nabnor is the nabnor-th abnormal cluster of the abnormal cluster set, nabnor∈[1,Nabnor], and Nabnor is the number of abnormal clusters;
the numbers of samples of the abnormal clusters in step 2 are:
Figure FDA0003533751220000034
where
Figure FDA0003533751220000035
is the number of samples of the nabnor-th abnormal cluster;
the number of samples of the new normal cluster in the step 2 is as follows:
Figure FDA0003533751220000036
the number of samples of the new abnormal cluster in the step 2 is as follows:
Figure FDA0003533751220000037
Nnor samples are extracted from each normal cluster, resulting in new normal clusters:
NorClusters1={NorCluster1,1}
where NorClusters1 is the normal cluster set after 1 round of sampling and NorCluster1,1 is the 1st normal cluster after 1 round of sampling;
the computed number of samples is extracted from each abnormal cluster, obtaining new abnormal clusters:
AbnorClusters1={AbnorCluster1,1,AbnorCluster1,2,...,AbnorCluster1,Nabnor}
where AbnorClusters1 is the abnormal cluster set after 1 round of sampling, AbnorCluster1,nabnor is the nabnor-th abnormal cluster after 1 round of sampling, nabnor∈[1,Nabnor], and Nabnor is the number of abnormal clusters; this process is repeated N times;
the normal log sample set in step 2 is:
NorSet={NorClusters1,NorClusters2,...,NorClustersN}
where NorClustersn is the normal cluster set after the n-th round of sampling, n∈[1,N];
the abnormal log sample set in step 2 is:
AbnorSet={AbnorClusters1,AbnorClusters2,...,AbnorClustersN}
where AbnorClustersn is the abnormal cluster set after the n-th round of sampling;
the balanced log set in step 2 is:
BalanceSet={BS1,BS2,...,BSN}
where BSn is the balanced log set of the n-th round of sampling,
BSn={AbnorClustersn,NorClustersn};
in step 3, the multi-base classifier is:
MulClassifier={CS1,CS2,...,CSCSNum}
where MulClassifier is the multi-base classifier, CScsnum is the csnum-th base classifier, and csnum∈[1,CSNum];
the sample to be classified in step 3 is:
S
the classification probability vector generated by each base classifier of the multi-base classifier in step 3 is:
CScsnum={probcsnum,1*|MKNN(S,1)|,probcsnum,2*|MKNN(S,2)|,...,probcsnum,ClaNum*|MKNN(S,ClaNum)|}
where probcsnum,clanum is the probability that the csnum-th base classifier classifies the sample S to be classified into the clanum-th class, MKNN(S,clanum) (clanum∈[1,ClaNum]) denotes the mutual nearest neighbor samples of the sample S to be classified within the sample cluster of the clanum-th class, and |MKNN(S,clanum)| denotes the number of mutual nearest neighbor samples of the sample S to be classified within the sample cluster of the clanum-th class;
step 4, the MaxNW rule is:
obtaining a classification probability vector generated by each base classifier of the multi-base classifiers, obtaining the respective maximum classification probability of each base classifier from the generated classification probability vectors, splicing the maximum classification probabilities into a maximum classification probability set, selecting the maximum classification probability from the maximum classification probability set, and searching to obtain a class corresponding to the maximum classification probability, namely a classification result under a MaxNW rule;
the MaxNW rule describes that the respective maximum classification probability is:
max(CScsnum)
where CScsnum is the csnum-th base classifier and csnum∈[1,CSNum];
The maximum classification probability set of the MaxNW rule is:
MaxPro={max(CS1),max(CS2),...,max(CSCSNum)}
where max(CScsnum) is the maximum classification probability of the csnum-th base classifier;
the maximum classification probability in the maximum classification probability set of the MaxNW rule is:
max(MaxPro)
the maximum classification probability obtained by the search according to the MaxNW rule is as follows:
Figure FDA0003533751220000051
wherein prob_{csnum,a_1} is the probability that the csnum-th base classifier classifies the sample S to be classified into the a_1-th class, a_1 ∈ [1, ClaNum];
the classification result under the MaxNW rule is:
Figure FDA0003533751220000052
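A compact sketch of the MaxNW selection, assuming cs_vectors stacks one density-weighted probability vector per base classifier (CSNum rows, ClaNum columns); class indices are 0-based here purely for illustration:

import numpy as np

def maxnw(cs_vectors):
    cs_vectors = np.asarray(cs_vectors)
    max_pro = cs_vectors.max(axis=1)          # MaxPro = {max(CS_1), ..., max(CS_CSNum)}
    best = int(max_pro.argmax())              # base classifier holding max(MaxPro)
    return int(cs_vectors[best].argmax())     # class behind that maximum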
step 4, the MinNW rule is:
obtaining a classification probability set generated by each base classifier of the multiple base classifiers, obtaining the respective minimum classification probability of each base classifier from the generated classification probability vectors, splicing the minimum classification probabilities into a minimum classification probability set, selecting the minimum classification probability from the minimum classification probability set, and searching to obtain a class corresponding to the minimum classification probability, namely a classification result under a MinNW rule;
the MinNW rule states that the respective minimum classification probability is:
min(CS_csnum)
wherein CS_csnum is the csnum-th base classifier, csnum ∈ [1, CSNum];
the minimum classification probability set of the MinNW rule is:
MinPro = {min(CS_1), min(CS_2), ..., min(CS_CSNum)}
wherein min(CS_csnum) is the minimum classification probability of the csnum-th base classifier;
the MinNW rule specifies the minimum classification probability in the minimum classification probability set as:
min(MinPro)
the minimum classification probability obtained by the search according to the MinNW rule is:
Figure FDA0003533751220000061
wherein prob_{csnum,a_2} is the probability that the csnum-th base classifier classifies the sample S to be classified into the a_2-th class, a_2 ∈ [1, ClaNum];
the classification result under the MinNW rule is:
Figure FDA0003533751220000062
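Mirroring the previous sketch, a possible MinNW selection under the same cs_vectors layout (again an illustrative, 0-based reading of the rule):

import numpy as np

def minnw(cs_vectors):
    cs_vectors = np.asarray(cs_vectors)
    min_pro = cs_vectors.min(axis=1)          # MinPro = {min(CS_1), ..., min(CS_CSNum)}
    worst = int(min_pro.argmin())             # base classifier holding min(MinPro)
    return int(cs_vectors[worst].argmin())    # class behind that minimum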
step 4, the MajNW rule is:
obtaining a classification probability vector generated by each base classifier of the multiple base classifiers, obtaining the respective maximum classification probability of each base classifier from the generated classification probability vectors, splicing the maximum classification probabilities into a maximum classification probability set, counting the frequency of occurrence of the class corresponding to each maximum classification probability in the maximum classification probability set to obtain a frequency set, and searching the frequency set again to obtain the class corresponding to the maximum frequency, namely the classification result under the MajNW rule;
the frequency set of the MajNW rule is:
Count = {count_1, count_2, ..., count_ClaNum}
wherein count_clanum is the frequency with which the clanum-th class appears in the maximum classification probability set, clanum ∈ [1, ClaNum];
the maximum frequency of the MajNW rule is:
max(Count)
the maximum frequency obtained by the search according to the MajNW rule is:
Figure FDA0003533751220000063
the classification result under the MajNW rule is:
Figure FDA0003533751220000064
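A sketch of the MajNW vote, assuming each base classifier contributes the class of its own maximum weighted probability; ties in the frequency count are resolved by Counter's insertion order, an arbitrary choice made only for illustration:

import numpy as np
from collections import Counter

def majnw(cs_vectors):
    votes = [int(np.argmax(v)) for v in np.asarray(cs_vectors)]   # one class per classifier
    return Counter(votes).most_common(1)[0][0]                    # most frequent class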
step 4, the ProdNW rule is as follows:
obtaining a classification probability vector generated by each base classifier of the multi-base classifiers, obtaining a classification probability set of each class according to the classification probability vector, obtaining a whole classification probability set according to the classification probability set, sequentially multiplying the classification probabilities of the classification probability sets of each class to obtain a product of the classification probability of each class, obtaining a set of products of the classification probabilities according to the product, obtaining a maximum product of the classification probability from the set, and searching the class corresponding to the maximum product of the classification probability, namely a classification result under ProdNW rules;
the entire classification probability set of the ProdNW rule is:
ClassProbability = {CP_1, CP_2, ..., CP_ClaNum}
wherein CP_clanum is the classification probability set of the clanum-th class, clanum ∈ [1, ClaNum];
the classification probability set of each class under the ProdNW rule is:
CP_clanum = {prob_{1,clanum}, prob_{2,clanum}, ..., prob_{CSNum,clanum}}
wherein prob_{csnum,clanum} is the probability that the csnum-th base classifier classifies the sample S to be classified into the clanum-th class;
the product of the classification probabilities of each class under the ProdNW rule is:
produce_clanum = prob_{1,clanum} * prob_{2,clanum} * ... * prob_{CSNum,clanum}
the set of products of the classification probabilities of the ProdNW rule is:
ProdPro = {produce_1, produce_2, ..., produce_ClaNum}
the maximum product of the classification probabilities under the ProdNW rule is:
max(ProdPro)
the maximum product of the classification probabilities obtained by the search according to the ProdNW rule is:
Figure FDA0003533751220000072
the classification result under the ProdNW rule is:
Figure FDA0003533751220000073
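A sketch of the ProdNW computation, assuming probs holds the unweighted probabilities prob_{csnum,clanum} (CSNum rows, ClaNum columns), matching the CP_clanum sets above:

import numpy as np

def prodnw(probs):
    prod_pro = np.prod(np.asarray(probs), axis=0)   # ProdPro = {produce_1, ..., produce_ClaNum}
    return int(np.argmax(prod_pro))                 # class behind max(ProdPro)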
step 4, the SumNW rule is:
obtaining a classification probability vector generated by each base classifier of a multi-base classifier, obtaining a classification probability set of each class according to the classification probability vector, obtaining a whole classification probability set according to the classification probability set, sequentially adding the classification probabilities of the classification probability sets of each class to obtain a sum of the classification probabilities of each class, obtaining a set of the sums of the classification probabilities according to the sum of the classification probabilities, obtaining the maximum sum of the classification probabilities from the set of the classification probabilities, and searching the class corresponding to the maximum sum of the classification probabilities, wherein the class is a classification result under the SumNW rule;
the sum of the classification probabilities of each class under the SumNW rule is:
sum_clanum = prob_{1,clanum} + prob_{2,clanum} + ... + prob_{CSNum,clanum}
the set of sums of the classification probabilities of the SumNW rule is:
SumPro = {sum_1, sum_2, ..., sum_ClaNum}
the maximum sum of the classification probabilities under the SumNW rule is:
max(SumPro)
the maximum sum of the classification probabilities obtained by the search according to the SumNW rule is:
Figure FDA0003533751220000082
the classification result under the SumNW rule is:
Figure FDA0003533751220000083
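The SumNW counterpart under the same assumed probs layout:

import numpy as np

def sumnw(probs):
    sum_pro = np.sum(np.asarray(probs), axis=0)     # SumPro = {sum_1, ..., sum_ClaNum}
    return int(np.argmax(sum_pro))                  # class behind max(SumPro)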
step 4, the classification results generated by the five integration rules are as follows:
Results = {Result_1, Result_2, ..., Result_5}
wherein Results is the classification result set, Result_r is the classification result obtained by the r-th integration rule, r ∈ [1, 5], and the integration rules are ordered as: MaxNW, MinNW, MajNW, ProdNW, SumNW;
step 4, the frequencies of the classification results are as follows:
ResNums = {ResNum_1, ResNum_2, ..., ResNum_ClaNum}
wherein ResNums is the set of counts of the classification results generated by the integration rules, and ResNum_clanum is the frequency with which the clanum-th class appears in the classification result set Results;
step 4, the classification results with the maximum frequency are as follows:
MaxResults = {MaxResult_1, MaxResult_2, ..., MaxResult_MR}
wherein MaxResults is the set of classification results with the largest frequency, MaxResult_mr is the mr-th classification result with the maximum frequency, mr ∈ [1, MR], and 1 ≤ MR ≤ 5; if MR = 1, the final classification result is MaxResult_1; if MR ≠ 1, a class corresponding to the maximum count is randomly selected from the set as the final classification result;
step 4, the final classification result is as follows:
random = rand(1, MR)
FinalResult = Result_random
wherein random = rand(1, MR) randomly selects an integer from the interval [1, MR], and FinalResult = Result_random takes the randomly selected classification result as the final classification result.
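Putting the five rules together, a sketch of the final fusion using the illustrative functions above; the tie-break here draws uniformly among the classes that reach the maximum frequency, which is one reading of the random selection in the claim:

import random
from collections import Counter

def final_result(cs_vectors, probs):
    results = [maxnw(cs_vectors), minnw(cs_vectors), majnw(cs_vectors),
               prodnw(probs), sumnw(probs)]                    # Results = {Result_1, ..., Result_5}
    counts = Counter(results)                                  # ResNums
    top = max(counts.values())
    max_results = [c for c, n in counts.items() if n == top]   # MaxResults
    return random.choice(max_results)                          # single class, random pick if MR != 1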
2. The log anomaly detection method based on the density-weighted integration rule according to claim 1, wherein:
step 1, each software log is as follows:
Log_i
i ∈ [1, M]
wherein Log_i is the i-th software log, and M is the number of software logs;
step 1, the software word data set is as follows:
Data_i = {Word_{i,1}, Word_{i,2}, ..., Word_{i,N_i}}
wherein Data_i is the software word data set of the i-th software log, Word_{i,j} is the j-th software word in the software word data set of the i-th software log, N_i is the number of software words in the software word data set of the i-th software log, and j ∈ [1, N_i];
step 1, the union of the plurality of software word data sets is taken over {Data_1, Data_2, ..., Data_M};
step 1, the word set obtained by the word de-duplication process is:
WordSet = {Word_1, Word_2, ..., Word_L}
wherein Word_k is the k-th word in the word set, L is the number of words in the word set, and k ∈ [1, L];
step 1, counting the frequency of each word of the word set in each log gives:
Freq_k = {F_{k,1}, F_{k,2}, ..., F_{k,M}}
wherein Freq_k is the frequency of occurrence of the k-th word of the word set in each software word data set, F_{k,i} is the frequency of occurrence of the k-th word of the word set in the software word data set of the i-th software log, L is the number of words in the word set, and k ∈ [1, L];
step 1, the word frequency vector of each software log is constructed as:
Vector_i = {F_{1,i}, F_{2,i}, ..., F_{L,i}}
i ∈ [1, M]
wherein Vector_i is the word frequency vector of the i-th software log, and M is the number of software logs.
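For claim 2's preprocessing, a minimal sketch of building the de-duplicated word set and the word frequency vectors, assuming each software log has already been tokenised into a list of words (the tokenisation itself is not shown):

from collections import Counter

def word_frequency_vectors(tokenised_logs):
    # tokenised_logs: M logs, each a list of software words (Data_i)
    word_set = sorted(set(w for log in tokenised_logs for w in log))   # WordSet after de-duplication
    vectors = []
    for log in tokenised_logs:
        counts = Counter(log)
        vectors.append([counts[w] for w in word_set])   # Vector_i = {F_{1,i}, ..., F_{L,i}}
    return word_set, vectors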
CN202110063328.8A 2021-01-18 2021-01-18 Log anomaly detection method based on density weighted integration rule Active CN112711665B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110063328.8A CN112711665B (en) 2021-01-18 2021-01-18 Log anomaly detection method based on density weighted integration rule

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110063328.8A CN112711665B (en) 2021-01-18 2021-01-18 Log anomaly detection method based on density weighted integration rule

Publications (2)

Publication Number Publication Date
CN112711665A CN112711665A (en) 2021-04-27
CN112711665B true CN112711665B (en) 2022-04-15

Family

ID=75549241

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110063328.8A Active CN112711665B (en) 2021-01-18 2021-01-18 Log anomaly detection method based on density weighted integration rule

Country Status (1)

Country Link
CN (1) CN112711665B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109388551A (en) * 2017-08-07 2019-02-26 北京京东尚科信息技术有限公司 There are the method for loophole probability, leak detection method, relevant apparatus for prediction code
CN110647446A (en) * 2018-06-26 2020-01-03 中兴通讯股份有限公司 Log fault association and prediction method, device, equipment and storage medium
JP2020140423A (en) * 2019-02-28 2020-09-03 Kddi株式会社 Clustering apparatus, clustering method, and clustering program
CN111178537A (en) * 2019-12-09 2020-05-19 华为技术有限公司 Feature extraction model training method and device
CN111611218A (en) * 2020-04-24 2020-09-01 武汉大学 Distributed abnormal log automatic identification method based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on log anomaly detection techniques; Yang Ruipeng, Qu Dan, Zhu Shaowei, Huang Hao; Journal of Information Engineering University; 2019-10-31; pages [0610]-[0615] *

Also Published As

Publication number Publication date
CN112711665A (en) 2021-04-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant