CN111782472B - System abnormality detection method, device, equipment and storage medium - Google Patents


Info

Publication number
CN111782472B
Authority
CN
China
Prior art keywords
log
abnormal
logs
marked
training
Legal status
Active
Application number
CN202010611178.5A
Other languages
Chinese (zh)
Other versions
CN111782472A (en)
Inventor
邓悦
郑立颖
徐亮
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202010611178.5A
Priority to PCT/CN2020/118218 (WO2021139235A1)
Publication of CN111782472A
Application granted
Publication of CN111782472B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/30 Monitoring
    • G06F 11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/30 Monitoring
    • G06F 11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operations; recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F 11/3452 Performance evaluation by statistical analysis

Abstract

The invention relates to artificial intelligence and provides a system anomaly detection method, apparatus, device and storage medium. The method comprises the following steps: inputting the marked logs, the unmarked logs and the expanded logs of the system to be detected into three identical training models in a training model set for training, and outputting the probability distributions of the different abnormal levels of the marked logs, the unmarked logs and the expanded logs; calculating the cross entropy loss and the consistency loss of the training model outputs; predicting the abnormal levels of the unmarked logs and the expanded logs according to the consistency loss, and iterating the training model set according to the cross entropy loss until the training model set converges, so as to obtain a log anomaly detection model; and finally, detecting abnormal logs produced during system operation through the log anomaly detection model. In addition, the invention relates to blockchain technology: the marked logs, the unmarked logs and the expanded logs can be stored in a blockchain. By optimizing the model training mode, overfitting of the model is prevented and the difficulty of detecting abnormal points in the system with the detection model is reduced.

Description

System abnormality detection method, device, equipment and storage medium
Technical Field
The present invention relates to artificial intelligence decision making, and in particular, to a method, an apparatus, a device, and a storage medium for system anomaly detection.
Background
As system scale grows, complexity increases and monitoring coverage improves, the volume of monitoring data becomes larger and larger, and operation and maintenance personnel cannot find quality problems in such massive monitoring data. Intelligent anomaly detection uses AI algorithms to find anomalies in monitoring data automatically, in real time and accurately, providing a basis for subsequent diagnosis and self-healing. Anomaly detection is a basic but very important function in an AIOps (intelligent operations) system: it automatically mines KPI time-series data with algorithms and models to discover abnormal behaviors, thereby providing the necessary decision basis for subsequent alerting, automatic loss stopping, root cause analysis and the like.
However, in practical application scenarios normal data generally accounts for a large proportion of the total data volume while abnormal data points are very rare, which makes anomaly detection difficult. In the training phase of the detection model, to keep the positive and negative training samples balanced, the traditional approach is to under-sample the normal samples (discarding part of the data) and over-sample the abnormal samples (repeating part of the data). For the former, a large amount of sample information is lost, causing model overfitting and poor generalization; for the latter, simple random sampling also creates a risk of overfitting. Therefore, whether because the abnormal data points themselves are rare or because an accurate detection model for abnormal points is hard to construct, the difficulty of data detection in an intelligent operations system is increased.
Disclosure of Invention
The invention mainly aims to solve the problem of high difficulty in anomaly detection of an intelligent operation system.
The invention provides a system anomaly detection method in a first aspect, which comprises the following steps:
acquiring a marked log and a non-marked log of a system to be detected, and expanding the non-marked log to obtain an expanded log;
respectively inputting the marked logs, the unmarked logs and the expanded logs into three identical abnormal grade training models for training, and correspondingly outputting a first probability distribution of each abnormal grade of the marked logs, a second probability distribution of each abnormal grade of the unmarked logs and a third probability distribution of each abnormal grade of the expanded logs, wherein the three identical abnormal grade training models form an abnormal grade training model set;
calculating a cross-entropy loss of the first probability distribution corresponding to a preset abnormal level marker of the marked log, and calculating a consistency loss between the second probability distribution and the third probability distribution;
predicting abnormal grade marks of the unmarked logs and the extended logs according to the consistency loss, and iterating the abnormal grade training model set according to the cross entropy loss until the abnormal grade training model set is converged to obtain a log abnormal detection model;
and acquiring a log to be detected of the system to be detected, inputting the log to be detected into the log abnormity detection model for detection, outputting an abnormity grade corresponding to the log to be detected, and taking the abnormity grade corresponding to the log to be detected as an analysis result of the current system operation state.
Optionally, in a first implementation manner of the first aspect of the present invention, the expanding the unmarked log to obtain an expanded log includes:
analyzing the unmarked log to obtain a plurality of log fields with different semantics;
screening key fields related to abnormal levels from the log fields according to preset semantic structure prior knowledge and the occurrence frequency of the log fields;
acquiring one or more synonymous fields corresponding to the key fields, and replacing the corresponding key fields with the synonymous fields;
and splicing the synonymous field and other log fields except the key field according to a random field processing strategy to obtain a plurality of corresponding expansion logs, wherein the random field processing strategy comprises replacing, deleting, inserting or exchanging the other log fields.
Optionally, in a second implementation manner of the first aspect of the present invention, the inputting the marked log, the unmarked log, and the extended log into three identical anomaly level training models for training respectively, and outputting a first probability distribution of each anomaly level of the marked log, a second probability distribution of each anomaly level of the unmarked log, and a third probability distribution of each anomaly level of the extended log in a corresponding manner includes:
uniformly adjusting the length of each log data in the marked log, the unmarked log and the expanded log to a preset length, and constructing a corresponding data vector;
determining the characteristic dimension of the data vector according to the length of the data vector, and extracting semantic features of the data vector according to the characteristic dimension to obtain initial semantic features;
and screening and combining the prominent features of the initial semantic features to obtain final semantic features, and calculating and outputting the probability distribution of the abnormal levels of the marked logs, the unmarked logs and the expanded logs according to the final semantic features.
Optionally, in a third implementation manner of the first aspect of the present invention, the calculating a cross-entropy loss of the first probability distribution corresponding to a preset abnormal level flag of the flag log includes:
calculating the correct prediction probability of the abnormal grade of each marked log according to the first probability distribution and the preset abnormal grade mark of the marked log;
and calculating the cross entropy loss of the first probability distribution according to preset model training parameters and the correct prediction probability so as to measure the difference between the abnormal grade prediction of the labeled log by the classification model and the real abnormal grade of the labeled log.
Optionally, in a fourth implementation manner of the first aspect of the present invention, the iterating the abnormal level training model set according to the cross entropy loss until the abnormal level training model set converges to obtain a log abnormal detection model includes:
determining the correct prediction probability corresponding to each marked log according to the cross entropy loss;
judging whether a correct prediction probability larger than a preset probability threshold exists or not;
if so, deleting the first probability distribution corresponding to the correct prediction probability which is greater than the probability threshold value, and continuing to iterate the log anomaly detection model, otherwise, directly iterating the log anomaly detection model, and updating the model training parameters after the log anomaly detection model is iterated;
calculating the sum of the cross entropy loss and the consistency loss to obtain a corresponding final loss value, and judging whether the final loss value is smaller than a preset final loss threshold value or not;
and if the final loss value is smaller than the final loss threshold value, the abnormal level training model set is converged and stops iteration to obtain a log abnormal detection model.
Optionally, in a fifth implementation manner of the first aspect of the present invention, the calculation formula of the correct prediction probability is:
P = −Σ_i p(y_i)·log(q(y_i)),
and the probability threshold used for screening the marked logs is:
η_t = α_t·(1 − 1/K) + 1/K,
wherein p(y_i) is the real probability of the i-th abnormal level of the marked log, q(y_i) is the predicted probability of the i-th abnormal level of the marked log, η_t is the probability threshold, α_t is a growth coefficient, K is the number of the abnormal level categories, t is the current iteration number, and T is the preset total iteration number;
when the data amount in the marked log is smaller than a preset normal data amount range, α_t = exp((t/T − 1)·5);
when the data amount in the marked log is larger than the normal data amount range, α_t = 1 − exp(−(t/T)·5).
Optionally, in a sixth implementation manner of the first aspect of the present invention, the acquiring a log to be detected of a system to be detected, inputting the log to be detected into the log anomaly detection model for detection, outputting an anomaly level corresponding to the log to be detected, and taking the anomaly level corresponding to the log to be detected as an analysis result of the current system operation state includes:
acquiring a log to be detected of a system to be detected, wherein the log to be detected comprises a plurality of pieces of log information, and the log information comprises identification information of system operation management priority;
inputting the log to be detected into the log abnormity detection model for detection, and predicting the abnormity grade of the log to be detected through the log abnormity detection model;
screening logs to be detected with abnormal grades higher than a preset abnormal grade threshold value, and determining log information with priority higher than a preset priority threshold value from the screened logs to be detected according to the identification information;
and highlighting the log information with the priority greater than a preset priority threshold value, and taking the abnormal grade corresponding to the highlighted log information and other log information except the highlighted log information as an analysis result of the current system running state.
A second aspect of the present invention provides a system abnormality detection apparatus, including:
the acquisition module is used for acquiring a marked log and a non-marked log of the system to be detected and expanding the non-marked log to obtain an expanded log;
a training module, configured to input the marked log, the unmarked log, and the extended log into three identical abnormal level training models for training, and output a first probability distribution of each abnormal level of the marked log, a second probability distribution of each abnormal level of the unmarked log, and a third probability distribution of each abnormal level of the extended log, where the three identical abnormal level training models form an abnormal level training model set;
a calculation module, configured to calculate a cross entropy loss between the first probability distribution and a preset abnormal level flag of the flag log, and calculate a consistency loss between the second probability distribution and the third probability distribution;
the generation module is used for predicting the abnormal grade marks of the unmarked logs and the extended logs according to the consistency loss, and iterating the abnormal grade training model set according to the cross entropy loss until the abnormal grade training model set is converged to obtain a log abnormal detection model;
the detection module is used for acquiring the log to be detected of the system to be detected, inputting the log to be detected into the log abnormity detection model for detection, outputting the abnormity grade corresponding to the log to be detected, and taking the abnormity grade corresponding to the log to be detected as the analysis result of the current system running state.
Optionally, in a first implementation manner of the second aspect of the present invention, the obtaining module is further configured to:
analyzing the unmarked log to obtain a plurality of log fields with different semantics;
screening key fields related to abnormal levels from the log fields according to preset semantic structure prior knowledge and the occurrence frequency of the log fields;
acquiring one or more synonymous fields corresponding to the key fields, and replacing the corresponding key fields with the synonymous fields;
and splicing the synonymous field and other log fields except the key field according to a random field processing strategy to obtain a plurality of corresponding expansion logs, wherein the random field processing strategy comprises replacing, deleting, inserting or exchanging the other log fields.
Optionally, in a second implementation manner of the second aspect of the present invention, the training module further includes:
the construction unit is used for uniformly adjusting the length of each log data in the marked log, the unmarked log and the expanded log to a preset length and constructing a corresponding data vector;
the feature extraction unit is used for determining the feature dimension of the data vector according to the length of the data vector and extracting the semantic features of the data vector according to the feature dimension to obtain initial semantic features;
and the probability distribution generating unit is used for screening and combining the prominent features of the initial semantic features to obtain final semantic features, and calculating and outputting the probability distribution of the abnormal levels of the marked logs, the unmarked logs and the extended logs according to the final semantic features.
Optionally, in a third implementation manner of the second aspect of the present invention, the calculation module further includes:
the first calculation unit is used for calculating the correct prediction probability of the abnormal grade of each marked log according to the first probability distribution and the preset abnormal grade marks of the marked logs;
and the second calculation unit is used for calculating the cross entropy loss of the first probability distribution according to preset model training parameters and the correct prediction probability so as to measure the difference between the abnormal grade prediction of the labeled log by the classification model and the real abnormal grade of the labeled log.
Optionally, in a fourth implementation manner of the second aspect of the present invention, the generating module further includes:
the iteration unit is used for determining the correct prediction probability corresponding to each marked log according to the cross entropy loss; judging whether a correct prediction probability larger than a preset probability threshold exists or not; if so, deleting the first probability distribution corresponding to the correct prediction probability which is greater than the probability threshold value, and continuing to iterate the log anomaly detection model, otherwise, directly iterating the log anomaly detection model, and updating the model training parameters after the log anomaly detection model is iterated;
the model generation unit is used for calculating the sum of the cross entropy loss and the consistency loss to obtain a corresponding final loss value and judging whether the final loss value is smaller than a preset final loss threshold value or not; and if the final loss value is smaller than the final loss threshold value, the abnormal level training model set is converged and stops iteration to obtain a log abnormal detection model.
Optionally, in a fifth implementation manner of the second aspect of the present invention, the calculation formula of the correct prediction probability is:
P = −Σ_i p(y_i)·log(q(y_i)),
and the probability threshold used for screening the marked logs is:
η_t = α_t·(1 − 1/K) + 1/K,
wherein p(y_i) is the real probability of the i-th abnormal level of the marked log, q(y_i) is the predicted probability of the i-th abnormal level of the marked log, η_t is the probability threshold, α_t is a growth coefficient, K is the number of the abnormal level categories, t is the current iteration number, and T is the preset total iteration number;
when the data amount in the marked log is smaller than a preset normal data amount range, α_t = exp((t/T − 1)·5);
when the data amount in the marked log is larger than the normal data amount range, α_t = 1 − exp(−(t/T)·5).
Optionally, in a sixth implementation manner of the second aspect of the present invention, the detection module further includes:
the system comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring a log to be detected of a system to be detected, wherein the log to be detected comprises a plurality of pieces of log information, and the log information comprises identification information of system operation management priority;
the detection unit is used for inputting the log to be detected into the log abnormity detection model for detection and predicting the abnormity grade of the log to be detected through the log abnormity detection model;
the screening unit is used for screening the logs to be detected with the abnormal grade higher than a preset abnormal grade threshold value, and determining log information with the priority higher than a preset priority threshold value from the screened logs to be detected according to the identification information;
and the analysis result generation unit is used for highlighting the log information with the priority level larger than a preset priority threshold value, and taking the abnormal grade corresponding to the highlighted log information and other log information except the highlighted log information as the analysis result of the current system operation state.
A third aspect of the present invention provides a system abnormality detection device, including: a memory having instructions stored therein and at least one processor, the memory and the at least one processor being interconnected by a line; the at least one processor invokes the instructions in the memory to cause the system abnormality detection device to perform the system abnormality detection method described above.
A fourth aspect of the present invention provides a computer-readable storage medium having stored therein instructions, which, when run on a computer, cause the computer to execute the above-described system abnormality detection method.
In the technical scheme provided by the invention, a marked log, a non-marked log and an expanded log of a system to be detected are obtained and are respectively input into three identical abnormal grade training models in an abnormal grade training model set for training, and the probability distribution of each abnormal grade of the three abnormal grade training models is output; then calculating cross entropy loss and consistency loss output by the abnormal level training model; predicting the abnormal levels of the unmarked logs and the extended logs according to the consistency loss, and iterating the abnormal level training model set according to the cross entropy loss until the abnormal level training model set is converged to obtain a log abnormal detection model; and finally, detecting the abnormal logs in the system operation through a log abnormality detection model. And the model training mode is optimized, model overfitting is prevented, and the difficulty in detecting abnormal points in the system by the detection model is reduced.
Drawings
FIG. 1 is a schematic diagram of a first embodiment of the system abnormality detection method of the present invention;
FIG. 2 is a schematic diagram of a system abnormality detection method according to a second embodiment of the present invention;
FIG. 3 is a schematic diagram of a system abnormality detection method according to a third embodiment of the present invention;
FIG. 4 is a schematic diagram of a fourth embodiment of the system abnormality detection method according to the present invention;
FIG. 5 is a schematic diagram of an embodiment of the system abnormality detection apparatus of the present invention;
FIG. 6 is a schematic diagram of another embodiment of the system abnormality detection apparatus of the present invention;
fig. 7 is a schematic diagram of an embodiment of the system abnormality detection apparatus of the present invention.
Detailed Description
The embodiment of the invention provides a system anomaly detection method, a device, equipment and a storage medium, which comprises the steps of obtaining a marked log, a non-marked log and an expanded log of a system to be detected, respectively inputting the marked log, the non-marked log and the expanded log into three same anomaly level training models in an anomaly level training model set for training, and outputting probability distribution of each anomaly level of the marked log, the non-marked log and the expanded log; then calculating cross entropy loss and consistency loss output by the abnormal level training model; predicting the abnormal levels of the unmarked logs and the extended logs according to the consistency loss, and iterating the abnormal level training model set according to the cross entropy loss until the abnormal level training model set is converged to obtain a log abnormal detection model; and finally, detecting the abnormal logs in the system operation through a log abnormality detection model. And the model training mode is optimized, model overfitting is prevented, and the difficulty in detecting abnormal points in the system by the detection model is reduced.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," or "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
For convenience of understanding, a specific flow of the embodiment of the present invention is described below, and referring to fig. 1, a first embodiment of the system anomaly detection method in the embodiment of the present invention includes:
101. acquiring a marked log and a non-marked log of a system to be detected, and expanding the non-marked log to obtain an expanded log;
it is to be understood that the execution subject of the present invention may be a system abnormality detection apparatus, and may also be a terminal or a server, which is not limited herein. The embodiment of the present invention is described by taking a server as an execution subject. It is emphasized that, in order to further ensure the privacy and security of the marked log, the unmarked log and the extended log, the marked log, the unmarked log and the extended log may also be stored in a node of a block chain.
In this embodiment, a log generated during past operation of the system may be acquired from the system memory, where the log is text information for recording a system state and an operation state, and the content includes a timestamp and text information indicating a transmission content.
In the practical application scenario of this embodiment, a small number of marked logs are used to predict a large number of unmarked logs and obtain the abnormal levels of the unmarked logs. Logs generated by the system in the past are obtained as unmarked logs, and a small number of logs are then screened from them and marked with abnormal levels to serve as marked logs. On the other hand, during model training the number of normal samples is significantly higher than the number of abnormal samples, so in order to keep the positive and negative training samples balanced and to stabilize model training, the unmarked logs need to be expanded to increase the number of negative samples; the expanded logs are obtained by replacing the key words describing the system running state in the unmarked logs with words having the same semantics.
Preferably, the abnormal unmarked logs can be expanded by back-translation and TF-IDF (term frequency-inverse document frequency) based word replacement. First, the importance of each field in an unmarked log is evaluated with TF-IDF; specifically, the fields with a high occurrence frequency in the unmarked log are taken as the key fields of that log. The key fields of different unmarked logs are then classified by semantics using DBpedia prior knowledge to obtain key fields of several categories, and the key fields are replaced with synonyms having the same semantics. Finally, the other, non-key fields are processed by replacement, deletion, insertion, exchange and similar operations. In this way the number of abnormal unmarked logs is expanded while the semantics of the abnormal information content are preserved, giving the expanded logs.
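The expansion step can be sketched as below. This is a minimal illustration, not the patent's implementation: it assumes a whitespace-tokenized log, per-log TF-IDF scoring and a hand-built synonym table, and the helper names are hypothetical.

```python
import math
import random
from collections import Counter

def tf_idf_scores(log_fields, corpus):
    """Score each field of one unmarked log: term frequency within the log
    times inverse document frequency over the whole unmarked-log corpus."""
    counts = Counter(log_fields)
    n_logs = len(corpus)
    scores = {}
    for field, c in counts.items():
        tf = c / len(log_fields)
        df = sum(1 for other in corpus if field in other)
        scores[field] = tf * math.log10(n_logs / max(df, 1))
    return scores

def expand_log(log_fields, corpus, synonyms, threshold=0.4, n_variants=3):
    """Build expanded logs: swap the key field for a synonym with the same
    semantics, then apply one random replace/delete/insert/swap operation
    to the remaining (non-key) fields."""
    scores = tf_idf_scores(log_fields, corpus)
    key = max(scores, key=scores.get)
    other_idx = [i for i, f in enumerate(log_fields) if f != key]
    if scores[key] < threshold or key not in synonyms or not other_idx:
        return []                                   # no usable key field
    variants = []
    for _ in range(n_variants):
        fields = list(log_fields)
        fields[fields.index(key)] = random.choice(synonyms[key])  # keep semantics
        op = random.choice(["replace", "delete", "insert", "swap"])
        i = random.choice(other_idx)
        if op == "replace":
            fields[i] = random.choice(fields)       # reuse another field
        elif op == "delete" and len(fields) > 2:
            del fields[i]
        elif op == "insert":
            fields.insert(i, random.choice(fields))
        else:                                       # swap two non-key fields
            j = random.choice(other_idx)
            fields[i], fields[j] = fields[j], fields[i]
        variants.append(" ".join(fields))
    return variants
```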
102. Respectively inputting the marked logs, the unmarked logs and the expanded logs into three identical abnormal grade training models for training, and correspondingly outputting a first probability distribution of each abnormal grade of the marked logs, a second probability distribution of each abnormal grade of the unmarked logs and a third probability distribution of each abnormal grade of the expanded logs, wherein the three identical abnormal grade training models form an abnormal grade training model set;
in this embodiment, the abnormal class training model set is formed by stacking three identical abnormal class training models, and the training process of each abnormal class training model specifically includes:
uniformly adjusting the length of each log data in the marked log, the unmarked log and the expanded log to a preset length, and constructing a corresponding data vector;
determining the characteristic dimension of the data vector according to the length of the data vector, and extracting semantic features of the data vector according to the characteristic dimension to obtain initial semantic features;
and screening and combining the prominent features of the initial semantic features to obtain final semantic features, and calculating and outputting the probability distribution of the abnormal levels of the marked logs, the unmarked logs and the expanded logs according to the final semantic features.
In this embodiment, the training mode in which the marked log is input into its abnormal level training model for training belongs to supervised learning, while the training mode in which the unmarked log and the extended log are input into their respective abnormal level training models for training belongs to unsupervised learning. It should be noted that, according to the feature distribution of the different fields of the system log, the probabilities of the different abnormal levels are output, and the abnormal level with the highest probability is finally selected as the abnormal level of the system log, instead of the abnormal level being output directly.
Preferably, Text-CNN (Text-Convolutional Neural Network) is used herein to train the abnormal level training model corresponding to the labeled log in a supervised learning manner and the abnormal level training model corresponding to the unlabeled log and the augmented data in an unsupervised learning manner. The method specifically comprises the following steps:
the input layer adjusts the text vocabularies of the marked logs, unmarked logs or extended logs input into the Text-CNN model to the same length L and obtains a word vector for each text vocabulary;
the convolution layer, taking the number of abnormal level categories as the dimension of the abnormal level training model, uses a plurality of convolution kernels of different sizes to extract the feature vocabularies of the word vectors that describe the level abnormality;
the pooling layer combines the different feature vocabularies obtained by the convolution layer through Max-pooling (maximum pooling) to serve as the classification features of the different system logs;
and the fully connected layer inputs the classification features into an LR (Logistic Regression) classifier for classification; for example, if the abnormal levels in the set output rules include major abnormality, common abnormality, slight abnormality and normal, the probability of each abnormal level is output for the different system logs.
And finally, according to the abnormal grade probabilities output by the model, taking the abnormal grade with the highest probability as a prediction result of the abnormal grade of the current system log. For example, in the anomaly level of the unmarked log A of the input Text-CNN, the probabilities of major anomaly, common anomaly, slight anomaly and normal are respectively [0.5,0.2,0.2 and 0.1], and then the anomaly level of the unmarked log A is predicted to be the major anomaly through the model.
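A compact sketch of the Text-CNN classifier described above, written with PyTorch as an assumed framework (the patent does not name one); the vocabulary size, embedding size, kernel sizes and the four anomaly levels are illustrative values rather than figures from the source.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    """Input layer -> convolution kernels of several sizes -> max pooling ->
    fully connected classifier producing softmax probabilities over K levels."""
    def __init__(self, vocab_size, embed_dim=128, kernel_sizes=(2, 3, 4),
                 n_filters=64, n_levels=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, n_filters, k) for k in kernel_sizes)
        self.fc = nn.Linear(n_filters * len(kernel_sizes), n_levels)

    def forward(self, token_ids):                     # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)     # (batch, embed_dim, seq_len)
        feats = []
        for conv in self.convs:
            c = F.relu(conv(x))                       # (batch, n_filters, L')
            feats.append(F.max_pool1d(c, c.shape[-1]).squeeze(-1))
        logits = self.fc(torch.cat(feats, dim=1))     # combine pooled features
        return F.softmax(logits, dim=1)               # probability per anomaly level

# Usage: probs = TextCNN(vocab_size=5000)(token_ids) with token_ids of shape
# (batch, seq_len); a row like [0.5, 0.2, 0.2, 0.1] yields "major anomaly" via argmax.
```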
103. Calculating a cross-entropy loss of the first probability distribution corresponding to a preset abnormal level marker of the marked log, and calculating a consistency loss between the second probability distribution and the third probability distribution;
in this embodiment, the cross entropy loss represents a difference value between the prediction of the first probability distribution of the marked log and the real level thereof, and the consistency loss represents a difference value between the unmarked log and the corresponding extended log. Finally, the unmarked log to be detected can be directly input, and the abnormal grade of the log can be directly predicted.
Specifically, the abnormal level training models corresponding to the unmarked logs and to the extended logs are stacked and consistency training is performed, the iteration of the abnormal level training model set being controlled through a consistency loss function. As training iterates, the features of an unmarked log and of its corresponding extended logs become more concentrated, their similarity increases and the similarity distance between the two model outputs shrinks, so the corresponding consistency loss decreases; when the similarity distance falls below a preset threshold value, the corresponding abnormal level labels are propagated to the extended logs, so that the abnormal level of the unmarked logs is obtained. The consistency loss is calculated by the following function:
L_con(θ) = E_{x∈U} [ D_KL( p_θ(y|x) ‖ p_θ(y|x̂) ) ],
wherein U is the set of unmarked logs, x̂ is the extended log generated from the unmarked log x, p_θ(y|x) is the second probability distribution and p_θ(y|x̂) is the third probability distribution.
In addition, the difference between the abnormal grade probability distribution output by the abnormal grade training model corresponding to the marked log and the actual abnormal grade is measured by cross entropy loss, calculated by the following function:
L_sup(θ) = −E_{(x, y*)∈M} [ I( p_θ(y*|x) < η_t ) · log p_θ(y*|x) ],
wherein M is the set of marked logs and p_θ(y*|x) represents the probability that the marked log x is correctly predicted as its real abnormal level y*; when training reaches step t, any marked data whose p_θ(y*|x) is greater than the threshold η_t is removed from the loss function. Here,
η_t = α_t·(1 − 1/K) + 1/K,
and in the normal case α_t = t/T, where α_t is the growth coefficient, K is the number of abnormal level categories and T is the total number of training steps.
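The two losses and the threshold-based screening can be put together as in the sketch below. It follows the consistency-training formulation described above, but the exact KL form of the consistency term, the indicator masking and the λ weighting are stated here as assumptions, not text taken verbatim from the patent.

```python
import torch
import torch.nn.functional as F

def supervised_loss(probs_marked, true_levels, eta_t):
    """Cross entropy over marked logs; samples whose correct-class probability
    already exceeds the threshold eta_t are removed from the loss (annealing).
    true_levels is a LongTensor of class indices."""
    p_correct = probs_marked.gather(1, true_levels.unsqueeze(1)).squeeze(1)
    keep = (p_correct < eta_t).float()                  # drop confident samples
    ce = -torch.log(p_correct.clamp_min(1e-12))
    return (keep * ce).sum() / keep.sum().clamp_min(1.0)

def consistency_loss(probs_unmarked, probs_expanded):
    """KL divergence between the prediction on an unmarked log and on its
    expansion; the unmarked-side distribution is treated as a fixed target."""
    target = probs_unmarked.detach()
    return F.kl_div(torch.log(probs_expanded.clamp_min(1e-12)),
                    target, reduction="batchmean")

def final_loss(probs_marked, true_levels, probs_unmarked, probs_expanded,
               eta_t, lam=1.0):
    """Final loss = cross entropy loss + lam * consistency loss; training stops
    once this value falls below the preset final-loss threshold."""
    return (supervised_loss(probs_marked, true_levels, eta_t)
            + lam * consistency_loss(probs_unmarked, probs_expanded))
```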
104. predicting abnormal grade marks of the unmarked logs and the extended logs according to the consistency loss, and iterating the abnormal grade training model set according to the cross entropy loss until the abnormal grade training model set is converged to obtain a log abnormal detection model;
In this embodiment, the consistency loss and the cross entropy loss are combined to evaluate the log anomaly detection model, i.e.
min_θ J(θ) = L_sup(θ) + λ·L_con(θ),
wherein θ is the preset model parameter and J(θ) is the final loss; λ is used to balance the consistency loss and the cross entropy loss, and when the final loss is smaller than a preset threshold value, the log anomaly detection model stops iterating.
When the consistency loss is smaller than a preset consistency loss threshold value, the predicted abnormal level marks of the unmarked logs and the expanded logs can be considered credible; when the cross entropy loss is smaller than the preset cross entropy loss threshold, the probability distributions of each abnormal level output in step 102 can be considered credible; and when the final loss obtained by adding the cross entropy loss and the consistency loss is smaller than the final loss threshold, the output result of the whole abnormal level training model set can be considered credible.
105. And acquiring a log to be detected of the system to be detected, inputting the log to be detected into the log abnormity detection model for detection, outputting an abnormity grade corresponding to the log to be detected, and taking the abnormity grade corresponding to the log to be detected as an analysis result of the current system operation state.
In this embodiment, the logs to be detected can be obtained from one or more systems. Different operating states of different systems are managed by priority, and operating conditions in which a major abnormality is likely to occur receive particular attention during monitoring; for a high-priority abnormal log, once a major abnormality occurs, emergency measures need to be taken in time to respond quickly, locate the specific cause of the fault and remove it. Therefore, the log to be detected carries priority identification information; when the abnormality level output by the log abnormality detection model is high, it is judged whether the log to be detected carries a high-priority identification, and if so, the log is highlighted when it is recorded in the analysis result and, if necessary, an alarm is raised.
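A short sketch of this detection stage: each log is scored by the trained model, logs above the anomaly-level threshold are kept, and among those the entries whose priority identifier exceeds the priority threshold are flagged for highlighting or alerting. The record fields, thresholds and level ordering are illustrative, and the model is assumed to expose the interface of the Text-CNN sketch above.

```python
def analyze_logs(model, logs, level_threshold=2, priority_threshold=7):
    """Return (level, highlight) per log; `logs` holds dicts with the raw text,
    its token ids and a 'priority' identifier carried in the log information."""
    results = []
    for entry in logs:
        probs = model(entry["token_ids"].unsqueeze(0))        # 1 x K level probabilities
        level = int(probs.argmax(dim=1)) + 1                  # 1=major ... 4=normal, as in the example
        highlight = (level <= level_threshold                 # severe enough
                     and entry["priority"] > priority_threshold)
        if highlight:
            print(f"[ALERT] priority {entry['priority']}: {entry['text']}")
        results.append({"text": entry["text"], "level": level,
                        "highlight": highlight})
    return results                                            # analysis of the system state
```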
In the embodiment of the invention, the marked logs and the unmarked logs are obtained, the unmarked logs are expanded to obtain expanded logs, and the expanded logs are respectively input into three corresponding abnormal grade training models in an abnormal grade training model set for training so as to predict the probability distribution of each abnormal grade of the three abnormal grade training models; the abnormal grade training model set is subjected to iterative training in a mode of gradually reducing the marking information, and the finally generated abnormal log detection model can be used for predicting the abnormal grade corresponding to the log to be detected generated in the system operation process to obtain the analysis result of the system operation state, so that the over-fitting resisting strength of the model is improved, and the detection difficulty of the model is reduced.
Referring to fig. 2, a second embodiment of the system anomaly detection method according to the embodiment of the present invention includes:
201. acquiring a marked log and a non-marked log of a system to be detected;
202. analyzing the unmarked log to obtain a plurality of log fields with different semantics;
This embodiment mainly describes expanding the unmarked log to obtain the expanded log. The unmarked log can be a system log generated while the system is in an abnormal state; the log content includes time, session identification, function identification, refined content and other information such as the system version number, the thread number and the log level (for example DEBUG, INFO, WARN or ERROR). The fields with different semantics in the unmarked log are parsed to obtain a plurality of semantic fields; obviously, the log level in the log content is a target key field to be obtained.
203. Screening key fields related to abnormal levels from the log fields according to preset semantic structure prior knowledge and the occurrence frequency of the log fields;
in the embodiment, preset semantic structure prior knowledge is used for associating key fields with the same semantics in each unmarked log, wherein the same semantics refer to expressed content meanings with the same abnormal level; then, the occurrence frequency of each semantic field in the same unmarked log is counted, the occurrence frequency of each semantic field in all unmarked logs is counted, the product of the two is calculated, and according to the calculation result and the set threshold value, which fields are key fields, namely, the fields which represent the abnormal level of the unmarked logs can be screened.
Preferably, a TF-IDF (Term Frequency-Inverse Document Frequency) technique may be used to determine the key field. If the unmarked log is a system log containing 100 fields and field a occurs 15 times, the TF of the field is 15/100 = 0.15; if 100,000 system logs are used for training and field a appears in 100 of them, the IDF of field a is lg(100000/100) = 3, so the TF-IDF of field a is 0.15 x 3 = 0.45. If the TF-IDF threshold for a key field is set to 0.4, field a is determined to be a key field.
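The arithmetic of this example in a few lines, reading the counts as 15 occurrences in a 100-field log and 100 out of 100,000 training logs so that the stated IDF of 3 and score of 0.45 hold:

```python
import math

tf = 15 / 100                        # field a: 15 occurrences in a 100-field log
idf = math.log10(100_000 / 100)      # 100,000 training logs, field a present in 100 of them
tf_idf = tf * idf
print(round(idf, 2), round(tf_idf, 2))   # 3.0 0.45 -> exceeds the 0.4 key-field threshold
```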
204. Acquiring one or more synonymous fields corresponding to the key fields, and replacing the corresponding key fields with the synonymous fields;
205. according to a random field processing strategy, splicing the synonymous field with other log fields except the key field to obtain a plurality of corresponding expansion logs, wherein the random field processing strategy comprises replacing, deleting, inserting or exchanging the other log fields;
in this embodiment, after the key fields in the unmarked log are confirmed, the unmarked log is extended in a retranslation manner. Firstly, the content expression of the unmarked log is required to be kept the same, the unmarked log is realized through other multiple synonymous fields, and then the difference between the whole content of the expanded log and the unmarked log is required to be ensured, so that other semantic fields except the key fields are required to be processed, including modes of replacement, deletion, insertion, exchange and the like. And then splicing the processed key field with other fields to obtain a plurality of expansion logs with the same meaning and different contents.
206. Respectively inputting the marked logs, the unmarked logs and the expanded logs into three identical abnormal grade training models for training, and correspondingly outputting a first probability distribution of each abnormal grade of the marked logs, a second probability distribution of each abnormal grade of the unmarked logs and a third probability distribution of each abnormal grade of the expanded logs, wherein the three identical abnormal grade training models form an abnormal grade training model set;
207. calculating a cross-entropy loss of the first probability distribution corresponding to a preset abnormal level marker of the marked log, and calculating a consistency loss between the second probability distribution and the third probability distribution;
208. predicting abnormal grade marks of the unmarked logs and the extended logs according to the consistency loss, and iterating the abnormal grade training model set according to the cross entropy loss until the abnormal grade training model set is converged to obtain a log abnormal detection model;
209. and acquiring a log to be detected of the system to be detected, inputting the log to be detected into the log abnormity detection model for detection, outputting an abnormity grade corresponding to the log to be detected, and taking the abnormity grade corresponding to the log to be detected as an analysis result of the current system operation state.
In the embodiment of the invention, unmarked logs generated under the condition of system abnormity are introduced to be expanded, the number of the abnormal unmarked logs is increased under the condition of ensuring the data difference, the over-fitting resisting capability of the model is increased in the following training process of the detection model, and the difficulty of the training of the detection model is reduced.
Referring to fig. 3, a third embodiment of the system anomaly detection method according to the embodiment of the present invention includes:
301. acquiring a marked log and a non-marked log of a system to be detected, and expanding the non-marked log to obtain an expanded log;
302. respectively inputting the marked logs, the unmarked logs and the expanded logs into three identical abnormal grade training models for training, and correspondingly outputting a first probability distribution of each abnormal grade of the marked logs, a second probability distribution of each abnormal grade of the unmarked logs and a third probability distribution of each abnormal grade of the expanded logs, wherein the three identical abnormal grade training models form an abnormal grade training model set;
303. calculating the correct prediction probability of the abnormal grade of each marked log according to the first probability distribution and the preset abnormal grade mark of the marked log;
In this embodiment, the correct prediction probability is calculated from the abnormal level probability distribution predicted by the abnormal level training model for the marked log and the real abnormal level of the marked log, using the following formula:
P = −Σ_i p(y_i)·log(q(y_i)),
wherein p(y_i) is the real probability of the i-th abnormal level of the marked log and q(y_i) is the predicted probability of the i-th abnormal level of the marked log. It should be noted that the probability of the real abnormal level is 1 and the probabilities of the other abnormal levels are 0, so the formula reduces to −log(q(y_i)) for the real level i.
Specifically, if the abnormality levels major anomaly, common anomaly, slight anomaly and normal are represented by 1, 2, 3 and 4 respectively, the abnormal level probability distribution Z of a marked log assigns probabilities [0.5, 0.2, 0.2, 0.1] to levels 1, 2, 3 and 4, and the real abnormal level of that marked log is a major anomaly, then the corresponding correct prediction probability is −1 x log(0.5) ≈ 0.301.
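The same calculation in code form, using the base-10 logarithm that the example's value of 0.301 implies:

```python
import math

probs = {"major": 0.5, "common": 0.2, "slight": 0.2, "normal": 0.1}  # predicted distribution Z
true_level = "major"                                                 # real anomaly level of the marked log
correct_pred = -math.log10(probs[true_level])                        # -1 * log10(0.5)
print(round(correct_pred, 3))                                        # 0.301
```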
304. Calculating cross entropy loss of the first probability distribution according to preset model training parameters and the correct prediction probability, so as to measure the difference between the abnormal grade prediction of the labeled log by a classification model and the real abnormal grade of the labeled log;
in this embodiment, the cross entropy loss of the first probability distribution can be obtained by accumulating and averaging the correct prediction probabilities corresponding to all the first probability distributions. The classification accuracy of the abnormal grade training model corresponding to the marked log can be evaluated through the cross entropy loss, namely the quantitative difference index between the classification result and the real result.
305. Calculating a loss of consistency between the second probability distribution and the third probability distribution;
306. predicting the abnormal grades of the unmarked logs and the extended logs according to the consistency loss, and iterating the abnormal grade training model set according to the cross entropy loss until the abnormal grade training model set is converged to obtain a log abnormal detection model;
307. and acquiring a log to be detected of the system to be detected, inputting the log to be detected into the log abnormity detection model for detection, outputting an abnormity grade corresponding to the log to be detected, and taking the abnormity grade corresponding to the log to be detected as an analysis result of the current system operation state.
In the embodiment of the invention, the cross entropy loss of the first probability distribution is calculated to be used for calculating the final loss by subsequently combining the consistency loss, and the abnormal log detection model is evaluated to be used as one of indexes for measuring the abnormal log detection model.
Referring to fig. 4, a fourth embodiment of the system anomaly detection method according to the embodiment of the present invention includes:
401. acquiring a marked log and a non-marked log of a system to be detected, and expanding the non-marked log to obtain an expanded log;
402. respectively inputting the marked logs, the unmarked logs and the expanded logs into three identical abnormal grade training models for training, and correspondingly outputting a first probability distribution of each abnormal grade of the marked logs, a second probability distribution of each abnormal grade of the unmarked logs and a third probability distribution of each abnormal grade of the expanded logs, wherein the three identical abnormal grade training models form an abnormal grade training model set;
403. calculating a cross-entropy loss between the first probability distribution and a preset anomaly level of the marked log, and calculating a consistency loss between the second probability distribution and the third probability distribution;
404. determining the correct prediction probability corresponding to each marked log according to the cross entropy loss, and judging whether the correct prediction probability larger than a preset probability threshold exists or not;
in this embodiment, in the iterative training process of the log anomaly detection model, labeled logs of labels need to be deleted step by step, and overfitting of the model is prevented in a training signal annealing manner, so that the generalization capability of the model is increased. And when the correct prediction probability is larger than the set probability threshold, deleting the marked log corresponding to the correct prediction probability.
Specifically, when the amount of marked data in the marked log is normal, the probability threshold is calculated as:
η_t = (t/T)·(1 − 1/K) + 1/K.
When the amount of marked data in the marked log is small, the model easily overfits and can make high-confidence predictions on the data within a short time, so the probability threshold calculation is converted into:
η_t = exp((t/T − 1)·5)·(1 − 1/K) + 1/K,
which reduces the growth rate of the threshold so that more invalid samples are removed. When the amount of marked data in the marked log is large, the model is hard to overfit and takes longer to converge, so it outputs fewer high-confidence predictions in the same time and fewer samples need to be deleted; the probability threshold calculation can therefore be converted into:
η_t = (1 − exp(−(t/T)·5))·(1 − 1/K) + 1/K,
which increases the growth rate of the threshold so that fewer samples are deleted.
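The three threshold schedules can be written as below; the exact exponential forms are assumptions in the spirit of the training-signal annealing described here (the text only fixes the qualitative slow/normal/fast growth behaviour), with K anomaly-level categories, current step t and total steps T.

```python
import math

def eta_threshold(t, T, K, labeled_data="normal"):
    """Probability threshold eta_t = alpha_t * (1 - 1/K) + 1/K, where the
    growth coefficient alpha_t rises slowly when marked data is scarce (more
    samples removed) and quickly when it is abundant (fewer samples removed)."""
    progress = t / T
    if labeled_data == "small":        # scarce marks: slow growth, prune more
        alpha = math.exp((progress - 1) * 5)
    elif labeled_data == "large":      # abundant marks: fast growth, prune less
        alpha = 1 - math.exp(-progress * 5)
    else:                              # normal amount: linear growth
        alpha = progress
    return alpha * (1 - 1 / K) + 1 / K

# Example: 4 anomaly levels, halfway through training
print(round(eta_threshold(t=50, T=100, K=4, labeled_data="small"), 3))   # ≈ 0.312
```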
405. If so, deleting the first probability distribution corresponding to the correct prediction probability which is greater than the probability threshold value, and continuing to iterate the log anomaly detection model, otherwise, directly iterating the log anomaly detection model, and updating the model training parameters after the log anomaly detection model is iterated;
in this embodiment, iteration is performed on the log anomaly detection model in a training signal annealing manner, and labeled logs which easily cause overfitting of the model are deleted step by step until the final loss is smaller than a set threshold, so that it can be confirmed that the log anomaly detection model can be used in detection practice.
406. Calculating the sum of the cross entropy loss and the consistency loss to obtain a corresponding final loss value, and judging whether the final loss value is smaller than a preset final loss threshold value or not;
407. if the final loss value is smaller than the final loss threshold value, the abnormal level training model set is converged and iteration is stopped, and a log abnormal detection model is obtained;
in this embodiment, the cross entropy loss and the consistency loss are combined to evaluate the correct prediction probability of the log anomaly detection model, which is used as a criterion for model iteration, and here, only the two are added, that is:
Figure BDA0002561847440000114
408. and acquiring a log to be detected of the system to be detected, inputting the log to be detected into the log abnormity detection model for detection, outputting an abnormity grade corresponding to the log to be detected, and taking the abnormity grade corresponding to the log to be detected as an analysis result of the current system operation state.
In the embodiment of the invention, in the iterative training process of the log anomaly detection model, the first probability distribution of training is gradually deleted along with the increase of unmarked data, and the overfitting risk can be effectively resisted by the training signal annealing method.
With reference to fig. 5, the system anomaly detection method in the embodiment of the present invention is described above, and a system anomaly detection apparatus in the embodiment of the present invention is described below, where an embodiment of the system anomaly detection apparatus in the embodiment of the present invention includes:
an obtaining module 501, configured to obtain a marked log and a unmarked log of a system to be detected, and expand the unmarked log to obtain an expanded log;
a training module 502, configured to input the marked log, the unmarked log, and the extended log into three identical abnormal level training models for training, and output a first probability distribution of each abnormal level of the marked log, a second probability distribution of each abnormal level of the unmarked log, and a third probability distribution of each abnormal level of the extended log, where the three identical abnormal level training models form an abnormal level training model set;
a calculating module 503, configured to calculate a cross entropy loss between the first probability distribution and a preset abnormal level flag of the flag log, and calculate a consistency loss between the second probability distribution and the third probability distribution;
a generating module 504, configured to predict an abnormal level flag of the unmarked log and the extended log according to the consistency loss, and iterate the abnormal level training model set according to the cross entropy loss until the abnormal level training model set converges to obtain a log abnormality detection model;
the detection module 505 is configured to obtain a log to be detected of the system to be detected, input the log to be detected into the log anomaly detection model for detection, output an anomaly level corresponding to the log to be detected, and use the anomaly level corresponding to the log to be detected as an analysis result of the current system operation state.
In the embodiment of the invention, a marked log, a non-marked log and an expanded log of a system to be detected are obtained and are respectively input into three same abnormal grade training models in an abnormal grade training model set for training, and the probability distribution of each abnormal grade of the three is output; then calculating cross entropy loss and consistency loss output by the abnormal level training model; predicting the abnormal levels of the unmarked logs and the extended logs according to the consistency loss, and iterating the abnormal level training model set according to the cross entropy loss until the abnormal level training model set is converged to obtain a log abnormal detection model; and finally, detecting the abnormal logs in the system operation through a log abnormality detection model. And the model training mode is optimized, model overfitting is prevented, and the difficulty in detecting abnormal points in the system by the detection model is reduced.
Referring to fig. 6, another embodiment of the system abnormality detection apparatus according to the embodiment of the present invention includes:
an obtaining module 501, configured to obtain a marked log and an unmarked log of a system to be detected, and expand the unmarked log to obtain an expanded log;
a training module 502, configured to input the marked log, the unmarked log, and the extended log into three identical abnormal level training models for training, and output a first probability distribution of each abnormal level of the marked log, a second probability distribution of each abnormal level of the unmarked log, and a third probability distribution of each abnormal level of the extended log, where the three identical abnormal level training models form an abnormal level training model set;
a calculating module 503, configured to calculate a cross entropy loss between the first probability distribution and a preset abnormal level mark of the marked log, and calculate a consistency loss between the second probability distribution and the third probability distribution;
a generating module 504, configured to predict abnormal level marks of the unmarked log and the extended log according to the consistency loss, and iterate the abnormal level training model set according to the cross entropy loss until the abnormal level training model set converges to obtain a log anomaly detection model;
the detection module 505 is configured to obtain a log to be detected of the system to be detected, input the log to be detected into the log anomaly detection model for detection, output an anomaly level corresponding to the log to be detected, and use the anomaly level corresponding to the log to be detected as an analysis result of the current system operation state.
Specifically, the obtaining module 501 is further configured to:
analyzing the unmarked log to obtain a plurality of log fields with different semantics;
screening key fields related to abnormal levels from the log fields according to preset semantic structure prior knowledge and the occurrence frequency of the log fields;
acquiring one or more synonymous fields corresponding to the key fields, and replacing the corresponding key fields with the synonymous fields;
and splicing the synonymous fields and the other log fields except the key fields according to a random field processing strategy to obtain a plurality of corresponding expanded logs, wherein the random field processing strategy comprises replacing, deleting, inserting or exchanging the other log fields (a minimal sketch of this expansion follows below).
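As an illustration of this expansion strategy, here is a minimal Python sketch. The synonym table, the field values and the particular random operations (keep/delete/duplicate plus a swap) are hypothetical stand-ins for the patent's synonym replacement and random field processing, not its actual implementation.

```python
import random

# Hypothetical synonym table for key fields related to abnormal levels.
SYNONYMS = {
    "error": ["failure", "fault"],
    "timeout": ["time-out", "deadline exceeded"],
}

def expand_log(fields, num_copies=3):
    """Generate several expanded logs from one parsed, unmarked log.

    `fields` is the list of semantic fields produced by log parsing.
    Key fields found in SYNONYMS are replaced by a synonym; the remaining
    fields are randomly kept, deleted or duplicated, then two are swapped.
    """
    expanded = []
    for _ in range(num_copies):
        new_fields = []
        for f in fields:
            if f in SYNONYMS:                      # key field: synonym replacement
                new_fields.append(random.choice(SYNONYMS[f]))
            else:                                  # other field: random operation
                op = random.choice(["keep", "delete", "duplicate"])
                if op == "keep":
                    new_fields.append(f)
                elif op == "duplicate":
                    new_fields.extend([f, f])
                # "delete" simply drops the field
        # randomly swap two fields to add further variation
        if len(new_fields) > 2:
            i, j = random.sample(range(len(new_fields)), 2)
            new_fields[i], new_fields[j] = new_fields[j], new_fields[i]
        expanded.append(" ".join(new_fields))
    return expanded

print(expand_log(["service", "timeout", "after", "30s"]))
```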
Specifically, the training module 502 further includes:
a constructing unit 5021, configured to uniformly adjust the length of each piece of log data in the marked log, the unmarked log, and the extended log to a preset length, and construct corresponding data vectors;
a feature extraction unit 5022, configured to determine a feature dimension of the data vector according to the length of the data vector, and perform semantic feature extraction on the data vector according to the feature dimension to obtain an initial semantic feature;
and a probability distribution generating unit 5023, configured to screen and combine salient features from the initial semantic features to obtain final semantic features, and calculate and output the probability distributions of the abnormal levels of the marked logs, the unmarked logs and the extended logs according to the final semantic features.
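The patent does not name a specific network, so the PyTorch sketch below is only one plausible arrangement of the three units (fixed-length data vector, semantic feature extraction, salient-feature pooling and per-level probabilities). The vocabulary size, embedding dimension, convolutional extractor and max-pooling are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AnomalyLevelModel(nn.Module):
    """One of the three identical abnormal-level training models (a sketch)."""
    def __init__(self, vocab_size=10000, max_len=64, embed_dim=128, num_levels=4):
        super().__init__()
        self.max_len = max_len
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # semantic feature extraction over the fixed-length data vector
        self.conv = nn.Conv1d(embed_dim, 256, kernel_size=3, padding=1)
        self.classifier = nn.Linear(256, num_levels)

    def forward(self, token_ids):
        # construct a fixed-length data vector (pad or truncate to max_len)
        padded = torch.zeros(token_ids.size(0), self.max_len, dtype=torch.long)
        length = min(token_ids.size(1), self.max_len)
        padded[:, :length] = token_ids[:, :length]
        x = self.embed(padded).transpose(1, 2)                   # (batch, embed_dim, max_len)
        feats = torch.relu(self.conv(x))                         # initial semantic features
        salient = F.adaptive_max_pool1d(feats, 1).squeeze(-1)    # keep salient features
        return F.softmax(self.classifier(salient), dim=-1)       # per-level probabilities

model = AnomalyLevelModel()
probs = model(torch.randint(1, 10000, (2, 40)))   # two toy logs of 40 tokens each
print(probs.shape)                                 # torch.Size([2, 4])
```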
Specifically, the calculating module 503 further includes:
a first calculating unit 5031, configured to calculate, according to the first probability distribution and preset abnormal level marks of the marked logs, a correct prediction probability of an abnormal level of each marked log;
a second calculating unit 5032, configured to calculate, according to preset model training parameters and the correct prediction probability, the cross entropy loss of the first probability distribution, so as to measure the difference between the classification model's predicted abnormal level for the marked log and the true abnormal level of the marked log.
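As a concrete reading of these two units, the sketch below computes the correct prediction probability of each marked log and a cross entropy from it. Taking the cross entropy as the mean negative log of the correct-class probability is an assumption; the exact role of the model training parameters is given only in the patent's formula images.

```python
import torch

def cross_entropy_from_probs(first_probs, true_levels):
    """first_probs: (batch, K) first probability distribution for each marked log.
    true_levels: (batch,) preset abnormal-level marks.
    Returns the per-log correct prediction probability and the mean cross entropy."""
    correct_prob = first_probs.gather(1, true_levels.unsqueeze(1)).squeeze(1)
    loss = -(correct_prob.clamp_min(1e-8)).log().mean()
    return correct_prob, loss

probs = torch.tensor([[0.7, 0.2, 0.1], [0.2, 0.5, 0.3]])
levels = torch.tensor([0, 2])
p_correct, ce = cross_entropy_from_probs(probs, levels)
print(p_correct, ce)
```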
Specifically, the generating module 504 further includes:
an iteration unit 5041, configured to determine, according to the cross entropy loss, the correct prediction probability corresponding to each marked log, and judge whether any correct prediction probability is greater than a preset probability threshold; if so, delete the first probability distributions corresponding to the correct prediction probabilities greater than the probability threshold and continue iterating the log anomaly detection model; otherwise, directly iterate the log anomaly detection model, and update the model training parameters after the log anomaly detection model is iterated;
a model generating unit 5042, configured to calculate the sum of the cross entropy loss and the consistency loss to obtain a corresponding final loss value, and judge whether the final loss value is smaller than a preset final loss threshold; if the final loss value is smaller than the final loss threshold, the abnormal level training model set has converged and stops iterating, yielding the log anomaly detection model.
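The model generating unit's convergence test can be sketched as follows. Using a KL divergence between the second and third probability distributions as the consistency loss is an assumption (the patent only says "consistency loss"), and the final loss threshold is a placeholder value.

```python
import torch
import torch.nn.functional as F

FINAL_LOSS_THRESHOLD = 0.05   # placeholder for the preset final loss threshold

def consistency_loss(second_probs, third_probs):
    # KL divergence between predictions on unmarked logs and on their expansions
    # (one common choice; the patent does not fix the exact measure)
    return F.kl_div(third_probs.clamp_min(1e-8).log(), second_probs,
                    reduction="batchmean")

def has_converged(cross_entropy, second_probs, third_probs):
    final_loss = cross_entropy + consistency_loss(second_probs, third_probs)
    return final_loss.item() < FINAL_LOSS_THRESHOLD, final_loss

second = torch.tensor([[0.6, 0.3, 0.1]])
third = torch.tensor([[0.5, 0.4, 0.1]])
done, loss = has_converged(torch.tensor(0.02), second, third)
print(done, loss)
```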
Specifically, the calculation formula of the correct prediction probability is as follows:
[Formula image: Figure BDA0002561847440000131 (not rendered in this text)]
and
[Formula image: Figure BDA0002561847440000132 (not rendered in this text)]
wherein η_t is a probability threshold, the t-subscripted coefficient is a growth coefficient, K is the number of the abnormal grade categories, t is the current iteration number, and T is the preset total iteration number;
when the data amount in the marked log is smaller than a preset normal data amount range, the formula shown in Figure BDA0002561847440000133 applies;
when the data amount in the marked log is larger than the normal data amount range, the formula shown in Figure BDA0002561847440000134 applies.
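The formulas above are published only as images, but the surrounding description matches the training-signal-annealing idea referred to earlier: a probability threshold η_t that grows with the iteration t, with different growth schedules depending on whether the marked data is scarce or plentiful. Which concrete schedule each image specifies is not recoverable here, so the exponential/logarithmic choice in the sketch below is purely illustrative and is not the patent's exact formula.

```python
import math

def tsa_threshold(t, T, K, labeled_is_scarce):
    """Illustrative training-signal-annealing threshold eta_t.

    t: current iteration, T: total iterations, K: number of abnormal-level classes.
    The growth coefficient alpha_t rises from ~0 to 1, so eta_t rises from ~1/K to 1.
    """
    if labeled_is_scarce:                      # few marked logs: release training signal slowly
        alpha_t = math.exp((t / T - 1.0) * 5.0)
    else:                                      # many marked logs: release training signal quickly
        alpha_t = 1.0 - math.exp(-(t / T) * 5.0)
    return alpha_t * (1.0 - 1.0 / K) + 1.0 / K

def keep_mask(correct_probs, t, T, K, labeled_is_scarce):
    """Marked logs whose correct prediction probability already exceeds eta_t are
    dropped from the cross-entropy loss for this iteration; the rest are kept."""
    eta_t = tsa_threshold(t, T, K, labeled_is_scarce)
    return [p <= eta_t for p in correct_probs]

print(keep_mask([0.20, 0.40, 0.90], t=10, T=100, K=4, labeled_is_scarce=False))
```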
Specifically, the detecting module 505 further includes:
an obtaining unit 5051, configured to obtain a log to be detected of the system to be detected, where the log to be detected includes multiple pieces of log information, and each piece of log information carries identification information indicating its system operation and management priority;
the detection unit 5052 is configured to input the log to be detected into the log abnormality detection model for detection, and predict an abnormality level of the log to be detected through the log abnormality detection model;
a screening unit 5053, configured to screen logs to be detected whose exception level is higher than a preset exception level threshold, and determine log information whose priority is higher than a preset priority threshold from the screened logs to be detected according to the identification information;
the analysis result generation unit 5054 is configured to highlight the log information with the priority greater than the preset priority threshold, and use an abnormal level corresponding to the highlighted log information and other log information except the highlighted log information as an analysis result of the current system operation state.
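The screening and highlighting performed by units 5052 through 5054 can be illustrated as follows; the threshold values, the dictionary field names and the use of a ">>>" marker for highlighting are placeholders, not taken from the patent.

```python
ANOMALY_LEVEL_THRESHOLD = 2     # placeholder for the preset abnormal-level threshold
PRIORITY_THRESHOLD = 5          # placeholder for the preset priority threshold

def analyze(detected_logs):
    """detected_logs: list of dicts holding the model's predicted 'level',
    the log 'text', and the 'priority' identification carried in the log."""
    report = []
    for log in detected_logs:
        if log["level"] <= ANOMALY_LEVEL_THRESHOLD:
            continue                                   # keep only sufficiently abnormal logs
        if log["priority"] > PRIORITY_THRESHOLD:
            report.append(f">>> [level {log['level']}] {log['text']}")   # highlighted
        else:
            report.append(f"    [level {log['level']}] {log['text']}")
    return "\n".join(report)

print(analyze([
    {"level": 3, "priority": 8, "text": "disk write failure on /data"},
    {"level": 4, "priority": 2, "text": "retrying RPC call"},
    {"level": 1, "priority": 9, "text": "heartbeat ok"},
]))
```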
In the embodiment of the invention, marked logs and unmarked logs are obtained, and the unmarked logs generated when the system is abnormal are expanded, so that the number of abnormal unmarked logs is increased while data diversity is preserved, which strengthens the model's resistance to overfitting in the subsequent training of the detection model. The marked logs, the unmarked logs and the expanded logs are respectively input into the three corresponding abnormal level training models in the abnormal level training model set for training, so as to predict the probability distribution of each abnormal level for each of them. During the iterative training of the log anomaly detection model, the first probability distributions used for training are gradually removed as the amount of unmarked data increases, and the finally generated log anomaly detection model can predict the anomaly levels of the logs to be detected that are generated while the system runs, giving an analysis result of the system operation state. This improves the model's resistance to overfitting and reduces the detection difficulty of the detection model.
Fig. 5 and fig. 6 describe the system anomaly detection apparatus in the embodiment of the present invention in detail from the perspective of modular functional entities, and the system anomaly detection device in the embodiment of the present invention is described in detail below from the perspective of hardware processing.
Fig. 7 is a schematic structural diagram of a system anomaly detection device 700 according to an embodiment of the present invention. The device may vary considerably in configuration or performance, and may include one or more processors (CPUs) 710, a memory 720, and one or more storage media 730 (e.g., one or more mass storage devices) storing applications 733 or data 732. The memory 720 and the storage medium 730 may be transient storage or persistent storage. The program stored in the storage medium 730 may include one or more modules (not shown), each of which may include a series of instruction operations on the system anomaly detection device 700. Further, the processor 710 may be configured to communicate with the storage medium 730 to execute the series of instruction operations in the storage medium 730 on the system anomaly detection device 700.
The system anomaly detection device 700 may also include one or more power supplies 740, one or more wired or wireless network interfaces 750, one or more input-output interfaces 760, and/or one or more operating systems 731, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and the like. It will be understood by those skilled in the art that the structure shown in fig. 7 does not constitute a limitation of the system anomaly detection device, which may include more or fewer components than those shown, a combination of some components, or a different arrangement of components.
The present invention also provides a computer-readable storage medium, which may be a non-volatile computer-readable storage medium, and which may also be a volatile computer-readable storage medium, having stored therein instructions, which, when run on a computer, cause the computer to perform the steps of the system anomaly detection method.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a read-only memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The blockchain is a novel application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks associated with each other by cryptographic methods, where each data block contains the information of a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A system anomaly detection method, characterized in that the system anomaly detection method comprises:
acquiring a marked log and an unmarked log of a system to be detected, and expanding the unmarked log to obtain an expanded log; wherein the marked log refers to logs that are screened from the unmarked logs and marked with abnormal grades;
respectively inputting the marked logs, the unmarked logs and the expanded logs into three identical abnormal grade training models for training, and correspondingly outputting a first probability distribution of each abnormal grade of the marked logs, a second probability distribution of each abnormal grade of the unmarked logs and a third probability distribution of each abnormal grade of the expanded logs, wherein the three identical abnormal grade training models form an abnormal grade training model set;
calculating a cross-entropy loss between the first probability distribution and a preset anomaly level of the marked log, and calculating a consistency loss between the second probability distribution and the third probability distribution;
predicting the abnormal grades of the unmarked logs and the extended logs according to the consistency loss, and iterating the abnormal grade training model set according to the cross entropy loss until the abnormal grade training model set is converged to obtain a log abnormal detection model;
and acquiring a log to be detected of a current system, inputting the log to be detected into the log anomaly detection model for detection, outputting an anomaly level corresponding to the log to be detected, and taking the anomaly level corresponding to the log to be detected as an analysis result of the current system operation state.
2. The method according to claim 1, wherein the expanding the unmarked log to obtain an expanded log comprises:
analyzing the unmarked log to obtain a plurality of log fields with different semantics;
screening key fields related to abnormal levels from the log fields according to preset semantic structure prior knowledge and the occurrence frequency of the log fields;
acquiring one or more synonymous fields corresponding to the key fields, and replacing the corresponding key fields with the synonymous fields;
and splicing the synonymous fields and the other log fields except the key fields according to a random field processing strategy to obtain a plurality of corresponding expanded logs, wherein the random field processing strategy comprises replacing, deleting, inserting or exchanging the other log fields.
3. The system abnormality detection method according to claim 1, wherein the inputting of the marked log, the unmarked log, and the expanded log into three identical abnormal level training models respectively for training, and the corresponding outputting of a first probability distribution of each abnormal level of the marked log, a second probability distribution of each abnormal level of the unmarked log, and a third probability distribution of each abnormal level of the expanded log comprise:
uniformly adjusting the length of each log data in the marked log, the unmarked log and the expanded log to a preset length, and constructing a corresponding data vector;
determining the characteristic dimension of the data vector according to the length of the data vector, and extracting semantic features of the data vector according to the characteristic dimension to obtain initial semantic features;
and screening and combining salient features from the initial semantic features to obtain final semantic features, and calculating and outputting the probability distributions of the abnormal levels of the marked logs, the unmarked logs and the expanded logs according to the final semantic features.
4. The system anomaly detection method according to any one of claims 1-3, wherein said calculating a cross-entropy loss between said first probability distribution and a preset anomaly level of said labeled log comprises:
calculating the correct prediction probability of the abnormal grade of each marked log according to the first probability distribution and the preset abnormal grade mark of the marked log;
calculating cross entropy loss of the first probability distribution according to preset model training parameters and the correct prediction probability, so as to measure the difference between the abnormal grade prediction of the marked log and the real abnormal grade of the marked log.
5. The method according to claim 4, wherein the iterating the abnormal level training model set according to the cross entropy loss until the abnormal level training model set converges to obtain a log abnormal detection model comprises:
determining the correct prediction probability corresponding to each marked log according to the cross entropy loss, and judging whether the correct prediction probability larger than a preset probability threshold exists or not;
if so, deleting the first probability distribution corresponding to the correct prediction probability which is greater than the probability threshold value, and continuing to iterate the log anomaly detection model, otherwise, directly iterating the log anomaly detection model, and updating the model training parameters after the log anomaly detection model is iterated;
calculating the sum of the cross entropy loss and the consistency loss to obtain a corresponding final loss value, and judging whether the final loss value is smaller than a preset final loss threshold value or not;
and if the final loss value is smaller than the final loss threshold value, the abnormal level training model set is converged and stops iteration to obtain a log abnormal detection model.
6. The system anomaly detection method according to claim 5, wherein said correct prediction probability is calculated by the formula:
[Formula image: Figure FDA0003524642200000021 (not rendered in this text)]
and
[Formula image: Figure FDA0003524642200000022 (not rendered in this text)]
wherein η_t is a probability threshold, the t-subscripted coefficient is a growth coefficient, K is the number of the abnormal grade categories, t is the current iteration number, and T is the preset total iteration number;
when the data amount in the marked log is smaller than a preset normal data amount range, the formula shown in Figure FDA0003524642200000023 applies;
when the data amount in the marked log is larger than the normal data amount range, the formula shown in Figure FDA0003524642200000024 applies.
7. The system anomaly detection method according to claim 1, wherein the acquiring a log to be detected of a current system, inputting the log to be detected into the log anomaly detection model for detection, outputting an anomaly level corresponding to the log to be detected, and taking the anomaly level corresponding to the log to be detected as an analysis result of the current system operation state comprises:
acquiring a log to be detected of a current system, wherein the log to be detected comprises a plurality of pieces of log information, and the log information comprises identification information of system operation management priority;
inputting the log to be detected into the log anomaly detection model for detection, and predicting the anomaly level of the log to be detected through the log anomaly detection model;
screening logs to be detected with abnormal grades higher than a preset abnormal grade threshold value, and determining log information with priority higher than a preset priority threshold value from the screened logs to be detected according to the identification information;
and highlighting the log information with the priority greater than a preset priority threshold value, and taking the abnormal grade corresponding to the highlighted log information and other log information except the highlighted log information as an analysis result of the current system running state.
8. A system abnormality detection device, characterized by comprising:
the acquisition module is used for acquiring a marked log and an unmarked log of the system to be detected, and expanding the unmarked log to obtain an expanded log; wherein the marked log refers to logs that are screened from the unmarked logs and marked with abnormal grades;
a training module, configured to input the marked log, the unmarked log, and the extended log into three identical abnormal level training models for training, and output a first probability distribution of each abnormal level of the marked log, a second probability distribution of each abnormal level of the unmarked log, and a third probability distribution of each abnormal level of the extended log, where the three identical abnormal level training models form an abnormal level training model set;
a calculation module for calculating a cross-entropy loss between the first probability distribution and a preset anomaly level of the marked log, and for calculating a consistency loss between the second probability distribution and the third probability distribution;
the generation module is used for predicting the abnormal grades of the unmarked logs and the extended logs according to the consistency loss, and iterating the abnormal grade training model set according to the cross entropy loss until the abnormal grade training model set is converged to obtain a log abnormal detection model;
and the detection module is used for acquiring the log to be detected of the current system, inputting the log to be detected into the log anomaly detection model for detection, outputting the anomaly level corresponding to the log to be detected, and taking the anomaly level corresponding to the log to be detected as the analysis result of the current system operation state.
9. A system abnormality detection apparatus characterized by comprising: a memory having instructions stored therein and at least one processor, the memory and the at least one processor interconnected by a line;
the at least one processor invokes the instructions in the memory to cause the system anomaly detection device to perform the system anomaly detection method of any one of claims 1-7.
10. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the system anomaly detection method according to any one of claims 1-7.
CN202010611178.5A 2020-06-30 2020-06-30 System abnormality detection method, device, equipment and storage medium Active CN111782472B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010611178.5A CN111782472B (en) 2020-06-30 2020-06-30 System abnormality detection method, device, equipment and storage medium
PCT/CN2020/118218 WO2021139235A1 (en) 2020-06-30 2020-09-28 Method and apparatus for system exception testing, device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010611178.5A CN111782472B (en) 2020-06-30 2020-06-30 System abnormality detection method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111782472A CN111782472A (en) 2020-10-16
CN111782472B true CN111782472B (en) 2022-04-26

Family

ID=72760356

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010611178.5A Active CN111782472B (en) 2020-06-30 2020-06-30 System abnormality detection method, device, equipment and storage medium

Country Status (2)

Country Link
CN (1) CN111782472B (en)
WO (1) WO2021139235A1 (en)

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112308455B (en) * 2020-11-20 2024-04-09 深圳前海微众银行股份有限公司 Root cause positioning method, root cause positioning device, root cause positioning equipment and computer storage medium
CN112446335A (en) * 2020-12-02 2021-03-05 电子科技大学中山学院 Terahertz contraband detection method based on deep learning
CN112883193A (en) * 2021-02-25 2021-06-01 中国平安人寿保险股份有限公司 Training method, device and equipment of text classification model and readable medium
CN113347033B (en) * 2021-05-31 2022-05-27 中国工商银行股份有限公司 Root cause positioning method and system based on block chain and verification node
CN113256434B (en) * 2021-06-08 2021-11-23 平安科技(深圳)有限公司 Method, device, equipment and storage medium for recognizing vehicle insurance claim settlement behaviors
CN113297051B (en) * 2021-07-26 2022-03-04 云智慧(北京)科技有限公司 Log analysis processing method and device
CN113672870A (en) * 2021-08-20 2021-11-19 中国南方电网有限责任公司超高压输电公司柳州局 Fault event probability estimation method, device, computer equipment and storage medium
CN114238965A (en) * 2021-11-17 2022-03-25 北京华清信安科技有限公司 Detection analysis method and system for malicious access
CN114297054B (en) * 2021-12-17 2023-06-30 北京交通大学 Software defect number prediction method based on subspace mixed sampling
CN114338129B (en) * 2021-12-24 2023-10-31 中汽创智科技有限公司 Message anomaly detection method, device, equipment and medium
CN114881112A (en) * 2022-03-31 2022-08-09 北京优特捷信息技术有限公司 System anomaly detection method, device, equipment and medium
CN114706709B (en) * 2022-06-01 2022-08-23 成都运荔枝科技有限公司 Saas service exception handling method and device and readable storage medium
CN115146718A (en) * 2022-06-27 2022-10-04 北京华能新锐控制技术有限公司 Depth representation-based wind turbine generator anomaly detection method
CN115099676A (en) * 2022-07-14 2022-09-23 华能罗源发电有限责任公司 Method for detecting state quantity of bus in GIS of thermal power energy storage system
CN115174251B (en) * 2022-07-19 2023-09-05 深信服科技股份有限公司 False alarm identification method and device for safety alarm and storage medium
CN115168154B (en) * 2022-07-26 2023-06-23 北京优特捷信息技术有限公司 Abnormal log detection method, device and equipment based on dynamic baseline
CN115499159A (en) * 2022-08-09 2022-12-20 重庆长安汽车股份有限公司 CAN signal abnormality detection method, device, vehicle and storage medium
CN115883346B (en) * 2023-02-23 2023-05-23 广州嘉为科技有限公司 Abnormality detection method and device based on FDEP log and storage medium
CN116070206B (en) * 2023-03-28 2023-06-30 上海观安信息技术股份有限公司 Abnormal behavior detection method, system, electronic equipment and storage medium
CN116863638B (en) * 2023-06-01 2024-02-23 国药集团重庆医药设计院有限公司 Personnel abnormal behavior detection method and security system based on active early warning
CN116405326B (en) * 2023-06-07 2023-10-20 厦门瞳景智能科技有限公司 Information security management method and system based on block chain
CN116911852B (en) * 2023-07-21 2024-01-26 广州嘉磊元新信息科技有限公司 RPA user dynamic information monitoring method and system
CN117149846A (en) * 2023-08-16 2023-12-01 湖北中恒电测科技有限公司 Power data analysis method and system based on data fusion
CN117271350A (en) * 2023-09-28 2023-12-22 江苏天好富兴数据技术有限公司 Software quality assessment system and method based on log analysis
CN117290380B (en) * 2023-11-14 2024-02-06 华青融天(北京)软件股份有限公司 Abnormal dimension data generation method, device, equipment and computer readable medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103389701A (en) * 2013-07-15 2013-11-13 浙江大学 Plant-level process fault detection and diagnosis method based on distributed data model
CN106951353A (en) * 2017-03-20 2017-07-14 北京搜狐新媒体信息技术有限公司 Work data method for detecting abnormality and device
CN107463455A (en) * 2017-08-01 2017-12-12 联想(北京)有限公司 A kind of method and device for detecting memory failure

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7457985B2 (en) * 2005-09-09 2008-11-25 International Business Machines Corporation Method to detect errors in computer systems by using state tracking
CN101360023A (en) * 2008-09-09 2009-02-04 成都市华为赛门铁克科技有限公司 Exception detection method, apparatus and system
CN102821002B (en) * 2011-06-09 2015-08-26 中国移动通信集团河南有限公司信阳分公司 Network flow abnormal detecting method and system
JP2014032516A (en) * 2012-08-02 2014-02-20 Fujitsu Ltd Storage device, controller, and data protection method
CN108090615B (en) * 2017-12-21 2021-10-08 东南大学溧阳研究院 Minimum frequency prediction method after power system fault based on cross entropy integrated learning
CN109284606B (en) * 2018-09-04 2019-08-27 中国人民解放军陆军工程大学 Data flow anomaly detection system based on empirical features and convolutional neural networks
CN109343990A (en) * 2018-09-25 2019-02-15 江苏润和软件股份有限公司 A kind of cloud computing system method for detecting abnormality based on deep learning
CN110365648A (en) * 2019-06-14 2019-10-22 东南大学 A kind of vehicle-mounted CAN bus method for detecting abnormality based on decision tree

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103389701A (en) * 2013-07-15 2013-11-13 浙江大学 Plant-level process fault detection and diagnosis method based on distributed data model
CN106951353A (en) * 2017-03-20 2017-07-14 北京搜狐新媒体信息技术有限公司 Work data method for detecting abnormality and device
CN107463455A (en) * 2017-08-01 2017-12-12 联想(北京)有限公司 A kind of method and device for detecting memory failure

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Security Auditing and Audit-Based Intrusion Detection; Zhang Xiangfeng; China Doctoral Dissertations Full-text Database, Information Science and Technology; 2004-12-15; I139-11 *

Also Published As

Publication number Publication date
CN111782472A (en) 2020-10-16
WO2021139235A1 (en) 2021-07-15

Similar Documents

Publication Publication Date Title
CN111782472B (en) System abnormality detection method, device, equipment and storage medium
CN111178456B (en) Abnormal index detection method and device, computer equipment and storage medium
CN105677791B (en) For analyzing the method and system of the operation data of wind power generating set
CN112910859B (en) Internet of things equipment monitoring and early warning method based on C5.0 decision tree and time sequence analysis
EP1958034B1 (en) Use of sequential clustering for instance selection in machine condition monitoring
CN112906764B (en) Communication safety equipment intelligent diagnosis method and system based on improved BP neural network
CN112685324A (en) Method and system for generating test scheme
CN112579414A (en) Log abnormity detection method and device
Xie et al. Logm: Log analysis for multiple components of hadoop platform
CN113221960A (en) Construction method and collection method of high-quality vulnerability data collection model
Chu et al. Co-training based on semi-supervised ensemble classification approach for multi-label data stream
Rücker et al. FlexParser—The adaptive log file parser for continuous results in a changing world
CN117370548A (en) User behavior risk identification method, device, electronic equipment and medium
CN111949459A (en) Hard disk failure prediction method and system based on transfer learning and active learning
CN117131449A (en) Data management-oriented anomaly identification method and system with propagation learning capability
CN113891342A (en) Base station inspection method and device, electronic equipment and storage medium
CN116861924A (en) Project risk early warning method and system based on artificial intelligence
CN116167370A (en) Log space-time characteristic analysis-based distributed system anomaly detection method
CN116599743A (en) 4A abnormal detour detection method and device, electronic equipment and storage medium
CN111352820A (en) Method, equipment and device for predicting and monitoring running state of high-performance application
CN113076217B (en) Disk fault prediction method based on domestic platform
CN115686995A (en) Data monitoring processing method and device
CN109978038B (en) Cluster abnormity judgment method and device
Gaykar et al. A Hybrid Supervised Learning Approach for Detection and Mitigation of Job Failure with Virtual Machines in Distributed Environments.
Du et al. Unstructured log oriented fault diagnosis for operation and maintenance management

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40031385

Country of ref document: HK

SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant