CN117034099A - System log abnormality detection method - Google Patents

System log abnormality detection method

Info

Publication number
CN117034099A
Authority
CN
China
Prior art keywords
log
sequence
semantic
time
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310384406.3A
Other languages
Chinese (zh)
Inventor
廖丽平
朱柯
罗建桢
蔡君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Polytechnic Normal University
Original Assignee
Guangdong Polytechnic Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Polytechnic Normal University filed Critical Guangdong Polytechnic Normal University
Priority to CN202310384406.3A priority Critical patent/CN117034099A/en
Publication of CN117034099A publication Critical patent/CN117034099A/en
Pending legal-status Critical Current

Classifications

    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F16/1815 Journaling file systems
    • G06F18/253 Fusion techniques of extracted features
    • G06F40/205 Parsing
    • G06F40/30 Semantic analysis
    • G06N3/042 Knowledge-based neural networks; Logical representations of neural networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/048 Activation functions
    • G06N3/0499 Feedforward networks
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Databases & Information Systems (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application relates to the technical field of artificial intelligence, in particular to a system log abnormality detection method. Based on a pre-trained BERT model, the anomaly detection model can accurately acquire the text semantics of different log events, which improves the model's generalization capability and enhances its interpretability. A graph neural network and an MLP model are used to analyze, respectively, the spatial structure features and the timing relationship features of system behavior, so that the anomaly detection model discovers latent features of system behavior on top of perceived semantics, improving detection accuracy. In the anomaly detection method, the log semantics are feature-mapped by an encoder. The state of a log sequence is characterized by the special character [SIGN]; anomaly detection on the log sequence combines adaptive spatial boundary division with a sequence reconstruction objective function, capturing both the semantic features and the sequential pattern of the log sequence so that the detection result is more accurate.

Description

System log abnormality detection method
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a system log abnormality detection method.
Background
With the rapid development of communication technology and artificial intelligence and the explosive growth of the number of intelligent interconnection devices, the computer system is larger and more complex in scale and function. Anomaly detection has become an important task in constructing reliable computer systems. Once an abnormality occurs in a large-scale system, the availability and reliability of the system may be affected, and even the system is directly interrupted and paralyzed. Malicious system attacks can pose immeasurable hazards to society. Therefore, a system administrator must continuously collect real-time monitoring data, monitor the running state of the system, and timely discover the abnormal behavior of the system by adopting an abnormality detection technology, so as to ensure the safe and reliable running of the large-scale system.
Log analysis is one of the main techniques for current system anomaly detection. The system log contains semantically rich event content and the system's runtime status, and is one of the most important data sources for anomaly detection and system monitoring. When a fault occurs, engineers can effectively troubleshoot and locate it by examining the system log. However, as the scale and complexity of a system increase, a large number of logs are generated at runtime; a commercial system, for example, can produce logs on the order of gigabytes per hour. The massive volume of log content makes manual detection of abnormal events impractical. Second, log data is unstructured in nature and context-dependent: a set of log records sharing the same ID can be concatenated into a log sequence that tracks their status and critical events. Associating log data by ID is a typical method of separating interleaved logs, and mining the dependency relationships among log data can greatly improve the accuracy of anomaly localization. Finally, log formats also change over the software system's life cycle, making abnormal behavior ever harder to mine. In short, logs are voluminous, unstructured, context-dependent, and diverse in form, which makes abnormal-log detection very challenging. An accurate and efficient log anomaly detection method is therefore of great importance.
To date, log-based anomaly detection methods can be divided into three categories. (1) Pattern recognition methods based on log count vectors: a log event count vector is taken as input, each event is treated as a separate dimension, traditional machine learning methods map the vector into a vector space, and deviations from the direction of normal log sequences are classified as abnormal. However, these methods consider neither the sequential pattern nor the content information of the log sequence. (2) Pattern learning methods based on log event indices: a sliding window is moved over the log event sequence and the next event index is predicted from the observation window. However, these methods ignore log content information, cannot learn the dependency relationships between events, and a fixed sliding window size cannot capture sequential patterns beyond its range. (3) Representation learning methods based on log semantics: event templates are extracted from the log content and a semantic representation is generated for each template using natural language processing (Natural Language Processing, NLP) techniques. However, these methods use only the semantic representation of each log's text content and overlook the correlations between events in the log sequence. Moreover, real log data readily shows that when an anomaly occurs, not only does the order of events in the log sequence change, but the time differences between adjacent events change as well; this timing information is also crucial for log analysis.
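As an illustrative sketch of the first (count-vector) category of prior art, a log sequence can be mapped to a fixed-length event-count vector that a classical model then inspects for deviations; the template IDs and sequence below are hypothetical:

```python
from collections import Counter

def count_vector(event_sequence, vocabulary):
    # One dimension per known event template; classical ML models then
    # flag vectors that deviate from the direction of normal sequences.
    counts = Counter(event_sequence)
    return [counts.get(event, 0) for event in vocabulary]

vocab = ["E1", "E2", "E3", "E4"]        # hypothetical template IDs
seq = ["E1", "E2", "E1", "E4"]          # hypothetical log event sequence
print(count_vector(seq, vocab))         # [2, 1, 0, 1]
```

As the text notes, such a vector discards both the order of events and their content, which motivates the semantic and sequential modeling used in this application.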
Disclosure of Invention
In order to solve the technical problems, the application provides a system log abnormality detection method, which solves the problem of inaccurate abnormality detection in the system log in the prior art by constructing a plurality of models.
In order to achieve the above purpose, the technical scheme adopted by the embodiment of the application is as follows:
In a first aspect, a system log anomaly detection method is applied to a server and includes: collecting real-time log data, and performing feature processing on the real-time log data to obtain the semantic features, spatial features, and timing features corresponding to it; constructing the semantic features into an event semantic sequence in sequence order, encoding the event semantic sequence into a log semantic feature map with an encoder, and fusing the spatial features and the timing features to obtain a spatio-temporal correlation feature; inputting the spatio-temporal correlation feature and the log semantic feature map into a decoder, which outputs the semantic representation vector of a special word and a sequence pattern map; and feeding the semantic representation vector and the sequence pattern map into a trained anomaly model to obtain a result output of 0 or 1, which determines whether the system log is abnormal: when the result output is 0 the system log is normal, and when it is 1 the system log is abnormal.
Further, performing feature processing on the real-time log data to obtain the corresponding semantic features includes: parsing the log events contained in the real-time log data with Drain to obtain a log template, extracting the context semantic representation corresponding to each event in the log template with a trained BERT model, and capturing the semantic features in the context semantic representation with a Self-Attention mechanism.
Further, the log template includes a plurality of marks, namely a [CLS] mark placed at the start position of an event and a [SEP] mark placed at its end position, and extracting the context semantic representation corresponding to the event in the log template with the trained BERT model includes: extracting the semantic information of the special word [CLS] in the event.
Further, capturing the semantic features in the context semantic representation with the Self-Attention mechanism is based on the following formula: Attention(Q, K, V) = softmax(QK^T / √d_k)·V; wherein Q, K, and V are the matrices of queries, keys, and values respectively, and d_k is a dimension coefficient.
Further, performing feature processing on the real-time log data to obtain the corresponding spatial features includes: converting the log event sequence contained in the real-time log data into a system function path graph, and extracting the spatial features of that graph with a graph convolution network.
Further, extracting the spatial features of the system function path graph with the graph convolution network includes: obtaining the node semantic features and the node degree matrix of the graph, and performing a mean pooling operation over the graph convolution network output to obtain the feature representations of the corresponding nodes, processed by the following formula: f_s = Pool(GCNConv(m, n)); wherein m and n are respectively the node semantic features and the node degree matrix of the graph, GCNConv(·) is the graph convolution network, and Pool(·) is the mean pooling operation, yielding the spatial structure feature f_s of the graph, where f_s^i ∈ R^d represents the spatial feature of node v_i and d the feature dimension.
Further, performing feature processing on the real-time log data to obtain the corresponding timing features includes: performing feature extraction on the log event sequence contained in the real-time log data with an MLP model, specifically by the following formula: f_t = Sigmoid(MLP(ΔT)); wherein ΔT is the time difference sequence of the log event sequence, Δt_i is the i-th difference in that sequence, and w_ij and b_j are respectively the weight parameter and bias of the j-th hidden layer, yielding the timing relationship feature f_t of the log event sequence, where f_t^i ∈ R^d represents the i-th time difference Δt_i and d the feature dimension.
Further, a self-attention module is arranged in the encoder and connected to a fully connected feedforward neural network; the feedforward neural network is composed of two linear transformations with a ReLU activation function in between, expressed by the following formula: FFN(x) = ReLU(xW_1 + b_1)W_2 + b_2; where x is the semantic representation output by the self-attention module, and W_1, W_2, b_1, b_2 are respectively the weight parameters and biases of the two FFN layers.
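A minimal numpy sketch of such a position-wise feedforward sub-layer; the dimensions are toy values, not taken from the patent:

```python
import numpy as np

def ffn(x, w1, b1, w2, b2):
    # FFN(x) = ReLU(x W1 + b1) W2 + b2: two linear maps, ReLU in between
    hidden = np.maximum(x @ w1 + b1, 0.0)
    return hidden @ w2 + b2

rng = np.random.default_rng(0)
d_model, d_ff = 8, 32                    # toy sizes
x = rng.normal(size=(5, d_model))        # 5 token representations
w1 = rng.normal(size=(d_model, d_ff)); b1 = np.zeros(d_ff)
w2 = rng.normal(size=(d_ff, d_model)); b2 = np.zeros(d_model)
out = ffn(x, w1, b1, w2, b2)
print(out.shape)                         # (5, 8)
```

The inner expansion to d_ff and projection back to d_model mirrors the standard Transformer FFN design.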
Further, the decoder uses a Multi-head Attention module; each decoder layer includes an Encoder-Decoder Attention module, and the last decoder layer is additionally provided with a feedforward neural network connected to the Encoder-Decoder Attention module. The inputs of the Encoder-Decoder Attention module are the spatio-temporal correlation feature and the log semantic feature map.
Further, the sequence reconstruction loss function of the anomaly model is: L = λL_boundary + (1 − λ)L_re; where λ is the weight parameter balancing the adaptive boundary error L_boundary against the reconstruction error L_re; L_boundary is: L_boundary = Mean(Σ_i ||d_i^[SIGN] − c||²), where d_i^[SIGN] denotes the decoder output for the special character [SIGN] of the i-th log event sequence, c denotes the center of all decoder outputs of the special word [SIGN] over the training set S, and Mean(·) is the mean operation.
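A hedged sketch of how such a combined objective might be computed. The exact form of L_boundary is not fully recoverable from the text, so this assumes a squared distance to the [SIGN] center c (in the style of one-class boundary methods) and a mean-squared reconstruction error:

```python
import numpy as np

def combined_loss(sign_vectors, reconstructed, target, lam=0.5):
    # L = lam * L_boundary + (1 - lam) * L_re
    # L_boundary: mean squared distance of each sequence's [SIGN] output
    # from the center c of all [SIGN] outputs (assumed form).
    c = sign_vectors.mean(axis=0)
    l_boundary = np.mean(np.sum((sign_vectors - c) ** 2, axis=1))
    # L_re: mean squared reconstruction error over the decoded sequence.
    l_re = np.mean((reconstructed - target) ** 2)
    return lam * l_boundary + (1 - lam) * l_re

signs = np.array([[1.0, 1.0], [1.0, 1.0]])   # identical -> zero boundary error
rec = np.array([[0.5, 0.5]])
tgt = np.array([[0.0, 1.0]])
print(combined_loss(signs, rec, tgt))        # 0.125
```

λ trades off how tightly normal [SIGN] states cluster against how faithfully the decoder reconstructs the sequence.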
In a second aspect, there is provided an electronic device including a memory storing a computer program and a processor implementing the abnormality detection method of any one of the above when executing the computer program.
In a third aspect, a computer readable storage medium is provided, the computer readable storage medium storing a computer program which, when executed by a processor, implements the method of any one of the above.
According to the technical scheme provided by the embodiments of the application, unstructured log information is converted into structured log knowledge, which improves the efficiency and accuracy of log analysis and provides users with more flexible and efficient log analysis and application services. Based on the pre-trained BERT model, the anomaly detection model can accurately acquire the text semantics of different log events, improving its generalization capability and enhancing its interpretability. The graph neural network and the MLP model are used to analyze, respectively, the spatial structure features and the timing relationship features of system behavior, so that the anomaly detection model discovers latent features of system behavior on top of perceived semantics and detection accuracy improves. In the anomaly detection method based on the Self-Attention Encoder-Decoder Transformer model, log semantics are feature-mapped by the encoder, and the spatio-temporal correlation features of system behavior are fused in the encoder for pattern learning over the log sequence. The state of a log sequence is characterized by the special character [SIGN], and anomaly detection combines adaptive spatial boundary division with a sequence reconstruction objective function, capturing both the semantic features and the sequential pattern of the log sequence so that the detection result is more accurate.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
The methods, systems, and/or programs in the accompanying drawings will be described further in terms of exemplary embodiments. These exemplary embodiments will be described in detail with reference to the drawings. These exemplary embodiments are non-limiting exemplary embodiments, wherein the exemplary numbers represent like mechanisms throughout the various views of the drawings.
Fig. 1 is a schematic diagram of a system log anomaly detection system according to an embodiment of the present application.
Fig. 2 is a flowchart of a system log anomaly detection method according to an embodiment of the present application.
Fig. 3 is a schematic diagram of a system log anomaly detection model according to an embodiment of the present application.
Detailed Description
In order to better understand the above technical solutions, the following detailed description of the technical solutions of the present application is made by using the accompanying drawings and specific embodiments, and it should be understood that the specific features of the embodiments and the embodiments of the present application are detailed descriptions of the technical solutions of the present application, and not limiting the technical solutions of the present application, and the technical features of the embodiments and the embodiments of the present application may be combined with each other without conflict.
In the following detailed description, numerous specific details are set forth by way of examples in order to provide a thorough understanding of the relevant teachings. It will be apparent, however, to one skilled in the art that the application can be practiced without these details. In other instances, well known methods, procedures, systems, components, and/or circuits have been described at a relatively high-level, without detail, in order to avoid unnecessarily obscuring aspects of the present application.
The present application uses a flowchart to illustrate the execution of a system according to an embodiment of the present application. It should be clearly understood that the execution of the flowcharts may be performed out of order. Rather, these implementations may be performed in reverse order or concurrently. Additionally, at least one other execution may be added to the flowchart. One or more of the executions may be deleted from the flowchart.
Before describing embodiments of the present application in further detail, the terms and terminology involved in the embodiments of the present application will be described, and the terms and terminology involved in the embodiments of the present application will be used in the following explanation.
(1) In response to a condition or state that is used to represent the condition or state upon which the performed operation depends, the performed operation or operations may be in real-time or with a set delay when the condition or state upon which it depends is satisfied; without being specifically described, there is no limitation in the execution sequence of the plurality of operations performed.
(2) Based on the conditions or states that are used to represent the operations that are being performed, one or more of the operations that are being performed may be in real-time or with a set delay when the conditions or states that are being relied upon are satisfied; without being specifically described, there is no limitation in the execution sequence of the plurality of operations performed.
(3) Convolutional neural networks are a class of feedforward neural networks that involve convolution computations and have a deep structure, inspired by the biological receptive field mechanism. They are specialized for processing data with a grid-like structure, e.g., time-series data (which may be regarded as a one-dimensional grid formed by regular sampling along the time axis) and image data (which may be regarded as a two-dimensional grid of pixels). The network employed in this embodiment is a graph convolutional network (GCN), which generalizes convolution from grids to graph-structured data.
The technical scheme provided by the embodiments of the application is mainly used to detect and identify anomalies in system logs. The method constructs an anomaly log knowledge graph (Anomaly Log Knowledge Graph, ALKG) from the system log, integrating massive log data and modeling the relationships between logs, which facilitates locating abnormal logs and capturing the semantic information and sequential patterns in a log event sequence. A pre-trained BERT model encodes the semantics of the log template; the sequential pattern is converted into a system function path graph with log event semantics as node features, and, combined with the spatial structure of the graph, the spatial features of the log sequence are represented through a graph convolution network (Graph Convolution Network, GCN); the time differences between adjacent log events are characterized as timing features by a multilayer perceptron (Multilayer Perceptron, MLP); and anomaly detection is performed by a Self-Attention Encoder-Decoder Transformer model. By fusing the log sequence's sequential pattern and timing information on top of the semantic information, the method can better identify anomalies in the log event sequence.
Against the above technical background, the present embodiment provides a system log anomaly detection method, which specifically includes the following steps:
S210, acquiring real-time log data, and performing feature processing on the real-time log data to obtain the semantic features, spatial features, and timing features corresponding to it.
In this embodiment, feature extraction is performed on the collected real-time log data, and the collected features are semantic features, spatial features, and time sequence features, respectively.
The acquisition of the semantic features specifically includes parsing the log events contained in the real-time log data with Drain to obtain a log template, extracting the context semantic representation corresponding to each event in the log template with a trained BERT model, and capturing the semantic features in the context semantic representation with a Self-Attention mechanism. The semantic information of a log event sequence strongly affects the performance of subsequent log anomaly detection. To avoid semantic confusion caused by ambiguity and event changes from degrading the detection effect, this embodiment uses the BERT model to extract the semantic information in log events. BERT offers two ways to extract features: fine-tuning and feature extraction. So that the subsequent anomaly detection model can accurately acquire the text semantics of different log events, this embodiment adopts fine-tuning, using the BERT model as the base of the anomaly detection model to generate the context semantic representation corresponding to each event. We propose the following objective function to obtain the event semantic embedding: L_MLM = −Σ_i log p(v_i = v | Θ, Θ_1); where p is the probability that the current word v_i is predicted as v in the Masked Language Model (MLM), Θ and Θ_1 are respectively the parameter set of the encoder used for subsequent encoding in the BERT model and the parameter set of the output layer connected to the encoder in the MLM task, and V is the vocabulary size.
In the present embodiment, the log template includes a plurality of marks: a [CLS] mark and a [SEP] mark placed at the start and end positions of the event, respectively. Extracting the context semantic representation corresponding to the event with the trained BERT model includes extracting the semantic information of the special word [CLS] in the event. Specifically, a [CLS] marker is added at the beginning of the sentence and a [SEP] marker at its end, marking the sentence's start and end positions, denoted as e = [CLS] token_1, token_2, ..., token_|e| [SEP]. For semantic extraction, we directly take the representation of the special word [CLS] in the log event as the event's semantic information, denoted sem_[CLS].
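The [CLS]/[SEP] wrapping can be sketched as follows; the template tokens are illustrative only ("<*>" stands in for a Drain wildcard slot):

```python
def wrap_event(tokens):
    # e = [CLS] token_1 ... token_|e| [SEP]: mark start and end positions
    return ["[CLS]"] + list(tokens) + ["[SEP]"]

# Hypothetical parsed log template tokens
print(wrap_event(["Receiving", "block", "<*>"]))
# ['[CLS]', 'Receiving', 'block', '<*>', '[SEP]']
```

After BERT encodes the wrapped sequence, only the hidden state at the [CLS] position is kept as sem_[CLS].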
For a given log event sequence l = {e_1, e_2, ..., e_|l|}, the BERT model generates a corresponding event semantic sequence X_e = {sem_1, sem_2, ..., sem_|l|}, where sem_i is the semantic information of event e_i.
In this embodiment, the semantic features are captured from the context semantic information by the Self-Attention mechanism, based on the following formula: Attention(Q, K, V) = softmax(QK^T / √d_k)·V; wherein Q, K, and V are the matrices of queries, keys, and values respectively, and d_k is a dimension coefficient.
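A minimal numpy sketch of this scaled dot-product attention; the inputs are toy matrices:

```python
import numpy as np

def attention(q, k, v):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))  # stable softmax
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

Q = np.eye(3); K = np.eye(3)
V = np.arange(9.0).reshape(3, 3)
out = attention(Q, K, V)
print(out.shape)                         # (3, 3)
```

When all keys are identical the softmax weights become uniform and each output row is simply the mean of the value rows, which is a quick sanity check on the implementation.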
In this embodiment, the spatial features are obtained by converting the log event sequence contained in the real-time log data into a system function path graph and extracting the graph's spatial features with a graph convolution network. For a log sequence, the sequential pattern among log events is usually an important basis for judging normality or abnormality: it reflects the positional structure of log events in the sequence. Unlike sequence data, the nodes of a graph carry no inherent order, so no node-level sequential pattern exists in the graph. We therefore use the graph's degree matrix to generate a structure-aware encoding for the graph via a graph convolution network (GCN), which enhances the interpretability of the graph representation and extracts the graph's spatial features. Moreover, the in-degree and out-degree of a node intuitively reflect its local topology; they not only indicate the node's importance in the graph but also reflect the similarity between nodes, and can supplement the semantic similarity among nodes.
Specifically, extracting the spatial features of the system function path graph with the graph convolution network includes: obtaining the node semantic features and the node degree matrix of the graph, and performing a mean pooling operation on the graph convolution network output to obtain the feature representations of the corresponding nodes, processed by the following formula: f_s = Pool(GCNConv(m, n)); wherein m and n are respectively the node semantic features and the node degree matrix of the graph, GCNConv(·) is the graph convolution network, and Pool(·) is the mean pooling operation, yielding the spatial structure feature f_s of the graph, where f_s^i ∈ R^d represents the spatial feature of node v_i and d the feature dimension.
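A hedged sketch of this spatial-feature step. The patent feeds node semantic features and the degree matrix to GCNConv; this toy version instead uses the common symmetrically normalized adjacency formulation of a GCN layer (an assumption, not the patent's exact operator), followed by mean pooling:

```python
import numpy as np

def gcn_layer(adj, features, weight):
    # H' = ReLU(D^-1/2 (A + I) D^-1/2 H W): normalized graph convolution
    a_hat = adj + np.eye(adj.shape[0])               # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ a_hat @ d_inv_sqrt @ features @ weight, 0.0)

def spatial_feature(adj, features, weight):
    # f_s = Pool(GCNConv(...)): mean-pool node states into one graph feature
    return gcn_layer(adj, features, weight).mean(axis=0)

adj = np.array([[0.0, 1.0], [1.0, 0.0]])  # toy 2-node path graph
feats = np.eye(2)                         # toy node semantic features
w = np.eye(2)
print(spatial_feature(adj, feats, w))     # [0.5 0.5]
```

Mean pooling collapses the per-node representations into a single vector summarizing the graph's spatial structure.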
In this embodiment, feature extraction is performed based on an MLP model for extracting the sequence feature corresponding to the real-time log data. In this embodiment, the time stamp in the log event is also a potential anomaly criterion, and by analyzing the time stamp in the event sequence, as shown in fig. 3, the time difference for the anomaly event sequence fluctuates greatly. May be caused by system performance problems, such as: network congestion or system anomalies. If performance problems occur, the log events typically maintain the same sequence order as the normal sequence. Thus, system anomalies typically manifest themselves as a change in sequence order or a large fluctuation in time difference. For such performance problems, static feature extraction methods are typically employed to find performance defects, or intrusion detection is employed to detect them. On one hand, the complete extraction of sequence features is challenging, and on the other hand, the system load is increased, and the system operation efficiency is affected. Thus, in this embodiment by extracting the timing relationship features of the log events, in combination with the graph space structure features, possible anomaly issues are identified from the dimensions of space and timing. In this embodiment, the MLP model is used to extract the time-series relationship features of the event sequence:
f_t = Sigmoid(MLP(ΔT));
wherein ΔT is the time-difference sequence of the log event sequence, Δt_i is the i-th difference value in the time-difference sequence, w_ij and b_j are respectively the weight parameter and the bias of the j-th hidden layer, and the timing relationship feature f_t of the log event sequence is obtained, wherein f_t^i ∈ R^d represents the feature of the i-th time difference Δt_i, and d represents the feature dimension.
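The timing-feature formula above can be sketched in NumPy as follows; the single hidden layer, ReLU activation, timestamp values, and all weight shapes are hypothetical illustrations, not the actual parameters of this embodiment:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def timing_features(delta_t, w1, b1, w2, b2):
    # f_t = Sigmoid(MLP(ΔT)): one hidden layer with weights w_ij and biases b_j
    hidden = np.maximum(delta_t[:, None] @ w1 + b1, 0.0)   # ReLU hidden layer
    return sigmoid(hidden @ w2 + b2)                        # one d-dim feature per Δt_i

rng = np.random.default_rng(0)
timestamps = np.array([0.0, 0.5, 1.1, 9.0, 9.2])            # toy log-event timestamps
delta_t = np.diff(timestamps)                               # time-difference sequence ΔT
w1, b1 = rng.normal(size=(1, 16)), np.zeros(16)
w2, b2 = rng.normal(size=(16, 8)), np.zeros(8)              # output dimension d = 8
f_t = timing_features(delta_t, w1, b1, w2, b2)
print(f_t.shape)  # (4, 8)
```

Note how the large gap between 1.1 and 9.0 produces one outlying Δt_i, the kind of fluctuation the text identifies as a potential anomaly signal.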
S220, constructing the semantic features into an event semantic sequence in sequence order, encoding the event semantic sequence into a log semantic feature map based on an encoder, and fusing the spatial features and the time-sequence features to obtain a spatio-temporal correlation feature.
In this embodiment, a log event sequence anomaly prediction decoder based on global semantic perception and system behavior analysis is provided, which takes the adaptive anomaly boundary and sequence reconstruction as training targets. Unlike the encoder, to improve semantic perception performance, a Multi-head Attention module is used, whose function is expressed as:
MultiHead(Q, K, V) = Concat(head_1, head_2, ..., head_h)W^O;
head_i = Attention(QW_i^Q, KW_i^K, VW_i^V);
wherein W_i^Q, W_i^K and W_i^V are the linear projection weights of each attention head_i, and W^O is the projection matrix.
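A minimal NumPy sketch of the multi-head attention function above follows; the head count, model dimension, and weight values are hypothetical, and the per-head projections are taken as column slices of shared matrices purely for brevity:

```python
import numpy as np

def attention(q, k, v):
    # Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax over keys
    return weights @ v

def multi_head(x, wq, wk, wv, wo, heads):
    # MultiHead(Q,K,V) = Concat(head_1..head_h) W^O with per-head projections
    d = x.shape[-1] // heads
    outs = []
    for h in range(heads):
        sl = slice(h * d, (h + 1) * d)               # this head's projection columns
        outs.append(attention(x @ wq[:, sl], x @ wk[:, sl], x @ wv[:, sl]))
    return np.concatenate(outs, axis=-1) @ wo

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))                          # 5 events, model dimension 8
wq, wk, wv, wo = (rng.normal(size=(8, 8)) for _ in range(4))
out = multi_head(x, wq, wk, wv, wo, heads=2)
print(out.shape)  # (5, 8)
```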
In this embodiment, each layer of the decoder further includes an Encoder-Decoder Attention module and an FFN module. The input of the decoder is similar to that of the encoder: on the basis of the event semantic sequence X_e, a special word [SIGN] is added at the starting position to capture the semantic features of the whole time sequence, forming the complete decoder input X_d, wherein X_d is:
X_d = [sem_[SIGN]] || [X_e]_{dim=1};
wherein []||[]_{dim=1} denotes the concatenation of two matrices in the column dimension.
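The prepending of the [SIGN] token can be sketched in NumPy as a simple concatenation along the sequence axis; the zero vector standing in for the learnable [SIGN] embedding and the toy dimensions are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
x_e = rng.normal(size=(6, 8))           # event semantic sequence X_e: 6 events, dim 8
sem_sign = np.zeros((1, 8))             # stand-in for the learnable [SIGN] embedding
# X_d = [sem_[SIGN]] || [X_e]: prepend the special word along the sequence axis
x_d = np.concatenate([sem_sign, x_e], axis=0)
print(x_d.shape)  # (7, 8)
```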
In this embodiment, the spatio-temporal correlation feature is fused with the semantic feature map generated by the semantic perception encoder to serve as the Key and Value inputs of the Encoder-Decoder Attention module. In this way, the model can capture, through the attention mechanism, the event sequence patterns represented in the node topology together with the global semantic dependencies:
F = Attention(X_d, M, M);
wherein M denotes the fused feature map serving as the Key and Value inputs, and a new event sequence feature representation F = {F_1, F_2, ..., F_|l|} is obtained.
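A hedged NumPy sketch of the Encoder-Decoder Attention step follows, in which the fused spatio-temporal/semantic feature map supplies the Key and Value inputs while the decoder side supplies the Query; the matrix `fused` and all shapes are hypothetical stand-ins:

```python
import numpy as np

def attention(q, k, v):
    # Scaled dot-product attention with Key/Value taken from the fused feature map
    scores = q @ k.T / np.sqrt(k.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

rng = np.random.default_rng(0)
x_d = rng.normal(size=(7, 8))           # decoder-side queries (with [SIGN] prepended)
fused = rng.normal(size=(7, 8))         # fused spatio-temporal + semantic feature map
f = attention(x_d, fused, fused)        # new event sequence representation F
print(f.shape)  # (7, 8)
```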
In this embodiment, the output of the Encoder-Decoder Attention module of the last decoder layer is followed by a fully connected feedforward neural network, i.e., an FFN network, as in the encoder, and the complete output of the decoder is finally obtained as follows:
Y = [Y_[SIGN]] || [Y_d]_{dim=1}.
And S230, inputting the semantic representation vector and the sequence pattern map to a trained anomaly model to obtain a result output, and determining whether the system log has an abnormality based on the result output.
In this embodiment, the result output is 0 or 1, and indicates normal when the result output is 0, and indicates abnormal when the result output is 1.
In order to ensure that the model can learn the intrinsic differences between normal and abnormal log samples, this embodiment proposes an adaptive boundary and sequence reconstruction loss function. Thus, the model is able to map normal log events to adjacent locations in the implicit space, i.e., the distances between normal samples are very small, while normal and abnormal samples are kept apart. In this embodiment, the decoder output of the added [SIGN] tag is a semantic map characterizing the sequence of log events, and the target hypersphere for the adaptive boundary contains the semantic mappings of all normal event sequences. In the test phase, any sequence that is mapped outside the boundary of the hypersphere is determined to be abnormal. The proposed adaptive boundary loss function L_boundary is:
L_boundary = (1/|S|) Σ_{i=1}^{|S|} ||Y_i^[SIGN] − C||², with C = Mean({Y_i^[SIGN]}_{i∈S});
wherein Y_i^[SIGN] represents the semantic representation of the special word [SIGN] of the i-th log event sequence, C represents the center of all decoder outputs of the special word [SIGN] in the training set S, and Mean(·) is the calculated mean.
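The boundary objective can be sketched in NumPy as below, assuming a mean squared distance to the center C, which is one common realization of such a hypersphere boundary term; the sample values are hypothetical:

```python
import numpy as np

def boundary_loss(y_sign):
    # Pull each sequence's [SIGN] output toward the center C = Mean of all [SIGN] outputs
    c = y_sign.mean(axis=0)
    return np.mean(np.sum((y_sign - c) ** 2, axis=1))

rng = np.random.default_rng(0)
y_sign = rng.normal(size=(32, 8))       # [SIGN] decoder outputs for 32 training sequences
l_boundary = boundary_loss(y_sign)
print(l_boundary)
```

At test time a sequence whose distance to C exceeds the learned boundary would be flagged as abnormal.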
Furthermore, the goal of sequence reconstruction is to learn the sequence patterns of normal log events: the encoder first encodes the input data in a low-dimensional space, and the decoder then attempts to reconstruct it into the original form. In this case, an input sample is considered abnormal if it is difficult to reconstruct, i.e., it produces a large reconstruction error. The sequence reconstruction loss function L_re is:
L_re = (1/|l|) Σ_{i=1}^{|l|} ||Y_e^i − Y_d^i||²;
wherein Y_e^i and Y_d^i are respectively the i-th event of the encoder output Y_e and the decoder output Y_d.
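A short NumPy sketch of the reconstruction term, assuming a per-event mean squared error between the encoder and decoder outputs; the toy outputs are hypothetical:

```python
import numpy as np

def reconstruction_loss(y_e, y_d):
    # Mean squared error between encoder and decoder outputs, event by event
    return np.mean(np.sum((y_e - y_d) ** 2, axis=1))

rng = np.random.default_rng(0)
y_e = rng.normal(size=(6, 8))               # encoder output Y_e: 6 events
y_d = y_e + 0.1 * rng.normal(size=(6, 8))   # decoder output Y_d: small reconstruction error
l_re = reconstruction_loss(y_e, y_d)
print(l_re)
```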
In this embodiment, the proposed adaptive boundary and sequence reconstruction loss function L is:
L = λL_boundary + (1 − λ)L_re;
where λ is the weight parameter balancing the adaptive boundary error L_boundary and the reconstruction error L_re.
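Putting the two terms together, the combined objective can be sketched as follows; the specific forms of the two component losses and all sample values are hypothetical illustrations consistent with the formulas above:

```python
import numpy as np

def total_loss(y_sign, y_e, y_d, lam=0.5):
    # L = λ·L_boundary + (1 − λ)·L_re, with λ as the balancing weight
    c = y_sign.mean(axis=0)
    l_boundary = np.mean(np.sum((y_sign - c) ** 2, axis=1))
    l_re = np.mean(np.sum((y_e - y_d) ** 2, axis=1))
    return lam * l_boundary + (1 - lam) * l_re

rng = np.random.default_rng(0)
y_sign = rng.normal(size=(16, 8))   # [SIGN] outputs over a training batch
y_e = rng.normal(size=(6, 8))       # encoder output for one sequence
y_d = rng.normal(size=(6, 8))       # decoder output for the same sequence
loss = total_loss(y_sign, y_e, y_d, lam=0.5)
print(loss)
```

Choosing λ closer to 1 emphasizes the compactness of normal samples; closer to 0, the fidelity of sequence reconstruction.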
Referring to fig. 1, an embodiment of the present application further provides a system for the detection method, where the system is a server, and a memory and a processor are disposed in the server; the memory is used to store log data, and the processor is configured to perform the anomaly detection method of steps S210-S230.
The following describes each component of the processor in detail:
Wherein in this embodiment the processor is an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present application, for example: one or more digital signal processors (DSPs), or one or more field programmable gate arrays (FPGAs).
Alternatively, the processor may perform various functions, such as performing the method shown in fig. 2 described above, by running or executing a software program stored in memory, and invoking data stored in memory.
In a particular implementation, the processor may include one or more microprocessors, as one embodiment.
The memory is configured to store a software program for executing the scheme of the present application, and the processor is used to control the execution of the software program, and the specific implementation manner may refer to the above method embodiment, which is not described herein again.
Alternatively, the memory may be a read-only memory (ROM) or other type of static storage device that can store static information and instructions, a random access memory (RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disk storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store the desired program code in the form of instructions or data structures and that can be accessed by a computer, without limitation. The memory may be integrated with the processor, or may exist separately and be coupled to the processing unit through an interface circuit of the processor, which is not particularly limited in the embodiments of the present application.
It should be noted that the structure of the processor shown in this embodiment does not constitute a limitation on the apparatus; an actual apparatus may include more or fewer components than shown in the drawings, may combine some components, or may have a different arrangement of components.
In addition, for the technical effects of the processor, reference may be made to the technical effects of the method described in the foregoing method embodiments, which are not repeated herein.
It should be appreciated that the processor in embodiments of the application may be other general purpose processors, digital signal processors (digital signal processor, DSP), application specific integrated circuits (application specific integrated circuit, ASIC), off-the-shelf programmable gate arrays (field programmable gate array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
It should also be appreciated that the memory in embodiments of the present application may be volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which acts as an external cache. By way of example but not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM).
The above embodiments may be implemented in whole or in part by software, hardware (e.g., circuitry), firmware, or any combination thereof. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions or computer programs. When the computer instructions or computer program are loaded or executed on a computer, the processes or functions described in accordance with embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that contains one or more sets of available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a solid state disk.
In the present application, "at least one" means one or more, and "a plurality" means two or more. "at least one of" or the like means any combination of these items, including any combination of single item(s) or plural items(s). For example, at least one (one) of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, wherein a, b, c may be single or plural.
It should be understood that, in various embodiments of the present application, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic thereof, and should not constitute any limitation on the implementation process of the embodiments of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a read-only memory (ROM), a random access memory (random access memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A system log anomaly detection method, applied to a server, comprising:
the method comprises the steps of collecting real-time log data, and performing feature processing on the real-time log data to obtain semantic features, spatial features and time sequence features corresponding to the real-time log data;
constructing the semantic features into an event semantic sequence in sequence order, encoding the event semantic sequence into a log semantic feature map based on an encoder, and fusing the spatial features and the time-sequence features to obtain a spatio-temporal correlation feature;
inputting the spatio-temporal correlation feature and the log semantic feature map into a decoder, and outputting, through the decoder, a semantic representation vector of a special word and a sequence pattern map;
and inputting the semantic representation vector and the sequence pattern map to a trained anomaly model to obtain a result output, and determining whether the system log has an abnormality based on the result output, wherein the result output is 0 or 1; when the result output is 0, the system log is normal, and when the result output is 1, the system log is abnormal.
2. The system log anomaly detection method according to claim 1, wherein performing feature processing on the real-time log data to obtain semantic features corresponding to the real-time log data comprises: parsing log events of the real-time log data based on Drain to obtain a log template, extracting context semantic representations corresponding to the events in the log template based on a trained BERT model, and capturing semantic features in the context semantic representations based on a Self-Attention mechanism.
3. The system log anomaly detection method of claim 2, wherein the log template comprises a plurality of markers, including a [CLS] marker and a [SEP] marker disposed at the start position and the end position of an event, and extracting the context semantic representation corresponding to the event in the log template based on the trained BERT model comprises: extracting semantic information of the special word [CLS] in the event.
4. The system log anomaly detection method of claim 3, wherein capturing semantic features in the context semantic representation based on a Self-Attention mechanism is obtained based on the following formula:
Attention(Q, K, V) = softmax(QK^T / √d_k)V;
wherein Q, K and V are respectively the matrices of queries, keys and values, and d_k is a dimension coefficient.
5. The system log anomaly detection method according to claim 1, wherein performing feature processing on the real-time log data to obtain spatial features corresponding to the real-time log data, comprises: and converting the log event sequence containing the real-time log data into a system function path diagram, and extracting the spatial characteristics of the system function path diagram based on a diagram convolution network for the system function path diagram.
6. The system log anomaly detection method of claim 5, wherein extracting spatial features of the system function path graph based on a graph convolutional network comprises: obtaining the node semantic features and the node degree matrix of the graph, carrying out an average pooling operation based on the graph convolutional network to obtain the feature representations of the corresponding nodes, and processing based on the following formula:
f_g = Pool(GCNConv(m, n));
wherein m and n are respectively the node semantic features and the node degree matrix of the graph, GCNConv(·) is the graph convolutional network, Pool(·) is the average pooling operation, and the spatial structure feature f_g of the graph is obtained, wherein g_i ∈ R^d represents the feature representation of node v_i output by the graph convolution, and d represents the feature dimension.
7. The system log anomaly detection method according to claim 1, wherein performing feature processing on the real-time log data to obtain a time-sequence feature corresponding to the real-time log data comprises: performing feature extraction on a log event sequence of the real-time log data based on an MLP model, specifically processing based on the following formula:
f_t = Sigmoid(MLP(ΔT));
wherein ΔT is the time-difference sequence of the log event sequence, Δt_i is the i-th difference value in the time-difference sequence, w_ij and b_j are respectively the weight parameter and the bias of the j-th hidden layer, and the timing relationship feature f_t of the log event sequence is obtained, wherein f_t^i ∈ R^d represents the feature of the i-th time difference Δt_i, and d represents the feature dimension.
8. The system log anomaly detection method of claim 1, wherein a self-attention module is disposed in the encoder, the self-attention module is connected to a fully connected feedforward neural network, and the feedforward neural network is composed of two linear transformation functions with a ReLU activation function in the middle, represented by the following formula:
FFN(x) = ReLU(xW_1 + b_1)W_2 + b_2;
where x represents the semantic representation output from the self-attention module, and W_1, W_2, b_1, b_2 are respectively the weight parameters and biases of the two layers of the FFN.
9. The system log anomaly detection method according to claim 1, wherein a Multi-head Attention module is disposed in the decoder, each layer of the decoder comprises an Encoder-Decoder Attention module, and a feedforward neural network connected with the Encoder-Decoder Attention module is further arranged at the last layer of the decoder; the input of the Encoder-Decoder Attention module is the spatio-temporal correlation feature and the log semantic feature map.
10. The system log anomaly detection method of claim 9, wherein the loss function of the anomaly model is:
L = λL_boundary + (1 − λ)L_re;
where λ is the weight parameter balancing the adaptive boundary error L_boundary and the reconstruction error L_re;
wherein L_boundary is:
L_boundary = (1/|S|) Σ_{i=1}^{|S|} ||Y_i^[SIGN] − C||², with C = Mean({Y_i^[SIGN]}_{i∈S});
wherein Y_i^[SIGN] represents the semantic representation of the special word [SIGN] of the i-th log event sequence, C represents the center of all decoder outputs of the special word [SIGN] in the training set S, and Mean(·) is the calculated mean.
CN202310384406.3A 2023-04-11 2023-04-11 System log abnormality detection method Pending CN117034099A (en)

Publications (1): CN117034099A, published 2023-11-10.

Cited By (1): CN117909910A — Automatic detection method for system exception log based on graph attention network (Chengdu Technological University; priority date 2024-03-19, publication date 2024-04-19; * cited by examiner).



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination