CN114296975A - Distributed system call chain and log fusion anomaly detection method - Google Patents

Distributed system call chain and log fusion anomaly detection method

Info

Publication number
CN114296975A
Authority
CN
China
Prior art keywords
event
call chain
log
graph
span
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111583157.8A
Other languages
Chinese (zh)
Inventor
彭鑫
张晨曦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fudan University
Priority to CN202111583157.8A
Publication of CN114296975A
Legal status: Pending

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The invention belongs to the technical field of software engineering and cloud computing, and particularly relates to a distributed system call chain and log fusion anomaly detection method. The method takes the call chains and log data produced while a distributed system is running and constructs a call chain event relation graph from them. A graph neural network combined with a one-class deep learning method learns the patterns of the call chain event relation graphs produced during normal operation. In online use, each newly generated call chain event relation graph is checked in real time, and call chains that exhibit abnormal behavior are identified. The method comprises the following steps: log event parsing, call chain event parsing, event vectorization, call chain event relation graph construction, graph neural network model training, and online anomaly detection. The invention helps operation and maintenance personnel and developers find system anomalies quickly, generates the corresponding alarm information, speeds up fault localization and the resolution of online problems, and reduces labor cost.

Description

Distributed system call chain and log fusion anomaly detection method
Technical Field
The invention belongs to the technical field of software engineering and cloud computing, and particularly relates to a distributed system call chain and log anomaly detection method.
Background
A distributed system decomposes an application into a number of independent modules. Each module has its own process and runtime environment, and the processes communicate with one another over a network. The microservice architecture, which evolved from distributed systems, has become an important component of cloud-native technology. Thanks to fine-grained functional decomposition and a distributed runtime environment, microservices can be developed independently, deployed independently, and scaled elastically, and most enterprises now build their applications on a distributed or microservice architecture.
Anomaly detection is an important part of runtime monitoring. Fast and accurate anomaly detection helps reveal problems quickly and avoids the serious consequences of fault propagation. When a monolithic system behaves abnormally, developers can detect the anomaly through logs or metrics. A distributed system, however, involves cross-process interaction, so traditional log anomaly detection techniques perform poorly on it. Distributed tracing lets operation and maintenance personnel and developers observe the cross-process interaction patterns of a distributed system, but tracing systems focus on the interactions between processes, and logs and distributed traces are difficult to combine effectively. As a result, anomaly detection in distributed systems remains extremely difficult.
Disclosure of Invention
The invention aims to provide a distributed system call chain and log fusion anomaly detection method based on a graph neural network, which can quickly detect the abnormal operation behavior of a distributed system.
The method uses the call chains and log data produced while a distributed system is running to construct a call chain event relation graph, which describes the relation between system calls and runtime logs in the distributed system. Historical data is collected to train a one-class anomaly detection model based on a graph neural network. Once the model is deployed to the online system, abnormal call chains can be detected in real time, so that system problems are found quickly.
The invention mainly comprises six parts: log event parsing, call chain event parsing, event vectorization, call chain event relation graph construction, graph neural network model training, and online anomaly detection. The specific steps are as follows:
(1) Log event parsing. A log event is a system event represented by a log statement printed while the program is running. A log statement usually consists of a fixed part (the log template) and a variable part. This step parses the raw log data and uses log templates to represent the different log events. It comprises the following sub-steps:
1) Collect the logs produced while the system is running with a distributed log collection tool.
2) Parse the log data with a log template parsing algorithm (Drain) and take the log template of each log as its event description.
3) Extract the traceID and spanID of each log and associate them with that log.
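The three sub-steps above can be illustrated with a minimal sketch. The line layout `[traceID=...] [spanID=...] message` and the function names are assumptions made for illustration, and the regex masking is only a crude stand-in for a template miner such as Drain, not the Drain algorithm itself.

```python
import re

# Assumed line layout: "[traceID=...] [spanID=...] message". Real layouts vary
# with the logging configuration of the monitored system.
LINE_RE = re.compile(
    r"\[traceID=(?P<trace_id>[^\]]+)\]\s*\[spanID=(?P<span_id>[^\]]+)\]\s*(?P<message>.*)"
)


def to_template(message):
    """Mask variable tokens so that only the fixed part of the statement remains.

    This is a simplified substitute for Drain, used here only for illustration.
    """
    masked = re.sub(r"\b[0-9a-f]{8,}\b", "<*>", message)   # long hexadecimal ids
    masked = re.sub(r"\b\d+(?:\.\d+)*\b", "<*>", masked)   # integers, decimals, IPs
    return masked


def parse_log_line(line):
    """Return the log event (template plus trace/span ids), or None if unparsable."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    return {
        "trace_id": match.group("trace_id"),
        "span_id": match.group("span_id"),
        "template": to_template(match.group("message")),
    }


event = parse_log_line("[traceID=t1] [spanID=s2] Order 1024 paid in 35 ms")
print(event)
```

Two log lines that differ only in their variable parts map to the same template, which is what allows the template to serve as the event description.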
(2) Call chain event parsing. A call chain event is an event generated when the distributed system makes a cross-process call, such as a client sending a synchronous call request, a server receiving a synchronous call request, a producer producing an asynchronous call message, or a consumer consuming an asynchronous call message. This step parses the raw call chain data and divides it into different types of call chain events. It comprises the following sub-steps:
1) Parse each span of the Client/Server type into a request event and a response event, using "span type (Client/Server) + event type (Request/Response) + span name" as the event description. This yields four event types: ClientRequest, ServerRequest, ClientResponse, and ServerResponse. Also record the occurrence time of each event: the occurrence time of a request event is the start time of the span, and the occurrence time of a response event is the end time of the span.
2) Parse each span of the Producer/Consumer type into a Producer event or a Consumer event, using "span type (Producer/Consumer) + span name" as the event description. This yields two event types: Producer and Consumer. Also record the occurrence time of each event: the occurrence time of a Producer event is the start time of the Producer-type span, and the occurrence time of a Consumer event is the start time of the Consumer-type span.
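As an illustration of this step, the following sketch turns a span into its call chain events. The `Span` and `Event` field names are hypothetical; the text does not prescribe a data layout, and real tracing systems expose spans in their own formats.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]
    kind: str    # "Client", "Server", "Producer" or "Consumer"
    name: str
    start: int   # start time, e.g. epoch milliseconds
    end: int     # end time


@dataclass
class Event:
    span_id: str
    event_type: str
    description: str
    time: int


def span_to_events(span: Span) -> List[Event]:
    """Split one span into the call chain events described in step (2)."""
    if span.kind in ("Client", "Server"):
        # A synchronous span yields a request event at its start time
        # and a response event at its end time.
        return [
            Event(span.span_id, f"{span.kind}Request",
                  f"{span.kind} Request {span.name}", span.start),
            Event(span.span_id, f"{span.kind}Response",
                  f"{span.kind} Response {span.name}", span.end),
        ]
    if span.kind in ("Producer", "Consumer"):
        # An asynchronous span yields a single event at its start time.
        return [Event(span.span_id, span.kind, f"{span.kind} {span.name}", span.start)]
    return []  # other span kinds produce no call chain events in this sketch


events = span_to_events(Span("s1", None, "Client", "GET /orders", 100, 180))
print([(e.event_type, e.time) for e in events])
```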
(3) Event vectorization. Event vectorization maps log events and call chain events into a vector space and represents them as vectors, so that a deep learning model can process the events and the vectors reflect their semantic information. It comprises the following sub-steps:
1) Event description preprocessing. Remove stop words and non-alphabetic symbols from the event description, and split compound words.
2) Word embedding. Use a word embedding model to map each word in the event description into the same vector space and represent it as a vector.
3) Sentence embedding. Compute the vector of each event description from all the word vectors in that description. The word vectors are combined with TF-IDF weights, so that words that occur less frequently receive higher weights, which yields a sentence vector.
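A minimal sketch of the three sub-steps follows. The stop-word list, the two-dimensional toy embedding table, and the smoothed IDF variant `log(N / df) + 1` are assumptions chosen for illustration; in practice a pretrained word embedding model would supply the word vectors.

```python
import math
import re
from collections import Counter

STOP_WORDS = {"the", "a", "an", "to", "of", "in", "for"}  # illustrative subset


def preprocess(description):
    """Split camelCase compounds, drop non-letter symbols, remove stop words."""
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", description)
    words = re.findall(r"[A-Za-z]+", spaced)
    return [w.lower() for w in words if w.lower() not in STOP_WORDS]


def idf_table(corpus):
    """Inverse document frequency over a corpus of tokenized descriptions."""
    n = len(corpus)
    df = Counter(w for doc in corpus for w in set(doc))
    return {w: math.log(n / df[w]) + 1.0 for w in df}


def sentence_vector(words, idf, embed, dim):
    """TF-IDF weighted average of word vectors; rarer words weigh more."""
    tf = Counter(words)
    vec = [0.0] * dim
    total = 0.0
    for word, count in tf.items():
        if word not in embed:
            continue  # out-of-vocabulary words are skipped in this sketch
        weight = (count / len(words)) * idf.get(word, 1.0)
        total += weight
        for i, x in enumerate(embed[word]):
            vec[i] += weight * x
    return [x / total for x in vec] if total else vec


corpus = [preprocess("Client Request getOrder"), preprocess("Client Response getOrder")]
idf = idf_table(corpus)
embed = {"client": [1.0, 0.0], "request": [0.0, 1.0]}  # hypothetical embeddings
print(sentence_vector(["client", "request"], idf, embed, 2))
```

Because "request" appears in only one of the two descriptions, it receives a larger IDF than "client" and dominates the resulting sentence vector.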
(4) Call chain event relation graph construction. The call chain event relation graph associates the system calls with the log printing behavior of the distributed system at runtime, in order to describe the runtime behavior of the system. The raw data comprises the runtime call chain data and the runtime log data of each piece of distributed software or service. The graph contains two types of nodes, log event nodes and call chain event nodes. The relations between nodes, which correspond to the edges of the graph, are the sequence relation, the synchronous call relation, the synchronous response relation, and the asynchronous call relation. A typical call chain event relation graph is shown in FIG. 1. Constructing the graph comprises the following sub-steps:
1) Link log events. For each span in a call chain, retrieve all log events that belong to the span. Sort the log events by timestamp and add a sequence-relation edge from each log event to the next one.
2) Insert span events. For each span in a call chain, retrieve all call chain events that belong to the span. Insert each call chain event into the log event sequence according to its occurrence time, and add sequence-relation edges between it and its adjacent events.
3) Connect spans. For all spans in a call chain, connect the call chain events according to the parent-child relations between spans. For a span of the Client/Server type, add a synchronous-call edge from the ClientRequest event of the parent span to the ServerRequest event of the span, and add a synchronous-response edge from the ServerResponse event of the span to the ClientResponse event of the parent span. For a span of the Producer/Consumer type, add an asynchronous-call edge from the Producer event of the parent span to the Consumer event of the span.
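The construction can be sketched as follows. The dictionary-based event layout and the relation labels (`sequence`, `sync_call`, `sync_response`, `async_call`) are hypothetical names. Sub-steps 1 and 2 are merged here: sorting the log events and call chain events of a span together by time is assumed to give the same sequence as linking the logs first and inserting the span events afterwards.

```python
from collections import defaultdict


def build_graph(spans, events):
    """Build the edge list of a call chain event relation graph.

    spans:  {span_id: parent_span_id or None}
    events: list of dicts with keys "id", "span_id", "type", "time", where
            "type" is "Log" or a call chain event type such as "ServerRequest".
    Returns a list of (source_id, target_id, relation) tuples.
    """
    edges = []
    by_span = defaultdict(list)
    for event in events:
        by_span[event["span_id"]].append(event)

    # Sub-steps 1 and 2: order all events of a span by time and chain them.
    for span_events in by_span.values():
        span_events.sort(key=lambda e: e["time"])
        for prev, nxt in zip(span_events, span_events[1:]):
            edges.append((prev["id"], nxt["id"], "sequence"))

    def find(span_id, event_type):
        return next((e for e in by_span.get(span_id, []) if e["type"] == event_type), None)

    # Sub-step 3: connect each span to its parent span.
    rules = (
        # (child event, parent event, relation, edge points to the child?)
        ("ServerRequest", "ClientRequest", "sync_call", True),
        ("ServerResponse", "ClientResponse", "sync_response", False),
        ("Consumer", "Producer", "async_call", True),
    )
    for span_id, parent_id in spans.items():
        if parent_id is None:
            continue
        for child_type, parent_type, relation, to_child in rules:
            child = find(span_id, child_type)
            parent = find(parent_id, parent_type)
            if child and parent:
                pair = (parent["id"], child["id"]) if to_child else (child["id"], parent["id"])
                edges.append(pair + (relation,))
    return edges


spans = {"c": None, "s": "c"}
events = [
    {"id": "cq", "span_id": "c", "type": "ClientRequest", "time": 1},
    {"id": "cr", "span_id": "c", "type": "ClientResponse", "time": 10},
    {"id": "sq", "span_id": "s", "type": "ServerRequest", "time": 2},
    {"id": "l1", "span_id": "s", "type": "Log", "time": 3},
    {"id": "sr", "span_id": "s", "type": "ServerResponse", "time": 9},
]
for edge in build_graph(spans, events):
    print(edge)
```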
(5) Training the graph-neural-network-based anomaly detection model. The call chain event relation graphs generated during normal operation of the system are fed into a graph neural network to train a one-class anomaly detection model. The method trains the model with a gated graph neural network (GGNN) and deep support vector data description (Deep SVDD). The training data is fed into the gated graph neural network to obtain node vector representations; a soft attention mechanism then computes the vector representation $h_g$ of each call chain event relation graph from the node vectors; finally, the anomaly detection model is trained with Deep SVDD, which drives the gated graph neural network to learn effective graph vector representations such that most training vectors lie inside the same hypersphere and therefore correctly reflect normal call chain event relations.
Training the anomaly detection model comprises the following sub-steps:
1) Feed the training data into the gated graph neural network to obtain vector representations. The gated graph neural network processes each call chain event relation graph in turn and, through information propagation, produces a vector representation of every node of the graph. The input is a call chain event relation graph $g = (V, A, X)$, where $V$ is the set of nodes (events), $A$ is the adjacency matrix of the graph, and $X \in \mathbb{R}^{|V| \times d}$ is the node attribute matrix, each row $x_v$ of which holds the attributes (event vector) of node $v$.
The specific calculation method is as follows:
Figure BDA0003427585360000031
Figure BDA0003427585360000032
Figure BDA0003427585360000033
where
Figure BDA0003427585360000034
is the vector representation of a node after information propagation, and
Figure BDA0003427585360000035
denotes the rows and columns of the adjacency matrix that correspond to the outgoing and incoming edges of node v.
From the node vector representations, a soft attention mechanism computes the vector representation $h_g$ of each call chain event relation graph. The specific formula is as follows:
Figure BDA0003427585360000041
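The formulas of this sub-step survive only as image references in this text. For orientation, the block below writes out the propagation and soft attention readout of the standard gated graph neural network formulation. It is an assumption that the patent's formulas follow this standard form; the weight matrices $W$, $U$, the bias $b$, the number of propagation steps $T$, and the readout networks $i(\cdot)$ and $j(\cdot)$ are symbols introduced here and do not appear in the source.

```latex
\begin{align}
h_v^{(0)} &= \left[ x_v^{\top},\, \mathbf{0} \right]^{\top} \\
a_v^{(t)} &= A_{v:}^{\top} \left[ h_1^{(t-1)\top}, \ldots, h_{|V|}^{(t-1)\top} \right]^{\top} + b \\
z_v^{(t)} &= \sigma\!\left( W^{z} a_v^{(t)} + U^{z} h_v^{(t-1)} \right) \\
r_v^{(t)} &= \sigma\!\left( W^{r} a_v^{(t)} + U^{r} h_v^{(t-1)} \right) \\
\tilde{h}_v^{(t)} &= \tanh\!\left( W a_v^{(t)} + U \left( r_v^{(t)} \odot h_v^{(t-1)} \right) \right) \\
h_v^{(t)} &= \left( 1 - z_v^{(t)} \right) \odot h_v^{(t-1)} + z_v^{(t)} \odot \tilde{h}_v^{(t)} \\
h_g &= \tanh\!\left( \sum_{v \in V} \sigma\!\left( i\!\left( h_v^{(T)}, x_v \right) \right) \odot \tanh\!\left( j\!\left( h_v^{(T)}, x_v \right) \right) \right)
\end{align}
```

In this form $A_{v:}$ selects the rows and columns of the adjacency matrix that belong to the outgoing and incoming edges of node $v$, and the sigmoid term in the last line plays the role of the soft attention score of each node.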
2) Train the anomaly detection model with deep support vector data description. Deep SVDD drives the gated graph neural network to learn effective graph vector representations such that most training vectors lie inside the same hypersphere, and thus correctly reflect normal call chain event relations. The specific loss function is as follows:
Figure BDA0003427585360000042
where $c$ is the center of the hypersphere, $R$ is its radius, and the hyperparameter $\mu$ controls the proportion of call chain event relation graphs in the training set that lie outside the hypersphere.
During training, Adam is used to optimize the parameters of the gated graph neural network. Every $k$ epochs, the current optimal radius $R$ is found by line search; its value is the $(1-\mu)$ quantile of the distances from all samples in the current training set to the center of the hypersphere.
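The loss function itself is only an image reference here. The sketch below assumes the standard soft-boundary Deep SVDD objective, $R^2 + \frac{1}{\mu n}\sum_i \max(0, \lVert h_{g_i} - c \rVert^2 - R^2)$, with any weight regularization term left out, and shows the radius update as a nearest-rank quantile. Function names are hypothetical, and plain Python lists stand in for the graph vectors that the network would produce.

```python
import math


def squared_distances(vectors, center):
    """Squared Euclidean distance of every graph vector to the center c."""
    return [sum((x - c) ** 2 for x, c in zip(v, center)) for v in vectors]


def soft_boundary_loss(vectors, center, radius, mu):
    """Assumed soft-boundary objective: R^2 plus the penalized overshoot."""
    d2 = squared_distances(vectors, center)
    penalty = sum(max(0.0, d - radius ** 2) for d in d2) / (mu * len(d2))
    return radius ** 2 + penalty


def update_radius(vectors, center, mu):
    """Set R to the (1 - mu) quantile of the distances (nearest-rank method)."""
    dists = sorted(math.sqrt(d) for d in squared_distances(vectors, center))
    rank = max(0, math.ceil((1 - mu) * len(dists)) - 1)
    return dists[rank]


graph_vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 4.0]]
center = [0.0, 0.0]
radius = update_radius(graph_vectors, center, mu=0.25)
print(radius, soft_boundary_loss(graph_vectors, center, radius, mu=0.25))
```

With $\mu = 0.25$ and four samples, the radius is chosen so that one sample (a quarter of the training set) is left outside the hypersphere, which is the role the text assigns to $\mu$.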
(6) Online anomaly detection. The trained model is deployed in the system. When a new call chain is generated, steps (1) to (4) are executed in order to obtain the corresponding call chain event relation graph, which is fed into the anomaly detection model to obtain its vector representation. The distance between this point and the hypersphere, computed from the vector, serves as the anomaly score. The specific formula is as follows:
Figure BDA0003427585360000043
If the point lies inside the hypersphere, $\mathrm{ans}(h_g) < 0$ and the call chain is considered normal. If the point lies outside the hypersphere, $\mathrm{ans}(h_g) > 0$; the call chain is considered abnormal, and the system raises an alarm to notify operation and maintenance personnel. In addition, based on the soft attention mechanism of step (5), the system can visualize the abnormal call chain event relation graph and mark the nodes with high attention scores in a dark color.
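The scoring rule can be sketched as below. Since the score formula is only an image reference in this text, the sketch assumes the usual Deep SVDD form $\mathrm{ans}(h_g) = \lVert h_g - c \rVert^2 - R^2$, which matches the sign convention described above. The function names and the `top_k` parameter are hypothetical.

```python
def anomaly_score(graph_vector, center, radius):
    """Assumed score: squared distance to the center minus the squared radius."""
    d2 = sum((x - c) ** 2 for x, c in zip(graph_vector, center))
    return d2 - radius ** 2


def is_anomalous(graph_vector, center, radius):
    """A positive score means the point lies outside the hypersphere."""
    return anomaly_score(graph_vector, center, radius) > 0


def nodes_to_highlight(attention_scores, top_k=3):
    """Pick the node ids with the highest soft attention scores for display."""
    ranked = sorted(attention_scores.items(), key=lambda kv: kv[1], reverse=True)
    return [node_id for node_id, _ in ranked[:top_k]]


center, radius = [0.0, 0.0], 2.0
print(is_anomalous([1.0, 0.0], center, radius))   # inside the hypersphere
print(is_anomalous([3.0, 4.0], center, radius))   # outside the hypersphere
print(nodes_to_highlight({"sq": 0.1, "l1": 0.7, "sr": 0.2}, top_k=1))
```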
The main advantages and features of the invention are as follows:
(1) The invention detects log anomalies and call chain anomalies at the same time and reports system problems in real time for operation and maintenance personnel, which speeds up fault discovery and widens its scope.
(2) The invention uses a unified graph to jointly represent the log events and call chain events of a distributed system, and can support a variety of analysis techniques and applications.
(3) The invention trains the anomaly detection model with an unsupervised method. It needs only monitoring data from normal system operation, does not depend on fault data, and generalizes well.
(4) The method improves the accuracy of anomaly detection. Experiments on the open-source microservice benchmark system TrainTicket show that the method achieves an accuracy of 93% and a recall of 97%, and that its F1 score is on average 0.37 higher than those of other methods.
Drawings
FIG. 1 is a diagram illustrating call chain event relationships constructed in accordance with the present invention.
FIG. 2 is a flow chart of the method of the present invention.
Detailed Description
Taking as an example a distributed system that uses SkyWalking as its application performance monitoring platform and PyTorch as its deep learning framework, the graph-neural-network-based call chain and log fusion anomaly detection method is described further below.
For the training of the anomaly detection model, the specific process is as follows:
(1) Collect call chain and log data. Configure a SkyWalking agent for each program in the distributed system and set the call chain and log collection rules. The call chain and log data generated during normal operation of the system are collected and stored in Elasticsearch as training data.
(2) Construct the call chain event relation graph data set. Process the collected training data following steps (1) to (4) in order, construct the corresponding call chain event relation graph for each call chain, and use all the processed data as the graph data set for the deep learning model.
(3) Train the anomaly detection model. PyTorch is used to implement the deep support vector data description model based on the gated graph neural network. The model is trained on the graph data set for 100 epochs, with a 3-layer gated graph neural network, a learning rate of 0.0001, and $\mu = 0.05$.
For online anomaly detection, the specific flow is as follows:
(1) Trigger the anomaly detection process. The anomaly detection process is triggered when SkyWalking collects new call chain data.
(2) Construct the event relation graph. Process the call chain and its associated log data following steps (1) to (4) in order, and construct the corresponding call chain event relation graph.
(3) Detect anomalies. Feed the call chain event relation graph into the anomaly detection model to obtain its anomaly score. If the score is greater than 0, the call chain is output as an abnormal call chain, and operation and maintenance personnel troubleshoot the fault based on this output.
Fourteen common types of faults were injected into the open-source microservice benchmark system TrainTicket, and a comparative anomaly detection experiment was carried out. The F1 score of the method is on average 0.37 higher than that of existing log anomaly detection or call chain anomaly detection methods.

Claims (6)

1. A distributed system call chain and log fusion anomaly detection method based on a graph neural network, characterized in that a call chain event relation graph is constructed from the call chains and log data produced while the distributed system is running and is used to describe the relation between system calls and runtime logs in the distributed system; a one-class anomaly detection model based on a graph neural network is trained on collected historical data; the trained anomaly detection model is deployed to the online system to detect abnormal call chains in real time and find system problems quickly; the specific steps are as follows:
(1) parsing log events
The log event refers to a system event represented by a log statement printed during program running; one log statement consists of a fixed part, namely a log template, and a variable part; analyzing the log events, namely analyzing original log data, and representing different log events by using a log template;
(2) parsing call chain events
The call chain event refers to an event generated when the distributed system is called in a cross-process mode, and comprises a synchronous call request sent by a client, a synchronous call request received by a server, an asynchronous call message generated by a producer and an asynchronous call message consumed by a consumer; analyzing the call chain event, namely analyzing original call chain data, and dividing the original data into different types of call chain events;
(3) event vectorization
The event vectorization is to map the log events and the call chain events to a vector space and represent the log events and the call chain events by vectors, so that a deep learning model can process the events and can reflect semantic information of the events;
(4) constructing a call chain event relationship graph
The call chain event relation graph associates the system calls and the log printing behavior of the distributed system at runtime and is used to describe the runtime behavior state of the distributed system; the raw data comprises the runtime call chain data and runtime log data of each piece of distributed software or service; the call chain event relation graph comprises two types of nodes, namely log event nodes and call chain event nodes; the relations among the nodes comprise a sequence relation, a synchronous call relation, a synchronous response relation and an asynchronous call relation, and correspond to edges in the graph;
(5) training graph neural network model
Inputting the call chain event relation graph data generated during normal operation of the system into a graph neural network to train a one-class anomaly detection model; the method comprises the following steps: inputting training data into a gated graph neural network to obtain vector representations; obtaining a vector representation $h_g$ of each call chain event relation graph from the node vector representations by using a soft attention mechanism; training the anomaly detection model by using deep support vector data description; using the deep support vector data description to make the gated graph neural network learn effective graph vector representations, so that most training data vector representations lie inside the same hypersphere and the relations of normal call chain events are reflected correctly; recording the center of the hypersphere as $c$ and the radius as $R$;
(6) online anomaly detection
Deploying the trained model in a system, and when a new call chain is generated, sequentially executing the steps (1) to (4) to obtain a corresponding call chain event relation graph, and inputting the graph into an anomaly detection model to obtain vector representation of the graph; and calculating the distance between the point and the hypersphere as an abnormal score by using the vector, wherein a specific calculation formula is as follows:
Figure FDA0003427585350000021
if the point is located inside the hypersphere, $\mathrm{ans}(h_g) < 0$, and the call chain is considered a normal call chain; if the point is located outside the hypersphere, $\mathrm{ans}(h_g) > 0$, the call chain is considered abnormal, and the system generates an alarm to notify operation and maintenance personnel; meanwhile, based on the soft attention mechanism in step (5), the system visualizes the abnormal call chain event relation graph and marks the nodes with high attention scores in a dark color.
2. The method for detecting the abnormal fusion of the call chain and the log of the distributed system based on the graph neural network as claimed in claim 1, wherein the analyzing the log event in the step (1) specifically comprises the following sub-steps:
1) collecting logs in the running process of the system through a distributed log collecting tool;
2) analyzing the log data by using a log template analysis algorithm, and acquiring a log template corresponding to each log as an event description;
3) extracting the traceID and the spanID corresponding to each log, and associating them with that log.
3. The method for detecting the fusion anomaly of the call chain and the log of the distributed system based on the graph neural network as claimed in claim 2, wherein the step (2) of analyzing the call chain event specifically comprises the following substeps:
1) parsing each span of the Client/Server type into a request event and a response event, taking "span type (Client/Server) + event type (Request/Response) + span name" as the event description, to obtain four types of events: ClientRequest, ServerRequest, ClientResponse and ServerResponse; simultaneously recording the occurrence time of each event, wherein the occurrence time of the request event is the start time of the span, and the occurrence time of the response event is the end time of the span;
2) parsing each span of the Producer/Consumer type into a Producer event or a Consumer event, taking "span type (Producer/Consumer) + span name" as the event description, to obtain two types of events: Producer and Consumer; simultaneously recording the occurrence time of each event, wherein the occurrence time of the Producer event is the start time of the Producer-type span, and the occurrence time of the Consumer event is the start time of the Consumer-type span.
4. The method for detecting the fusion anomaly of the call chain and the log of the distributed system based on the graph neural network as claimed in claim 3, wherein the event vectorization in the step (3) specifically comprises the following sub-steps:
1) preprocessing the event description; removing stop words and non-character symbols in the event description, and splitting the combined words;
2) word embedding; mapping each word in the event description to the same vector space by using a word embedding model and representing the word in a vector;
3) sentence embedding; calculating the vector corresponding to each event description from all word vectors in that event description; combining the word vectors with TF-IDF weights, wherein words that occur less frequently receive higher weights, to obtain a sentence vector.
5. The method for detecting the fusion anomaly of the call chain and the log of the distributed system based on the graph neural network as claimed in claim 4, wherein the step (4) of constructing the call chain event relation graph specifically comprises the following sub-steps:
1) linking the log events; for each span in a call chain, acquiring all log events belonging to the span; sequencing the log events according to the time stamps, and adding an edge with a sequence relation between each log event and the next log event;
2) inserting a span event; for each span in a call chain, acquiring all call chain events belonging to the span; inserting each call chain event into a log event sequence according to the occurrence time, and adding an edge with a sequence relation with the adjacent events;
3) connecting spans; for all the spans in one call chain, connecting the call chain events according to the parent span relations of the spans; for a span of the Client/Server type, adding an edge of the synchronous call relation that points from the ClientRequest event of its parent span to its ServerRequest event, and adding an edge of the synchronous response relation that points from its ServerResponse event to the ClientResponse event of its parent span; for a span of the Producer/Consumer type, adding an edge of the asynchronous call relation that points from the Producer event of its parent span to its Consumer event.
6. The method for detecting the fusion abnormality of the call chain and the log of the distributed system based on the graph neural network as claimed in claim 5, wherein the training graph neural network model in the step (5) specifically comprises the following sub-steps:
1) inputting training data into a gated graph neural network to obtain vector representations; processing each call chain event relation graph in turn by using the gated graph neural network, and obtaining, based on information propagation, a vector representation of each node of each call chain event relation graph in a vector space; inputting a call chain event relation graph $g = (V, A, X)$, wherein $V$ represents the set of nodes, namely events; $A$ represents the adjacency matrix of the graph; $X \in \mathbb{R}^{|V| \times d}$ is the node attribute matrix, each row $x_v$ of which represents the attributes of node $v$; the specific calculation is as follows:
Figure FDA0003427585350000031
Figure FDA0003427585350000032
Figure FDA0003427585350000033
wherein
Figure FDA0003427585350000041
is the vector representation of a node after information propagation;
Figure FDA0003427585350000042
denotes the rows and columns of the adjacency matrix corresponding to the outgoing edges and incoming edges of node v;
obtaining a vector representation $h_g$ of each call chain event relation graph from the node vector representations by using a soft attention mechanism, the specific calculation formula being as follows:
Figure FDA0003427585350000043
2) training the anomaly detection model by using deep support vector data description; using the deep support vector data description to make the gated graph neural network learn effective graph vector representations, so that most training data vector representations lie inside the same hypersphere and the relations of normal call chain events are reflected correctly; the specific loss function is as follows:
Figure FDA0003427585350000044
wherein $c$ is the center of the hypersphere; $R$ is the radius of the hypersphere; the hyperparameter $\mu$ controls the proportion of the vector representations of call chain event relation graphs in the training set that lie outside the hypersphere;
during training, optimizing the parameters of the gated graph neural network by using Adam; finding the current optimal radius $R$ by line search every $k$ epochs, its value being the $(1-\mu)$ quantile of the distances from all samples in the current training set to the center of the hypersphere.
CN202111583157.8A 2021-12-22 2021-12-22 Distributed system call chain and log fusion anomaly detection method Pending CN114296975A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111583157.8A CN114296975A (en) 2021-12-22 2021-12-22 Distributed system call chain and log fusion anomaly detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111583157.8A CN114296975A (en) 2021-12-22 2021-12-22 Distributed system call chain and log fusion anomaly detection method

Publications (1)

Publication Number Publication Date
CN114296975A true CN114296975A (en) 2022-04-08

Family

ID=80969318

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111583157.8A Pending CN114296975A (en) 2021-12-22 2021-12-22 Distributed system call chain and log fusion anomaly detection method

Country Status (1)

Country Link
CN (1) CN114296975A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114647465A (en) * 2022-05-23 2022-06-21 南京航空航天大学 Single program splitting method and system for multi-channel attention-chart neural network clustering
CN114721860A (en) * 2022-05-23 2022-07-08 北京航空航天大学 Micro-service system fault positioning method based on graph neural network
CN117349740A (en) * 2023-11-01 2024-01-05 上海鼎茂信息技术有限公司 Micro-service architecture-oriented exception detection algorithm for fusing log and call chain data
WO2024027384A1 (en) * 2022-08-03 2024-02-08 华为技术有限公司 Fault detection method, apparatus, electronic device, and storage medium

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114647465A (en) * 2022-05-23 2022-06-21 南京航空航天大学 Single program splitting method and system for multi-channel attention-chart neural network clustering
CN114721860A (en) * 2022-05-23 2022-07-08 北京航空航天大学 Micro-service system fault positioning method based on graph neural network
CN114647465B (en) * 2022-05-23 2022-08-16 南京航空航天大学 Single program splitting method and system for multi-channel attention map neural network clustering
WO2024027384A1 (en) * 2022-08-03 2024-02-08 华为技术有限公司 Fault detection method, apparatus, electronic device, and storage medium
CN117349740A (en) * 2023-11-01 2024-01-05 上海鼎茂信息技术有限公司 Micro-service architecture-oriented exception detection algorithm for fusing log and call chain data

Similar Documents

Publication Publication Date Title
CN114296975A (en) Distributed system call chain and log fusion anomaly detection method
Liu et al. Unsupervised detection of microservice trace anomalies through service-level deep bayesian networks
WO2022007108A1 (en) Deep learning-based network alarm positioning method
CN112468347B (en) Security management method and device for cloud platform, electronic equipment and storage medium
CN111460728A (en) Method and device for predicting residual life of industrial equipment, storage medium and equipment
CN114785666B (en) Network troubleshooting method and system
CN110011990A (en) Intranet security threatens intelligent analysis method
CN107111609A (en) Lexical analyzer for neural language performance identifying system
CN112217674A (en) Alarm root cause identification method based on causal network mining and graph attention network
CN116225760A (en) Real-time root cause analysis method based on operation and maintenance knowledge graph
WO2022053163A1 (en) Distributed trace anomaly detection with self-attention based deep learning
CN115237717A (en) Micro-service abnormity detection method and system
CN111126437A (en) Abnormal group detection method based on weighted dynamic network representation learning
CN115858277A (en) Distributed system call chain anomaly detection method based on graph neural network and PU learning
Casalino et al. Incremental and adaptive fuzzy clustering for virtual learning environments data analysis
CN116955604A (en) Training method, detection method and device of log detection model
Yu et al. Self-supervised log parsing using semantic contribution difference
CN112882899B (en) Log abnormality detection method and device
CN114416423A (en) Root cause positioning method and system based on machine learning
Yu et al. Dram failure prediction in large-scale data centers
CN106227790A (en) A kind of method using Apache Spark classification and parsing massive logs
CN112039907A (en) Automatic testing method and system based on Internet of things terminal evaluation platform
CN115658546A (en) Software fault prediction method and system based on heterogeneous information network
CN114465875B (en) Fault processing method and device
JI et al. Log Anomaly Detection Through GPT-2 for Large Scale Systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination