WO2021109874A1 - Topology map generation method, anomaly detection method, apparatus, device, and storage medium - Google Patents

Topology map generation method, anomaly detection method, apparatus, device, and storage medium

Info

Publication number
WO2021109874A1
WO2021109874A1 PCT/CN2020/130033 CN2020130033W WO2021109874A1 WO 2021109874 A1 WO2021109874 A1 WO 2021109874A1 CN 2020130033 W CN2020130033 W CN 2020130033W WO 2021109874 A1 WO2021109874 A1 WO 2021109874A1
Authority
WO
WIPO (PCT)
Prior art keywords
event
preset
dependent
pair
stream
Prior art date
Application number
PCT/CN2020/130033
Other languages
English (en)
French (fr)
Inventor
韩静
刘建伟
董辛酉
刘峥
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Priority to US17/782,519 (US11797360B2)
Priority to EP20896927.9A (EP4071616A4)
Publication of WO2021109874A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/079 Root cause analysis, i.e. error or fault diagnosis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/542 Event management; Broadcasting; Multicasting; Notifications
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751 Error or fault detection not based on redundancy
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0766 Error or fault reporting or storing
    • G06F11/0784 Routing of error reports, e.g. with a specific transmission path or data flow
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9027 Trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/86 Event-based monitoring
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/865 Monitoring of software

Definitions

  • This application relates to the field of computer technology, for example, to a topology map generation method, anomaly detection method, device, equipment, and storage medium.
  • This application provides a topology map generation method, anomaly detection method, device, equipment, and storage medium.
  • An embodiment of the present application provides a method for generating a topology map, including: obtaining a preset event stream, wherein the preset event stream corresponds to a normal log execution path; determining dependent event pairs in the preset event stream; determining the transition interval range corresponding to each dependent event pair, wherein the transition interval represents the time difference between adjacent occurrences of the two events in the dependent event pair; and generating an event topology map according to the transition probability and the transition interval range corresponding to the dependent event pair, wherein the transition probability represents a conditional probability between the two events in the dependent event pair.
  • An embodiment of the present application provides an anomaly detection method, including: obtaining an event stream to be detected, wherein the event stream to be detected corresponds to a log execution path to be detected; comparing the event stream to be detected with an event topology map, wherein the event topology map is generated using the topology map generation method provided in the embodiments of the present application; and determining, according to the comparison result, whether there is an anomaly in the event stream to be detected.
  • An embodiment of the present application provides a topology map generation device, including: a preset event stream acquisition module, configured to acquire a preset event stream, wherein the preset event stream corresponds to a normal log execution path; a dependent event pair determination module, configured to determine dependent event pairs in the preset event stream; a transition interval range determination module, configured to determine the transition interval range corresponding to each dependent event pair, wherein the transition interval represents the time difference between adjacent occurrences of the two events in the dependent event pair; and a topology map generation module, configured to generate an event topology map according to the transition probability and the transition interval range corresponding to the dependent event pair, wherein the transition probability represents a conditional probability between the two events in the dependent event pair.
  • An embodiment of the present application provides an anomaly detection device, including: a to-be-detected event stream acquisition module, configured to acquire an event stream to be detected, wherein the event stream to be detected corresponds to a log execution path to be detected; a comparison module, configured to compare the event stream to be detected with an event topology map, wherein the event topology map is generated using the topology map generation method provided in the embodiments of the present application; and an anomaly detection module, configured to determine, according to the comparison result, whether there is an anomaly in the event stream to be detected.
  • An embodiment of the present application provides a computer device, including: a processor and a memory; the processor is configured to execute a program stored in the memory to implement any of the methods in the embodiments of the present application.
  • An embodiment of the present application provides a storage medium that stores a computer program; when the computer program is executed by a processor, any one of the methods in the embodiments of the present application is implemented.
  • FIG. 1 is a schematic flowchart of a method for generating a topology map provided by an embodiment of the application.
  • FIG. 2 is a schematic flowchart of yet another method for generating a topology map according to an embodiment of the application.
  • FIG. 3 is a schematic flowchart of an abnormality detection method provided by an embodiment of the application.
  • FIG. 4 is a schematic flowchart of yet another anomaly detection method provided by an embodiment of the application.
  • FIG. 5 is a structural block diagram of a topology diagram generating apparatus provided by an embodiment of the application.
  • FIG. 6 is a structural block diagram of an abnormality detection device provided by an embodiment of the application.
  • FIG. 7 is a structural block diagram of a computer device provided by an embodiment of this application.
  • Figure 1 is a schematic flow chart of a method for generating a topology map provided by an embodiment of the application.
  • The method can be executed by a topology map generation device, which can be implemented in software and/or hardware and can generally be integrated in a computer device.
  • The method includes the following steps.
  • Step 101: Obtain a preset event stream, where the preset event stream corresponds to a normal log execution path.
  • System logs are unstructured.
  • Log structures often differ, and a software system can perform multiple tasks in parallel, so the output system logs are often interleaved.
  • For system operators performing log operation and maintenance, system log data is intricate and voluminous, so it is often difficult to achieve satisfactory accuracy.
  • The related technology includes a workflow-based solution, but it lacks a solution that can accurately mine, from a large amount of intricate log information, a workflow graph representing the normal execution process of the system.
  • A preset event stream (also called a preset transaction stream) corresponding to a normal log execution path is first obtained, analysis and mining are then performed on the preset event stream, and an event topology map for log anomaly analysis is finally generated.
  • The map can accurately represent the normal execution process of the system.
  • A preset log stream (equivalent to an original log stream) corresponding to a normal log execution path may be obtained first, and the preset log stream is then converted into the corresponding preset event stream according to log templates.
  • A system log entry can be divided into two parts: a fixed part and a variable part.
  • The fixed part is the constant portion of the original log entry and does not change with the system state.
  • The variable part changes with the system state.
  • A log template is abstracted from the original log.
  • In a log template, the variable part can be represented by the placeholder *.
  • Each log template corresponds, as far as possible, to one log output statement; that is, each log template corresponds to an event or, in other words, to an event type.
  • Two or more logs in the preset log stream may correspond to the same log template; that is, the same event may occur two or more times in the preset log stream.
  • An event mentioned in the embodiments of the present application can be considered an event type, such as event A, and each occurrence of event A in the preset event stream can be considered an instance of event A.
  • The log templates may be pre-configured, or the logs in the unstructured preset log stream may be parsed into structured log templates through log parsing.
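The conversion from a log stream to an event stream described above can be sketched as follows. This is a minimal illustration, assuming pre-configured templates that use `*` as the placeholder for the variable part; the template strings, the event ids, and the helper names `template_to_regex` and `logs_to_events` are hypothetical and do not come from the application.

```python
import re

def template_to_regex(template):
    """Compile a log template into a regex; each '*' matches one variable field."""
    fixed_parts = [re.escape(part) for part in template.split("*")]
    return re.compile("^" + "(.+?)".join(fixed_parts) + "$")

def logs_to_events(logs, templates):
    """Map each log line to the id of the first template that matches it.

    `templates` maps an event id to a template string (both illustrative).
    Lines that match no template are mapped to None.
    """
    compiled = [(event_id, template_to_regex(t)) for event_id, t in templates.items()]
    return [
        next((event_id for event_id, rx in compiled if rx.match(line)), None)
        for line in logs
    ]

# Hypothetical templates and logs, for illustration only.
templates = {
    "e1": "Connection from * opened",
    "e2": "Received * bytes from *",
    "e3": "Connection from * closed",
}
logs = [
    "Connection from 10.0.0.1 opened",
    "Received 512 bytes from 10.0.0.1",
    "Received 128 bytes from 10.0.0.1",
    "Connection from 10.0.0.1 closed",
]
event_stream = logs_to_events(logs, templates)
print(event_stream)
```

Two log lines that share a template map to the same event id, which is how the same event comes to occur several times in the event stream.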
  • Step 102: Determine dependent event pairs in the preset event stream.
  • An event pair is the basic unit of an event flow graph.
  • The dependency relationship between two events is characterized by the temporal relationship between them.
  • For example, the event pair (A, B) means that event B often occurs after event A occurs.
  • A dependent event pair can be regarded as an event pair that satisfies a set dependency relationship.
  • The earlier event in a dependent event pair can be called the predecessor event, and the later event can be called the successor event.
  • The dependent event pairs in the preset event stream can be determined according to the conditional probabilities of the two events.
  • Each determined dependent event pair may be added to a dependent event pair set.
  • Step 103: Determine the transition interval range corresponding to each dependent event pair, where the transition interval represents the time difference between adjacent occurrences of the two events in the dependent event pair.
  • The transition interval, that is, the transition time interval, can effectively reflect the volatility of the system's event transition time.
  • The time differences between the events of each dependent event pair in the preset event stream can be analyzed to determine the transition interval range.
  • The transition interval range may represent the normal range of the time difference between adjacent occurrences of the two events in the dependent event pair; the method for determining the transition interval range is not limited.
  • Step 104: Generate an event topology map according to the transition probability and the transition interval range corresponding to each dependent event pair, where the transition probability represents a conditional probability between the two events in the dependent event pair.
  • An event topology graph (ETG) with a set structure may be generated according to the two dimensions of the transition probability and the transition interval range corresponding to the dependent event pairs.
  • The set structure may be, for example, a tree structure.
  • The event topology graph includes multiple nodes; each node represents an event in a dependent event pair, and the connection between two nodes carries the transition probability and the transition interval range corresponding to the dependent event pair represented by those two nodes. That is, in the event topology graph, the nodes are the events in the dependent event pairs, and the transition probability and transition interval range of each dependent event pair are marked on the line connecting its two nodes.
  • The generated event topology graph contains reference information about the normal log event stream in two dimensions: the conditional probability of event occurrence and the time interval between occurrences. When the graph is used for log anomaly detection, detection can be performed in both dimensions, which improves detection accuracy.
  • The topology map generation method provided by the embodiment of the present application obtains the preset event stream corresponding to the normal log execution path, determines the dependent event pairs in the preset event stream, determines the transition interval range corresponding to each dependent event pair, and generates an event topology map according to the transition probability and transition interval range corresponding to each dependent event pair, where the transition probability represents the conditional probability between the two events.
  • Because the generated event topology map contains reference information about the normal log event stream in both the conditional probability dimension and the time interval dimension, log anomaly detection based on it can check both dimensions, which improves detection accuracy.
  • Generating an event topology map according to the transition probability and the transition interval range corresponding to the dependent event pair includes: generating a maximum spanning tree by taking the events included in the dependent event pairs as nodes and the transition probability corresponding to each dependent event pair as the weight of the edge between its nodes; and adding the transition interval range corresponding to each dependent event pair to the corresponding edge in the maximum spanning tree to obtain the event topology map.
  • A spanning tree of a graph is a subgraph that contains all nodes and is a tree, and a maximum spanning tree is the spanning tree of a weighted graph with the largest total weight.
  • The maximum spanning tree can be generated with the maximum transition probability between paths as the objective function, and the corresponding transition interval range is added to each edge of the maximum spanning tree; that is, for the edge between two nodes, the transition interval range of the dependent event pair corresponding to those two nodes is marked on the edge.
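A maximum spanning tree over dependent event pairs, with the transition interval range attached to each retained edge, might be built as in the sketch below. It assumes a Kruskal-style construction that treats edges as undirected when checking for cycles; the function name, the dictionary layout, and the example probabilities and ranges are illustrative assumptions, not values from the application.

```python
def maximum_spanning_tree(pairs):
    """Build a maximum spanning tree from dependent event pairs.

    `pairs` maps (predecessor, successor) to (transition_probability,
    (interval_min, interval_max)). Edges are visited from highest to lowest
    probability and kept only if they join two separate components.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    tree = {}
    by_probability = sorted(pairs.items(), key=lambda kv: kv[1][0], reverse=True)
    for (pred, succ), (prob, interval) in by_probability:
        root_pred, root_succ = find(pred), find(succ)
        if root_pred != root_succ:
            parent[root_pred] = root_succ
            # Each tree edge carries both dimensions of the topology map.
            tree[(pred, succ)] = {"probability": prob, "interval": interval}
    return tree

# Hypothetical dependent event pairs, for illustration only.
pairs = {
    ("A", "B"): (0.9, (1, 3)),
    ("B", "C"): (0.8, (2, 5)),
    ("A", "C"): (0.3, (4, 9)),
    ("C", "D"): (0.7, (1, 2)),
}
etg = maximum_spanning_tree(pairs)
for edge, labels in etg.items():
    print(edge, labels)
```

In this example the low-probability pair (A, C) is left out because A and C are already connected through B.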
  • Obtaining a preset event stream includes: obtaining a preset log stream, wherein the preset log stream corresponds to a normal log execution path; parsing the logs in the preset log stream with a preset log parsing algorithm to obtain multiple log templates, where each log template corresponds to an event; and converting the preset log stream into the corresponding preset event stream according to the multiple log templates.
  • The advantage of this setting is that the log templates are generated from the preset log stream, so they match the logs in the preset log stream better and the corresponding event stream can be obtained more accurately.
  • The preset log parsing algorithm may be, for example, Basic Signature Generation (BSG), LKE, or Iterative Partitioning Log Mining (IPLoM).
  • The preset log stream can be expressed as $\langle l_1, l_2, \ldots, l_n \rangle$, assumed to be as shown in Table 1:
  • Each log template represents a type of event, such as the one in Table 1.
  • The log templates obtained after log parsing of the preset log stream are shown in Table 2:
  • Event_id represents the corresponding event type.
  • The transaction stream corresponding to the preset log stream $l_1, l_2, l_3, l_4, l_5, l_6$ is $e_1, e_2, e_2, e_3, e_4, e_4$.
  • Determining the dependent event pairs in the preset event stream includes: for each event appearing in the preset event stream, determining a set of candidate successor events corresponding to the current event; judging whether the current event and each candidate successor event in that set satisfy a first preset dependency relationship; determining each candidate successor event that satisfies the first preset dependency relationship to be a successor event; and adding the successor event to a successor event set, where the current event and each successor event form a dependent event pair.
  • The advantage of this setting is that each event is treated in turn as a predecessor event: its candidate successor event set is determined first, and each event in that set is then judged as to whether it can become a successor event. This improves the efficiency of determining dependent event pairs.
  • Determining the set of candidate successor events corresponding to the current event includes: adding the first events, that is, the events that exist between every two adjacent occurrences of the current event in the preset event stream, to an initial candidate successor event set; calculating the conditional probability of the current event and each first event; and removing each second event from the initial candidate successor event set to obtain the candidate successor event set corresponding to the current event, where the conditional probability of the current event and a second event is less than a preset conditional probability threshold.
  • The conditional probability can also be called the correlation probability.
  • The preset conditional probability threshold can be set according to the actual situation. The advantage of this setting is that the conditional probability can be used to filter out noise events, removing the indirect successors of the predecessor event as far as possible while retaining its direct successors, which improves the accuracy of determining dependent event pairs.
  • The following formula may be used to calculate the conditional probability of the current event and each first event:
  • A represents the current event, that is, the predecessor event;
  • B represents a first event in the initial candidate successor event set corresponding to A;
  • $p_A$ is the probability of event A;
  • $p_B$ is the probability of event B;
  • the remaining term, whose symbol ends in "B)", is the number of occurrences of event B in the initial candidate successor event set corresponding to event A, that is, the number of occurrences of the first event between every two adjacent occurrences of the current event.
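The candidate-filtering step can be illustrated with the sketch below. Because the formula itself is not reproduced in this text, the sketch substitutes a simple estimate, namely the fraction of A-to-A intervals that contain event B at least once. That estimator, the function name `candidate_successors`, and the example stream are assumptions made for illustration and should not be read as the application's formula.

```python
from collections import Counter

def candidate_successors(stream, current, threshold):
    """Return candidate successors of `current` with an estimated probability.

    Events found between two adjacent occurrences of `current` form the
    initial candidate set. An event is kept if the fraction of intervals in
    which it appears (an assumed stand-in for the conditional probability)
    is at least `threshold`.
    """
    positions = [i for i, event in enumerate(stream) if event == current]
    intervals = [stream[a + 1:b] for a, b in zip(positions, positions[1:])]
    if not intervals:
        return {}
    hits = Counter()
    for interval in intervals:
        hits.update(set(interval))  # count each event once per interval
    estimates = {event: n / len(intervals) for event, n in hits.items()}
    return {event: p for event, p in estimates.items() if p >= threshold}

# Hypothetical event stream, for illustration only.
stream = ["A", "B", "C", "A", "B", "A", "B", "D", "A"]
candidates = candidate_successors(stream, "A", threshold=0.5)
print(candidates)
```

Here B appears in every interval between adjacent occurrences of A and is kept, while C and D each appear in only one of three intervals and are filtered out as noise.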
  • Judging whether the current event and each candidate successor event in the set of candidate successor events corresponding to the current event satisfy a first preset dependency relationship includes: for each candidate successor event in that set, calculating the unconditional distribution of the waiting time of the current candidate successor event; calculating the conditional distribution of the waiting time of the current candidate successor event relative to the current event; and determining, according to the unconditional distribution and the conditional distribution, whether the first preset dependency relationship is satisfied between the current event and the current candidate successor event, wherein the waiting time represents the time difference between the occurrence time of the current event and the occurrence time of the current candidate successor event.
  • The advantage of this setting is that dependent event pairs can be determined more accurately.
  • Let $S_A$ be the sequence of occurrence times of event A, with values in the range $[0, T]$. For a given time point $z$, the waiting time is the minimum forward distance between $z$ and $S_A$: $d(z, S_A) = \min_{x \in S_A,\, x \ge z} \lVert x - z \rVert$.
  • The unconditional distribution of the waiting time of event B is $F_B(r) = P(d(z, S_B) \le r)$, where $r$ is the threshold parameter of the time interval and $z$ is an arbitrary time point.
  • The conditional distribution of the waiting time of event B relative to event A is $F_{B|A}(r) = P(d(z, S_B) \le r \mid z \in S_A)$, where $z$ is an arbitrary point in the sequence $S_A$.
  • $F_{B|A}$ describes the distribution of the waiting time of event B given that event A occurs at the time point $z$.
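The two waiting-time distributions can be estimated empirically, as in the sketch below. It assumes sorted occurrence times, evaluates the unconditional distribution on an integer grid of time points (a sampling choice made here for illustration), and only computes the distributions; the decision rule that uses them is not implemented. All names and values are hypothetical.

```python
from bisect import bisect_left

def waiting_time(z, times):
    """d(z, S): distance from z to the first occurrence at or after z, else None."""
    i = bisect_left(times, z)
    return times[i] - z if i < len(times) else None

def waiting_cdf(points, times_b, r):
    """Fraction of the given time points whose waiting time for B is at most r.

    Points with no later occurrence of B are ignored.
    """
    waits = [waiting_time(z, times_b) for z in points]
    waits = [w for w in waits if w is not None]
    if not waits:
        return 0.0
    return sum(w <= r for w in waits) / len(waits)

# Hypothetical occurrence times: B follows A by one time unit.
s_a = [0, 10, 20, 30]
s_b = [1, 11, 21, 31]

f_b = waiting_cdf(range(40), s_b, r=2)      # unconditional: z is any grid point
f_b_given_a = waiting_cdf(s_a, s_b, r=2)    # conditional: z ranges over S_A
print(f_b, f_b_given_a)
```

When B depends on A, the conditional distribution concentrates on short waiting times, so its value at a small threshold is well above the unconditional one.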
  • Determining, according to the unconditional distribution and the conditional distribution, whether the current event and the current candidate successor event satisfy a first preset dependency relationship includes: in a case where the conditional distribution conforms to a normal distribution, determining that the current event and the current candidate successor event satisfy the first preset dependency relationship.
  • The advantage of this setting is that whether two events form a dependent event pair can be determined quickly and accurately.
  • Determining the transition interval range corresponding to the dependent event pair includes: for each dependent event pair, obtaining the time difference sequence corresponding to the current dependent event pair in the preset event stream, clustering the time difference sequence, and determining the transition interval range corresponding to the current dependent event pair according to the time distribution within the clusters, wherein the time difference sequence consists of the time differences between adjacent occurrences of the two events in the current dependent event pair.
  • For example, the time difference for the first adjacent occurrence of the two events is calculated and becomes the first element in the time difference sequence;
  • the time difference for the second adjacent occurrence is calculated and becomes the second element in the time difference sequence; and so on, until the time difference sequence corresponding to the dependent event pair is obtained.
  • Determining the transition interval range corresponding to the current dependent event pair according to the time distribution within the clusters includes: determining the transition interval range according to the maximum value and the minimum value in a cluster; or determining the transition interval range according to the confidence interval of the time distribution in a cluster.
  • In this way, a sequence of transition intervals is obtained for each dependent event pair. For example, for each mined dependent event pair $\langle T_i, T_j \rangle$, all adjacent occurrences of $T_i$ and $T_j$ are found in the preset event stream, and the time differences between all adjacent $T_i$ and $T_j$ are recorded as the sequence $\langle t_1, t_2, \ldots, t_m \rangle$.
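Extracting the time difference sequence for one dependent event pair can be sketched as follows. The sketch assumes timestamped events sorted by time and interprets "adjacent" as pairing each successor with the most recent unmatched predecessor; that interpretation, the function name, and the sample data are assumptions for illustration.

```python
def time_differences(events, predecessor, successor):
    """Collect time differences between adjacent occurrences of a dependent pair.

    `events` is a list of (timestamp, event_id) tuples sorted by timestamp.
    Each successor is paired with the most recent predecessor that has not
    yet been matched; a newer predecessor replaces an unmatched older one.
    """
    differences = []
    pending = None  # timestamp of the latest unmatched predecessor
    for timestamp, event_id in events:
        if event_id == predecessor:
            pending = timestamp
        elif event_id == successor and pending is not None:
            differences.append(timestamp - pending)
            pending = None
    return differences

# Hypothetical timestamped event stream, for illustration only.
events = [(0, "A"), (2, "B"), (5, "A"), (6, "C"), (8, "B"), (10, "A"), (11, "B")]
sequence = time_differences(events, "A", "B")
print(sequence)
```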
  • A clustering algorithm is applied to the time difference sequence; clustering can remove redundant events.
  • Clustering algorithms that can be used include AGglomerative NESting (AGNES), DIvisive ANAlysis (DIANA), and Density-Based Spatial Clustering of Applications with Noise (DBSCAN). The maximum and minimum values of each cluster are taken as the time interval range of the event pairs belonging to that cluster.
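Taking the minimum and maximum of each cluster as the interval range can be sketched with a simple one-dimensional gap-based clustering. This is a deliberately small stand-in for the algorithms named above (AGNES, DIANA, DBSCAN), not an implementation of any of them; the parameters `eps` and `min_size` and the sample values are assumptions.

```python
def cluster_1d(values, eps):
    """Group sorted values into clusters, splitting where the gap exceeds eps."""
    clusters = []
    for value in sorted(values):
        if clusters and value - clusters[-1][-1] <= eps:
            clusters[-1].append(value)
        else:
            clusters.append([value])
    return clusters

def interval_ranges(values, eps, min_size=2):
    """Return (min, max) for each cluster with at least `min_size` members.

    Smaller clusters are treated as noise and dropped.
    """
    return [
        (cluster[0], cluster[-1])
        for cluster in cluster_1d(values, eps)
        if len(cluster) >= min_size
    ]

# Hypothetical time differences with one outlier, for illustration only.
differences = [1.0, 1.2, 0.9, 1.1, 9.5, 1.3]
ranges = interval_ranges(differences, eps=0.5)
print(ranges)
```

The isolated value 9.5 forms a single-element cluster and is discarded, so the interval range reflects only the typical transition times.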
  • Determining the transition interval range corresponding to the current dependent event pair according to the time distribution within the clusters includes: testing the current dependent event pair with a preset statistical test method and, if the test is passed, determining the transition interval range corresponding to the current dependent event pair according to the time distribution within the clusters.
  • The advantage of this setting is that the dependency relationship of the dependent event pair can be verified in time.
  • The preset statistical test method can be, for example, a chi-square test, a z-test, or a t-test. Exemplarily, each cluster is tested with a chi-square test. If each cluster passes the chi-square test, a dependency relationship between the events of the pair is confirmed, and the time interval calculated in the preceding steps is used as the transition time interval range of the event pair.
  • The method further includes: for the two nodes corresponding to any dependent event pair in the maximum spanning tree, calculating the detour probability that a detour path exists between the current two nodes, and, if the detour probability is greater than a preset detour probability threshold, completing the edge between the current two nodes.
  • $d(E_1, E_2) = \log(1 + \mathrm{path}(E_1, E_2))$; whether to fill in the path from $E_1$ to $E_2$ is determined by setting an appropriate preset detour probability threshold according to the log sequence.
  • Completing the edge between the current two nodes includes: taking the sum of the weights of all edges traversed by the target detour path between the current two nodes as the weight of the edge between the current two nodes, where the target detour path is the path with the largest sum of the weights of all traversed edges.
  • FIG. 2 is a schematic flow chart of another method for generating a topology map according to an embodiment of the application. As shown in FIG. 2, the method includes the following steps.
  • Step 201: Obtain a preset log stream, and convert the preset log stream into the corresponding preset event stream.
  • Step 202: For each event that appears in the preset event stream, determine the set of candidate successor events corresponding to the current event, judge whether the current event and each candidate successor event in that set satisfy the first preset dependency relationship, and determine the dependent event pairs according to the judgment result.
  • The first events that exist between every two adjacent occurrences of the current event in the preset event stream are added to the initial candidate successor event set, the conditional probability of the current event and each first event is calculated, and each second event, whose conditional probability is less than the preset conditional probability threshold, is removed from the initial candidate successor event set to obtain the candidate successor event set corresponding to the current event.
  • For each candidate successor event, the unconditional distribution of the waiting time of the current candidate successor event is calculated, and the conditional distribution of the waiting time of the current candidate successor event relative to the current event is calculated. In the case that the unconditional distribution and the conditional distribution conform to the normal distribution, it is determined that the current event and the current candidate successor event satisfy the first preset dependency relationship, that is, that they form a dependent event pair.
  • Step 203: For each dependent event pair, obtain the time difference sequence corresponding to the current dependent event pair in the preset event stream, cluster the time difference sequence, and determine the transition interval range corresponding to the current dependent event pair according to the maximum and minimum values in the cluster.
  • After the time difference sequence is clustered, the cluster corresponding to the current dependent event pair can also be tested with the chi-square test. If the test passes, the transition interval range corresponding to the current dependent event pair is determined according to the maximum and minimum values in that cluster. Optionally, if the test fails, the current dependent event pair can be considered to have no dependency relationship in time and is deleted from the dependent event pair set.
  • Step 204: Use the events included in the dependent event pairs as nodes, and use the transition probability corresponding to each dependent event pair as the weight of the edge between its nodes, to generate a maximum spanning tree.
  • Each node in the tree represents an event, the weight on each edge represents the transition probability between the connected predecessor event and successor event, and the spanning tree serves as the skeleton of the entire workflow.
  • Available spanning tree algorithms include Prim and Kruskal. These algorithms can be modified, for example by using the maximum transition probability between paths as the objective function, to generate the maximum spanning tree.
  • Step 205: For the two nodes corresponding to any dependent event pair in the maximum spanning tree, calculate the detour probability that a detour path exists between the current two nodes; if the detour probability is greater than the preset detour probability threshold, the sum of the weights of all edges traversed by the target detour path between the current two nodes is used as the weight of the edge between the current two nodes for edge completion.
  • Step 206: Add the corresponding transition interval range to each edge in the maximum spanning tree that has undergone edge completion to obtain an event topology map for log anomaly detection.
  • The topology graph generation method provided by the embodiment of the present application obtains the preset log stream corresponding to the normal log execution path, converts the preset log stream into a preset event stream, mines the dependent event pairs in the preset event stream, and determines the transition interval range corresponding to each dependent event pair by clustering. The events contained in the dependent event pairs are used as nodes, and the transition probabilities corresponding to the dependent event pairs are used as the weights of the edges between nodes, to generate the maximum spanning tree. After the missing paths are completed, the corresponding transition interval ranges are added to the edges of the tree to obtain the event topology graph for log anomaly detection. With this technical solution, the generated tree-shaped event topology graph contains reference information about the normal log event stream in two dimensions: the conditional probability of event occurrence and the time interval between occurrences. When used for log anomaly detection, it can improve both detection accuracy and detection efficiency.
  • FIG. 3 is a schematic flowchart of an abnormality detection method provided by an embodiment of the application.
  • the method can be executed by an abnormality detection device, where the device can be implemented by software and/or hardware, and generally can be integrated in a computer device. As shown in Figure 3, the method includes the following steps.
  • Step 301 Obtain an event stream to be detected, where the event stream to be detected corresponds to the log execution path to be detected.
  • the event stream to be detected may be a conversion of a log stream newly generated by the system, or may be a conversion of a log stream generated in history that requires anomaly detection.
  • the log template used when generating the event topology map can be used to convert the log stream to be detected into the corresponding event stream to be detected.
  • Step 302 Compare the event flow to be detected with the event topology graph.
  • the event topology map is generated using the topology map generation method provided in the embodiment of the present application.
  • The generation of the event topology graph can be regarded as the offline stage of anomaly detection. After a high-quality event topology graph is generated, it can be used to represent the normal execution path of the system. In the online stage, the event stream to be detected is compared with the event topology graph to analyze anomalies.
  • Step 303 Determine whether there is an abnormality in the event stream to be detected according to the comparison result.
  • The anomaly detection method provided by the embodiment of the present application compares the event stream to be detected, which corresponds to the log execution path to be detected, with the event topology graph generated by the topology graph generation method provided in the embodiment of the present application. According to the comparison result, it can quickly and accurately detect whether the event stream to be detected is abnormal, which improves the accuracy and efficiency of log anomaly detection.
  • The event topology graph includes a plurality of nodes, each node represents an event in a dependent event pair, and the connection between two nodes carries the transition probability and transition interval range corresponding to the dependent event pair represented by the two nodes. Comparing the event stream to be detected with the event topology graph, and determining whether the event stream to be detected is abnormal according to the comparison result, includes: for the current event in the event stream to be detected, searching the event topology graph for the corresponding target event; and, if the next event after the current event does not correspond to a child node of the target event, determining that the event stream to be detected is abnormal.
  • the target event can be understood as an event of the same type as the current event that exists in the event topology diagram. If the next event of the current event is the same as the event type on any child node of the target event, it is considered that the next event of the current event corresponds to the child node of the target event.
  • a child node of a node can be understood as a node connected to and behind the node. Taking the tree structure as an example, the child nodes of a node are branch nodes of the node. The advantage of this setting is that, based on the conditional probability of two consecutive events, it is possible to verify whether there is an abnormality in the sequence of events in the event stream to be detected, so as to further quickly and accurately perform anomaly detection.
  • The method further includes: if the next event after the current event corresponds to a first child node of the target event, obtaining a first time interval between the current event and the next event; obtaining the transition interval range corresponding to the target event and the first child node; and, if the first time interval is not within the transition interval range, determining that the event stream to be detected is abnormal.
  • the advantage of this setting is that after the verification based on the conditional probability of two consecutive events is passed, it is detected whether the time interval between the occurrence of the two events is within a reasonable range, thereby improving the accuracy of anomaly detection.
  • Figure 4 is a schematic flow diagram of an anomaly detection method provided by an embodiment of the application.
  • As shown in Figure 4, the transaction stream e1, e2, ..., en is read, that is, the event stream to be detected is obtained. For the current event ei, determine whether ei exists in the event topology graph. If ei is not in the event topology graph, take the next event in the transaction stream as the new current event and repeat the judgment. If ei is in the event topology graph, determine whether the next event ei+1 is among the child nodes of ei in the event topology graph. If ei+1 is not among the child nodes of ei, report that the execution path is abnormal. If ei+1 is among the child nodes of ei, determine whether the interval between ei and ei+1 is within the specified time interval, that is, within the corresponding transition interval range. If the interval is not within the specified time interval, report that the execution path is abnormal. If it is, determine whether there are other events in the transaction stream: if so, use ei+1 as the new ei and repeat the judgment; if not, the process ends.
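The online check described for Figure 4 can be sketched as a single pass over consecutive event pairs. This is only an illustration under stated assumptions: the event topology graph is modelled as a hypothetical dictionary mapping each node to its child nodes and their (minimum, maximum) transition interval, and `detect_anomaly` is an invented name.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical event topology graph: node -> {child: (min_interval, max_interval)}.
Graph = Dict[str, Dict[str, Tuple[float, float]]]


def detect_anomaly(stream: List[Tuple[str, float]], graph: Graph) -> Optional[str]:
    """Return a description of the first anomaly, or None if the stream is normal.

    `stream` is a list of (event, timestamp) tuples ordered by time.
    """
    for (cur, t_cur), (nxt, t_nxt) in zip(stream, stream[1:]):
        if cur not in graph:
            continue  # event not in the graph: move on to the next event
        children = graph[cur]
        if nxt not in children:
            return f"{nxt} is not a child of {cur}"
        low, high = children[nxt]
        if not low <= t_nxt - t_cur <= high:
            return f"interval {cur}->{nxt} outside [{low}, {high}]"
    return None


graph = {"A": {"B": (0.5, 2.0)}, "B": {"C": (1.0, 3.0)}, "C": {}}
normal = [("A", 0.0), ("B", 1.0), ("C", 3.0)]
wrong_order = [("A", 0.0), ("C", 1.0)]
too_slow = [("A", 0.0), ("B", 5.0)]
```

Here `normal` passes both checks, `wrong_order` fails the child-node check, and `too_slow` fails the transition interval check.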
  • FIG. 5 is a structural block diagram of a topology diagram generating device provided by an embodiment of the application.
  • The device can be implemented by software and/or hardware, can generally be integrated in a computer device, and can generate an event topology graph by executing the topology graph generation method.
  • As shown in FIG. 5, the device includes: a preset event stream obtaining module 501, configured to obtain a preset event stream, where the preset event stream corresponds to a normal log execution path; a dependent event pair determining module 502, configured to determine the dependent event pairs in the preset event stream; a transition interval range determining module 503, configured to determine the transition interval range corresponding to each dependent event pair, where the transition interval represents the time difference between the occurrence times of the two events in the dependent event pair; and a topology graph generation module 504, configured to generate an event topology graph according to the transition probability and transition interval range corresponding to the dependent event pair, where the transition probability represents the conditional probability between the two events in the dependent event pair.
  • The topology graph generation device provided in the embodiment of the present application obtains the preset event stream corresponding to the normal log execution path, determines the dependent event pairs in the preset event stream, determines the transition interval range corresponding to each dependent event pair, and generates an event topology graph according to the transition probability and transition interval range corresponding to the dependent event pairs, where the transition probability represents the conditional probability between two events. With this technical solution, the generated event topology graph contains reference information about the normal log event stream in two dimensions, the conditional probability of event occurrence and the time interval between occurrences, so that log anomaly detection can be performed in both dimensions, improving detection accuracy.
  • The event topology graph includes a plurality of nodes, each node represents an event in a dependent event pair, and the connection between two nodes carries the transition probability and transition interval range corresponding to the dependent event pair represented by the two nodes.
  • Generating the event topology graph according to the transition probability and the transition interval range corresponding to the dependent event pair includes: generating a maximum spanning tree with the events included in the dependent event pairs as nodes and the transition probabilities corresponding to the dependent event pairs as the weights of the edges between nodes; and adding the transition interval range corresponding to each dependent event pair to the corresponding edge in the maximum spanning tree to obtain the event topology graph.
  • Obtaining the preset event stream includes: obtaining a preset log stream, where the preset log stream corresponds to a normal log execution path; parsing the logs in the preset log stream with a preset log parsing algorithm to obtain multiple log templates, where each log template corresponds to an event; and converting the preset log stream into the corresponding preset event stream according to the multiple log templates.
  • Determining the dependent event pairs in the preset event stream includes: for each event appearing in the preset event stream, determining a candidate successor event set corresponding to the current event; determining whether the current event and each candidate successor event in that set satisfy a first preset dependency relationship; determining each candidate successor event that satisfies the first preset dependency relationship as a successor event; and adding the successor event to a successor event set, where the current event and one successor event form one dependent event pair.
  • Determining the candidate successor event set corresponding to the current event includes: adding the first events that exist between every two adjacent occurrences of the current event in the preset event stream to an initial candidate successor event set; calculating the conditional probability of the current event and each first event; and removing second events from the initial candidate successor event set to obtain the candidate successor event set corresponding to the current event, where the conditional probability of the current event and a second event is less than a preset conditional probability threshold.
  • Determining whether the current event and each candidate successor event in the candidate successor event set corresponding to the current event satisfy the first preset dependency relationship includes: for each candidate successor event in the candidate successor event set corresponding to the current event, calculating the unconditional distribution of the waiting time of the current candidate successor event, calculating the conditional distribution of the waiting time of the current candidate successor event relative to the current event, and determining, according to the unconditional distribution and the conditional distribution, whether the current event and the current candidate successor event satisfy the first preset dependency relationship, where the waiting time represents the time difference between the occurrence time of the current event and the occurrence time of the current candidate successor event.
  • Determining, according to the unconditional distribution and the conditional distribution, whether the current event and the current candidate successor event satisfy the first preset dependency relationship includes: in a case where the conditional distribution conforms to a normal distribution, determining that the current event and the current candidate successor event satisfy the first preset dependency relationship.
  • Determining the transition interval range corresponding to the dependent event pair includes: for each dependent event pair, obtaining the time difference sequence corresponding to the current dependent event pair in the preset event stream, clustering the time difference sequence, and determining the transition interval range corresponding to the current dependent event pair according to the time distribution in the cluster, where the time difference sequence contains the time differences between adjacent occurrences of the two events in the current dependent event pair.
  • Determining the transition interval range corresponding to the current dependent event pair according to the time distribution in the cluster includes: determining the transition interval range corresponding to the current dependent event pair according to the maximum value and the minimum value in the cluster; or determining the transition interval range corresponding to the current dependent event pair according to the confidence interval of the time distribution in the cluster.
  • Determining the transition interval range corresponding to the current dependent event pair according to the time distribution in the cluster includes: testing the current dependent event pair with a preset statistical test method and, if the test passes, determining the transition interval range corresponding to the current dependent event pair according to the time distribution in the cluster.
  • The device further includes: an edge completion module, configured to, after the maximum spanning tree is generated, for the two nodes corresponding to any dependent event pair in the maximum spanning tree, calculate the detour probability that a detour path exists between the current two nodes and, if the detour probability is greater than a preset detour probability threshold, complete the edge between the current two nodes.
  • Completing the edge between the current two nodes includes: performing edge completion by using the sum of the weights of all edges on the target detour path between the current two nodes as the weight of the edge between the current two nodes, where the target detour path is the path with the largest sum of the weights of all the edges it passes through.
  • Fig. 6 is a structural block diagram of an abnormality detection device provided by an embodiment of the application.
  • the device can be implemented by software and/or hardware, and can generally be integrated in a server, and log abnormality detection can be performed by executing an abnormality detection method.
  • The device includes: a to-be-detected event stream obtaining module 601, configured to obtain an event stream to be detected, where the event stream to be detected corresponds to the log execution path to be detected; a comparison module 602, configured to compare the event stream to be detected with the event topology graph, where the event topology graph is generated using the topology graph generation method provided in the embodiment of the present application; and an anomaly detection module 603, configured to determine, according to the comparison result, whether the event stream to be detected is abnormal.
  • The anomaly detection device provided in the embodiment of the present application compares the event stream to be detected, which corresponds to the log execution path to be detected, with the event topology graph generated by the topology graph generation method provided in the embodiment of the present application. According to the comparison result, it can quickly and accurately detect whether the event stream to be detected is abnormal, which improves the accuracy and efficiency of log anomaly detection.
  • The event topology graph includes a plurality of nodes, each node represents an event in a dependent event pair, and the connection between two nodes carries the transition probability and transition interval range corresponding to the dependent event pair represented by the two nodes. Comparing the event stream to be detected with the event topology graph, and determining whether the event stream to be detected is abnormal according to the comparison result, includes: for the current event in the event stream to be detected, searching the event topology graph for the corresponding target event; and, if the next event after the current event does not correspond to a child node of the target event, determining that the event stream to be detected is abnormal.
  • The anomaly detection module 603 is further configured to: if the next event after the current event corresponds to a first child node of the target event, obtain a first time interval between the current event and the next event; obtain the transition interval range corresponding to the target event and the first child node; and, if the first time interval is not within the transition interval range, determine that the event stream to be detected is abnormal.
  • FIG. 7 is a structural block diagram of a computer device provided by an embodiment of this application.
  • The computer device 700 may include: a memory 701, a processor 702, and a computer program stored on the memory 701 and executable on the processor 702. When executing the computer program, the processor 702 implements the topology graph generation method and/or the anomaly detection method described in the embodiments of the present application.
  • The embodiment of the present application also provides a storage medium containing computer-executable instructions. When executed by a computer processor, the computer-executable instructions are used to execute the topology graph generation method and/or the anomaly detection method provided by any embodiment of the present application.
  • The topology graph generation device, anomaly detection device, computer equipment, and storage medium provided in the foregoing embodiments can execute the methods provided in the corresponding embodiments of the present application, and have the functional modules corresponding to the executed methods. For technical details not described in the foregoing embodiments, refer to the methods provided in the corresponding embodiments of the present application.
  • computer equipment encompasses any suitable type of equipment capable of executing computer programs, such as mobile phones, portable data processing devices, portable web browsers, or vehicle-mounted mobile stations.
  • the various embodiments of the present application can be implemented in hardware or dedicated circuits, software, logic or any combination thereof.
  • some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software that may be executed by a controller, microprocessor, or other computing device, although the present application is not limited thereto.
  • Computer program instructions can be assembly instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages.
  • the block diagram of any logic flow in the drawings of the present application may represent program steps, or may represent interconnected logic circuits, modules, and functions, or may represent a combination of program steps and logic circuits, modules, and functions.
  • the computer program can be stored on the memory.
  • The memory can be of any type suitable for the local technical environment and can be implemented using any suitable data storage technology, such as but not limited to read-only memory (ROM), random access memory (RAM), and optical memory devices and systems (Digital Video Disk (DVD) or Compact Disc (CD)).
  • Computer-readable media may include non-transitory storage media.
  • The data processor can be of any type suitable for the local technical environment, such as but not limited to general-purpose computers, special-purpose computers, microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), programmable logic devices such as field-programmable gate arrays (FPGAs), and processors based on a multi-core processor architecture.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computer Hardware Design (AREA)
  • Debugging And Monitoring (AREA)

Abstract

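The present application provides a topology graph generation method, an anomaly detection method, a device, equipment, and a storage medium. The topology graph generation method includes: obtaining a preset event stream, where the preset event stream corresponds to a normal log execution path; determining dependent event pairs in the preset event stream; determining a transition interval range corresponding to each dependent event pair, where the transition interval represents the time difference between adjacent occurrences of the two events in the dependent event pair; and generating an event topology graph according to the transition probability and the transition interval range corresponding to the dependent event pair, where the transition probability represents the conditional probability between the two events in the dependent event pair.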

Description

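Topology graph generation method, anomaly detection method, device, equipment, and storage medium
This application claims priority to Chinese Patent Application No. 201911222482.4, filed with the China Patent Office on December 3, 2019, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of computer technology, for example, to a topology graph generation method, an anomaly detection method, a device, equipment, and a storage medium.
Background
To meet users' increasingly diverse needs, modern software systems are becoming larger and more complex. When a software system behaves abnormally, it is crucial to be able to detect the problem and find its cause. System logs record important system events and system states, help operation and maintenance personnel debug performance problems and anomalies, and are a valuable resource for understanding system state. However, system logs are unstructured, and it is very difficult to locate abnormal system events accurately and efficiently in intricate system logs by relying on the experience of operation and maintenance personnel. Existing solutions in the related art have difficulty performing log anomaly detection accurately and need improvement.
Summary
The present application provides a topology graph generation method, an anomaly detection method, a device, equipment, and a storage medium.
An embodiment of the present application provides a topology graph generation method, including: obtaining a preset event stream, where the preset event stream corresponds to a normal log execution path; determining dependent event pairs in the preset event stream; determining a transition interval range corresponding to the dependent event pair, where the transition interval represents the time difference between adjacent occurrences of the two events in the dependent event pair; and generating an event topology graph according to the transition probability corresponding to the dependent event pair and the transition interval range, where the transition probability represents the conditional probability between the two events in the dependent event pair.
An embodiment of the present application provides an anomaly detection method, including: obtaining an event stream to be detected, where the event stream to be detected corresponds to a log execution path to be detected; comparing the event stream to be detected with an event topology graph, where the event topology graph is generated using the topology graph generation method provided by the embodiments of the present application; and determining, according to the comparison result, whether the event stream to be detected is abnormal.
An embodiment of the present application provides a topology graph generation device, including: a preset event stream obtaining module, configured to obtain a preset event stream, where the preset event stream corresponds to a normal log execution path; a dependent event pair determining module, configured to determine dependent event pairs in the preset event stream; a transition interval range determining module, configured to determine a transition interval range corresponding to the dependent event pair, where the transition interval represents the time difference between the occurrence times of the two events in the dependent event pair; and a topology graph generation module, configured to generate an event topology graph according to the transition probability corresponding to the dependent event pair and the transition interval range, where the transition probability represents the conditional probability between the two events in the dependent event pair.
An embodiment of the present application provides an anomaly detection device, including: a to-be-detected event stream obtaining module, configured to obtain an event stream to be detected, where the event stream to be detected corresponds to a log execution path to be detected; a comparison module, configured to compare the event stream to be detected with an event topology graph, where the event topology graph is generated using the topology graph generation method provided by the embodiments of the present application; and an anomaly detection module, configured to determine, according to the comparison result, whether the event stream to be detected is abnormal.
An embodiment of the present application provides a computer device, including a processor and a memory, where the processor is configured to execute a program stored in the memory to implement any of the methods in the embodiments of the present application.
An embodiment of the present application provides a storage medium storing a computer program which, when executed by a processor, implements any of the methods in the embodiments of the present application.
Further description of the above embodiments and other aspects of the present application, and of their implementations, is provided in the brief description of the drawings, the detailed description, and the claims.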
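Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a topology graph generation method provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of another topology graph generation method provided by an embodiment of the present application;
FIG. 3 is a schematic flowchart of an anomaly detection method provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of another anomaly detection method provided by an embodiment of the present application;
FIG. 5 is a structural block diagram of a topology graph generation device provided by an embodiment of the present application;
FIG. 6 is a structural block diagram of an anomaly detection device provided by an embodiment of the present application;
FIG. 7 is a structural block diagram of a computer device provided by an embodiment of the present application.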
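Detailed Description
Embodiments of the present application are described below with reference to the accompanying drawings.
FIG. 1 is a schematic flowchart of a topology graph generation method provided by an embodiment of the present application. The method can be executed by a topology graph generation device, which can be implemented by software and/or hardware and can generally be integrated in a computer device. As shown in FIG. 1, the method includes the following steps.
Step 101: Obtain a preset event stream, where the preset event stream corresponds to a normal log execution path.
System logs are unstructured, and log structures often differ between systems. A software system can also execute multiple tasks in parallel, so its output logs are often interleaved. During log operation and maintenance, operators often lack comprehensive domain expertise and find it difficult to set domain parameters accurately; moreover, system log data is intricate and voluminous. As a result, it is often hard to achieve satisfactory accuracy. A workflow-based solution exists in the related art, but there is no solution that can accurately mine, from a large volume of intricate log information, a workflow graph that represents the normal execution flow of the system.
In the embodiments of the present application, a preset event stream (also called a preset transaction stream) corresponding to a normal log execution path is obtained first, analysis and mining are then performed on the preset event stream, and finally an event topology graph for log anomaly analysis is generated, which can accurately represent the normal execution flow of the system.
Exemplarily, a preset log stream (equivalent to the raw log stream) corresponding to the normal log execution path may be obtained first and then converted into the corresponding preset event stream according to log templates. A system log can be divided into two parts: a fixed part and a variable part. The fixed part is the constant portion of the raw log entry and does not change with the system state, whereas the variable part changes with the system state. A log template is abstracted from raw logs, for example by replacing the variable part with a placeholder such as *. Each log template corresponds, as far as possible, to one log output statement; that is, each log template corresponds to one event, or in other words, to one event type. Two or more logs in the preset log stream may correspond to the same log template, which means that the same event may occur two or more times in the preset log stream. For ease of description, an event mentioned in the embodiments of the present application can be regarded as an event type, such as event A, and each occurrence of event A in the preset event stream can be regarded as an instance of event A.
The log templates may be preconfigured, or may be obtained by log parsing, which parses the logs in the unstructured preset log stream into structured log templates.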
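Step 102: Determine the dependent event pairs in the preset event stream.
In this step, the dependent event pairs in the preset event stream are mined. An event pair is the basic unit of an event flow graph. For an event pair, the dependency between the two events is characterized by the temporal relationship between them; for example, the event pair (A, B) indicates that event B usually occurs after event A. A dependent event pair can be regarded as an event pair that satisfies a set dependency relationship. In a dependent event pair, the event that occurs first is called the predecessor event and the event that occurs later is called the successor event.
Optionally, the dependent event pairs in the preset event stream may be determined according to the conditional probability of two events.
Optionally, the determined dependent event pairs may be added to a dependent event pair set.
Step 103: Determine the transition interval range corresponding to the dependent event pair, where the transition interval represents the time difference between adjacent occurrences of the two events in the dependent event pair.
Exemplarily, the transition interval, that is, the interval of the transition time, effectively reflects the fluctuation of system event transition times. The transition interval range can be determined by analyzing the time differences of all instances of each dependent event pair in the preset event stream. The transition interval range can represent the normal range of the time difference between adjacent occurrences of the two events in a dependent event pair. The way in which the transition interval range is determined is not limited.
Step 104: Generate an event topology graph according to the transition probability and transition interval range corresponding to the dependent event pair, where the transition probability represents the conditional probability between the two events in the dependent event pair.
Exemplarily, an event topology graph (Event Topology Graph, ETG) with a set structure can be generated from the two dimensions of transition probability and transition interval range corresponding to the dependent event pairs. The set structure may be, for example, a tree structure.
In an embodiment, the event topology graph contains a plurality of nodes, each of which represents an event in a dependent event pair, and the connection between two nodes carries the transition probability and transition interval range corresponding to the dependent event pair represented by those two nodes. That is, in the event topology graph, a node can be an event in a dependent event pair, and the line connecting two nodes can be labeled with the transition probability and transition interval range corresponding to the dependent event pair that the two nodes represent.
The event topology graph generated in this way contains reference information about the normal log event stream in two dimensions, the conditional probability of event occurrence and the time interval between occurrences. When it is used for log anomaly detection, detection can be performed in both dimensions, which improves detection accuracy.
The topology graph generation method provided by the embodiment of the present application obtains the preset event stream corresponding to the normal log execution path, determines the dependent event pairs in the preset event stream, determines the transition interval range corresponding to each dependent event pair, and generates an event topology graph according to the transition probability and transition interval range corresponding to the dependent event pairs, where the transition probability represents the conditional probability between two events. With this technical solution, the generated event topology graph contains reference information about the normal log event stream in two dimensions, the conditional probability of event occurrence and the time interval between occurrences, so that log anomaly detection can be performed in both dimensions, improving detection accuracy.
In an exemplary implementation, generating the event topology graph according to the transition probability and transition interval range corresponding to the dependent event pair includes: generating a maximum spanning tree with the events contained in the dependent event pairs as nodes and the transition probabilities corresponding to the dependent event pairs as the weights of the edges between nodes; and adding the transition interval range corresponding to each dependent event pair to the edges of the maximum spanning tree to obtain the event topology graph. The advantage of this arrangement is that an efficient and reasonable event topology graph structure can be generated, which helps improve the accuracy of log anomaly detection. Exemplarily, in graph theory, a spanning tree of a graph is a subgraph that contains all the nodes and is usually represented as a tree, and a maximum spanning tree is the spanning tree of a weighted graph with the largest total weight. In the embodiments of the present application, the maximum spanning tree can be generated with maximizing the transition probability between paths as the objective function, and the corresponding transition interval ranges are added to its edges; that is, for the edge between two nodes, the transition interval range of the dependent event pair corresponding to those two nodes is marked on that edge.
In an exemplary implementation, obtaining the preset event stream includes: obtaining a preset log stream, where the preset log stream corresponds to a normal log execution path; parsing the logs in the preset log stream with a preset log parsing algorithm to obtain multiple log templates, where each log template corresponds to an event; and converting the preset log stream into the corresponding preset event stream according to the multiple log templates. The advantage of this arrangement is that the log templates are generated from the preset log stream itself, so they match the logs in the preset log stream better and the corresponding event stream is obtained more accurately. The preset log parsing algorithm may be, for example, Basic Signature Generation (BSG), LKE, or Iterative Partitioning Log Mining (IPLoM).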
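Exemplarily, the preset log stream can be expressed as $\langle l_1, l_2, \ldots, l_n \rangle$ and is assumed to be as shown in Table 1:
Table 1 Preset log stream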
Figure PCTCN2020130033-appb-000001
Figure PCTCN2020130033-appb-000002
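Parsing the preset log stream with the preset log parsing algorithm yields the corresponding log templates $\langle e_1, e_2, \ldots, e_m \rangle$, $m < n$, where each log template represents one event type. For example, the log templates obtained by parsing the preset log stream in Table 1 are shown in Table 2: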
Figure PCTCN2020130033-appb-000003
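Here, template denotes the log template and Event_id denotes the corresponding event type.
The preset log stream is converted into the preset event stream according to the log templates. In the example above, the transaction stream corresponding to the preset log stream $l_1, l_2, l_3, l_4, l_5, l_6$ is $e_1, e_2, e_2, e_3, e_4, e_4$.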
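The conversion from log lines to events can be sketched as template matching. This is a minimal illustration, assuming templates use `*` as the variable placeholder as described above; the template strings, log lines, and function names below are hypothetical, since the contents of Table 1 and Table 2 are not reproduced in the text.

```python
import re
from typing import Dict, List, Optional


def template_to_regex(template: str) -> re.Pattern:
    """Turn a template such as 'Sent * bytes' into an anchored regular expression."""
    parts = [re.escape(part) for part in template.split("*")]
    return re.compile("^" + "(.+?)".join(parts) + "$")


def to_event_stream(logs: List[str], templates: Dict[str, str]) -> List[Optional[str]]:
    """Map each log line to the id of the first matching template (None if no match)."""
    compiled = {event_id: template_to_regex(t) for event_id, t in templates.items()}
    stream: List[Optional[str]] = []
    for line in logs:
        match = next((eid for eid, rx in compiled.items() if rx.match(line)), None)
        stream.append(match)
    return stream


# Hypothetical templates and log lines, for illustration only.
templates = {"e1": "Open connection to *", "e2": "Sent * bytes", "e3": "Close connection"}
logs = [
    "Open connection to 10.0.0.1",
    "Sent 120 bytes",
    "Sent 64 bytes",
    "Close connection",
]
events = to_event_stream(logs, templates)
```

Two different log lines map to the same event `e2`, which mirrors the point that one event type can occur several times in the stream.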
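In an exemplary implementation, determining the dependent event pairs in the preset event stream includes: for each event that appears in the preset event stream, determining a candidate successor event set corresponding to the current event; determining whether the current event and each candidate successor event in that set satisfy a first preset dependency relationship; determining each candidate successor event that satisfies the first preset dependency relationship as a successor event; and adding the successor event to a successor event set, where the current event and one successor event form one dependent event pair. The advantage of this arrangement is that each event can be treated as a predecessor event, its candidate successor event set can be determined preliminarily, and it can then be determined whether the events in the candidate successor event set qualify as successor events, which improves the efficiency of determining dependent event pairs.
In an exemplary implementation, determining the candidate successor event set corresponding to the current event includes: adding the first events that exist between every two adjacent occurrences of the current event in the preset event stream to an initial candidate successor event set; calculating the conditional probability of the current event and each first event; and removing second events from the initial candidate successor event set to obtain the candidate successor event set corresponding to the current event, where the conditional probability of the current event and a second event is less than a preset conditional probability threshold. The conditional probability may also be called the correlation probability, and the preset conditional probability threshold can be set according to the actual situation. The advantage of this arrangement is that the conditional probability can be used to filter out noise events, removing indirect successor events of the predecessor event as far as possible while keeping its direct successor events, which improves the accuracy of determining dependent event pairs.
Optionally, the conditional probability of the current event and each first event may be calculated using the following formula:
$\mathrm{SUP}_{(A|B)} = N_{(A|B)} / \min(p_A, p_B) \cdot \mathrm{sigmoid}(\min(p_A, p_B))$
Here, $A$ denotes the current event, that is, the predecessor event; $B$ denotes a first event in the initial candidate successor event set corresponding to $A$; $p_A$ is the probability that event $A$ occurs; $p_B$ is the probability that event $B$ occurs; and $N_{(A|B)}$ is the number of occurrences of event $B$ in the initial candidate successor event set corresponding to event $A$, that is, the number of times the first event occurs between every two adjacent occurrences of the current event.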
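The candidate filtering step can be sketched as follows. This is an illustration under assumptions that the text does not fix: event probabilities are estimated as relative frequencies in the stream, the formula is evaluated left to right as written, and `support_scores` is an invented name.

```python
import math
from collections import Counter
from typing import Dict, List


def support_scores(stream: List[str], current: str) -> Dict[str, float]:
    """Score every event seen between adjacent occurrences of `current`."""
    total = len(stream)
    prob = {event: count / total for event, count in Counter(stream).items()}

    # N_(A|B): occurrences of B between two adjacent occurrences of A.
    positions = [i for i, event in enumerate(stream) if event == current]
    between: Counter = Counter()
    for start, end in zip(positions, positions[1:]):
        between.update(stream[start + 1:end])

    scores: Dict[str, float] = {}
    for event, n in between.items():
        m = min(prob[current], prob[event])
        sigmoid = 1.0 / (1.0 + math.exp(-m))
        scores[event] = n / m * sigmoid  # evaluated left to right, as written
    return scores


stream = ["A", "B", "C", "A", "B", "A", "B", "C"]
scores = support_scores(stream, "A")
threshold = 2.5  # hypothetical preset conditional probability threshold
candidates = {event for event, score in scores.items() if score >= threshold}
```

With this sample stream, B appears between every pair of adjacent A occurrences and stays in the candidate set, while C appears only once and is filtered out by the assumed threshold.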
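In an exemplary implementation, determining whether the current event and each candidate successor event in the candidate successor event set corresponding to the current event satisfy the first preset dependency relationship includes: for each candidate successor event in the candidate successor event set corresponding to the current event, calculating the unconditional distribution of the waiting time of the current candidate successor event, calculating the conditional distribution of the waiting time of the current candidate successor event relative to the current event, and determining, according to the unconditional distribution and the conditional distribution, whether the current event and the current candidate successor event satisfy the first preset dependency relationship, where the waiting time represents the time difference between the occurrence time of the current event and the occurrence time of the current candidate successor event. The advantage of this arrangement is that dependent event pairs can be determined more accurately.
Exemplarily, the time series corresponding to an event is extracted. The time series of occurrences of event $A$ is expressed as $S_A = \langle a_1, a_2, \ldots, a_m \rangle$, where $a_i$, $1 \le i \le m$, is the timestamp of a log entry whose event type is $A$. Assume that $S_A$ lies within $[0, T]$. Given a time point $z$, the minimum positive distance between $z$ and $S_A$, that is, the waiting time, is $d(z, S_A) = \min_{x \in S_A,\ x \ge z} \lVert x - z \rVert$. The unconditional distribution of the waiting time of event $B$ is $F_B(r) = P(d(z, S_B) \le r)$, where $r$ is a threshold parameter on the time interval and $z$ here corresponds to any event. The conditional distribution of the waiting time of event $B$ relative to event $A$ is $F_{B|A}(r) = P(d(z, S_B) \le r),\ z \in S_A$, where $x$ is any point in the sequence $S_A$ and $z$ here corresponds to any point in $S_A$; $F_{B|A}$ describes the conditional probability of event $A$ at any time point $x$.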
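The waiting time and the two distributions can be estimated empirically, as in the sketch below. It assumes sorted timestamp lists and, for the unconditional case, a hypothetical evenly spaced grid of sample points over $[0, T]$; the function names are illustrative and the sampling scheme is not prescribed by the text.

```python
import bisect
from typing import List, Optional


def waiting_time(z: float, timestamps: List[float]) -> Optional[float]:
    """d(z, S): distance from z to the first timestamp in sorted S that is >= z."""
    i = bisect.bisect_left(timestamps, z)
    return timestamps[i] - z if i < len(timestamps) else None


def empirical_cdf(points: List[float], timestamps: List[float], r: float) -> float:
    """Fraction of sample points z whose waiting time d(z, S) is at most r."""
    waits = [w for w in (waiting_time(z, timestamps) for z in points) if w is not None]
    return sum(w <= r for w in waits) / len(waits) if waits else 0.0


s_a = [0.0, 10.0, 20.0]   # hypothetical timestamps of event A
s_b = [1.0, 11.0, 21.0]   # hypothetical timestamps of event B
grid = [float(z) for z in range(21)]  # sample points spread over [0, T]

f_b = empirical_cdf(grid, s_b, r=1.5)          # unconditional estimate F_B(1.5)
f_b_given_a = empirical_cdf(s_a, s_b, r=1.5)   # conditional estimate F_{B|A}(1.5)
```

In this sample B always follows A by one time unit, so the conditional estimate is far larger than the unconditional one, which is the kind of difference the dependency test looks for.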
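In an exemplary implementation, determining, according to the unconditional distribution and the conditional distribution, whether the current event and the current candidate successor event satisfy the first preset dependency relationship includes: when the unconditional distribution and the conditional distribution conform to a normal distribution, determining that the current event and the current candidate successor event satisfy the first preset dependency relationship. The advantage of this arrangement is that whether two events form a dependent event pair can be determined quickly and accurately. Optionally, distributions other than the normal distribution may also be used to measure whether the first preset dependency relationship is satisfied between the current event and the current candidate successor event.
Exemplarily, for the event pair $(A, B)$, whether there is a dependency between event $A$ and event $B$ can be judged by comparing $F_B(r)$ and $F_{B|A}(r)$: if $F_B(r)$ and $F_{B|A}(r)$ differ significantly, event $B$ is considered to depend on event $A$. This can be judged by calculating whether $M_B$ and $M_{B|A}$ conform to a normal distribution, where $M_B$ and $M_{B|A}$ denote the initial values, that is, the first moments, of $F_B(r)$ and $F_{B|A}(r)$ respectively. The normal distribution here can be understood as judging, from the point sequences of adjacent events, whether the events have a dependency relationship.
In an exemplary implementation, determining the transition interval range corresponding to the dependent event pair includes: for each dependent event pair, obtaining the time difference sequence corresponding to the current dependent event pair in the preset event stream, clustering the time difference sequence, and determining the transition interval range corresponding to the current dependent event pair according to the time distribution in the cluster, where the time difference sequence contains the time differences between adjacent occurrences of the two events in the current dependent event pair. The advantage of this arrangement is that clustering can quickly and effectively remove redundant events and improve the accuracy of the determined transition interval range. The two events C and D in a dependent event pair generally occur many times in the preset event stream. When C and D first occur adjacently, the first time difference between their occurrences is calculated and becomes the first element of the time difference sequence; when C and D occur adjacently for the second time, the second time difference is calculated and becomes the second element; and so on, yielding the time difference sequence corresponding to the dependent event pair.
In an exemplary implementation, determining the transition interval range corresponding to the current dependent event pair according to the time distribution in the cluster includes: determining the transition interval range corresponding to the current dependent event pair according to the maximum value and the minimum value in the cluster; or determining the transition interval range corresponding to the current dependent event pair according to the confidence interval of the time distribution in the cluster. The advantage of this arrangement is that the transition interval range corresponding to each dependent event pair can be determined quickly.
Exemplarily, the transition interval sequence between the events of a dependent event pair is obtained. For example, for each mined dependent event pair $\langle T_i, T_j \rangle$, find all adjacent $T_i$ and $T_j$ in the preset event stream and record the time differences between them as the sequence $\langle t_1, t_2, \ldots, t_m \rangle$. A clustering algorithm is applied to the time difference sequence; clustering can remove redundant events. Usable clustering algorithms include agglomerative hierarchical clustering (AGglomerative NESting, AGNES), divisive hierarchical clustering (DIvisive ANAlysis, DIANA), and density-based spatial clustering of applications with noise (Density-Based Spatial Clustering of Applications with Noise, DBSCAN). The maximum and minimum values of each cluster are taken as the time interval range of the event pairs belonging to that cluster.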
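The interval-range step can be sketched with a deliberately simple stand-in for the clustering algorithms named above: sorted time differences are split wherever the gap between neighbours exceeds a hypothetical `max_gap` parameter. This is not AGNES, DIANA, or DBSCAN themselves, and the function names and sample data are assumptions made for illustration.

```python
from typing import List, Tuple


def adjacent_time_diffs(stream: List[Tuple[str, float]], first: str, second: str) -> List[float]:
    """Time differences for every place where `second` directly follows `first`."""
    return [
        t2 - t1
        for (e1, t1), (e2, t2) in zip(stream, stream[1:])
        if e1 == first and e2 == second
    ]


def cluster_intervals(diffs: List[float], max_gap: float) -> List[Tuple[float, float]]:
    """Group sorted values, starting a new cluster when the gap exceeds max_gap.

    Returns the (minimum, maximum) of each cluster.
    """
    if not diffs:
        return []
    ordered = sorted(diffs)
    clusters = [[ordered[0]]]
    for value in ordered[1:]:
        if value - clusters[-1][-1] > max_gap:
            clusters.append([value])
        else:
            clusters[-1].append(value)
    return [(cluster[0], cluster[-1]) for cluster in clusters]


# Hypothetical (event, timestamp) stream for the dependent event pair (C, D).
stream = [("C", 0.0), ("D", 1.0), ("C", 10.0), ("D", 11.5),
          ("E", 12.0), ("C", 20.0), ("D", 26.0)]
diffs = adjacent_time_diffs(stream, "C", "D")
ranges = cluster_intervals(diffs, max_gap=1.0)
```

The isolated difference of 6.0 ends up in its own cluster, so it could be inspected or discarded separately from the main interval range.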
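In an exemplary implementation, determining the transition interval range corresponding to the current dependent event pair according to the time distribution within the clusters includes: testing the current dependent event pair with a preset statistical test method and, if the test is passed, determining the transition interval range corresponding to the current dependent event pair according to the time distribution within the clusters. The advantage of this arrangement is that the dependency of the dependent event pair can be verified in the time dimension. The preset statistical test method may include the chi-square test, the z-test, the t-test, and the like. For example, each cluster is tested with the chi-square test; if a cluster passes the test, the dependency between the events of the pair is confirmed, and the time interval computed in the preceding step is taken as the transition interval range of the event pair.
In an exemplary implementation, after the maximum spanning tree is generated, the method further includes: for the two nodes corresponding to any dependent event pair in the maximum spanning tree, computing the detour probability that a detour path exists between the current two nodes and, if the detour probability is greater than a preset detour probability threshold, completing the edge between the current two nodes. The advantage of this arrangement is that the generated maximum spanning tree is refined and the accuracy of the event topology diagram is improved.
For example, the path lengths between nodes of the spanning tree are computed. Any two nodes of the spanning tree (a start point and an end point) are denoted $E_1$ and $E_2$, and the path length between them is $\mathrm{path}(E_1, E_2)$. The probability that a detour path exists between $E_1$ and $E_2$ is $d(E_1, E_2) = \log(1 + \mathrm{path}(E_1, E_2))$. A suitable preset detour probability threshold is set according to the log sequence to decide whether to add the path from $E_1$ to $E_2$.
In an exemplary implementation, completing the edge between the current two nodes includes: taking the sum of the weights on all edges traversed by a target detour path between the current two nodes as the weight of the edge between the current two nodes, where the target detour path is the path whose traversed edges have the largest weight sum. The advantage of this arrangement is that missing paths between nodes of the spanning tree that are dependent but have no edge can be completed quickly and accurately.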
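Edge completion as described above can be sketched as follows. The sketch assumes that the path length is the number of edges on the path, that the tree is stored as an undirected adjacency map, and that, because a tree has exactly one simple path between two nodes, this path serves as the target detour path. The function names and the threshold values are illustrative.

```python
from math import log

def tree_path(adj, start, goal):
    """Return the node list of the unique simple path in a tree, or None."""
    stack = [(start, [start])]
    seen = {start}
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        for nxt in adj.get(node, {}):
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, path + [nxt]))
    return None

def complete_edge(adj, e1, e2, threshold):
    """Return (e1, e2, weight) for the edge to add, or None if no edge is added.

    d(E1, E2) = log(1 + path(E1, E2)); the new edge weight is the sum of the
    weights along the path, as described in the text.
    """
    path = tree_path(adj, e1, e2)
    if path is None or len(path) < 3:   # unreachable, or already directly connected
        return None
    hops = len(path) - 1
    if log(1 + hops) <= threshold:
        return None
    weight = sum(adj[a][b] for a, b in zip(path, path[1:]))
    return (e1, e2, weight)

# Hypothetical spanning tree A - B - C - D with transition probabilities as weights.
adj = {
    "A": {"B": 0.9},
    "B": {"A": 0.9, "C": 0.8},
    "C": {"B": 0.8, "D": 0.7},
    "D": {"C": 0.7},
}
new_edge = complete_edge(adj, "A", "D", threshold=1.0)
```

Note that a weight obtained by summing can exceed 1, so under this reading a completed edge's weight is a score rather than a probability.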
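FIG. 2 is a schematic flowchart of another topology diagram generation method provided by an embodiment of the present application. As shown in FIG. 2, the method includes the following steps.
Step 201: acquire a preset log stream and convert the preset log stream into a corresponding preset event stream.
Step 202: for each event occurring in the preset event stream, determine the candidate successor event set corresponding to the current event, judge whether the first preset dependency relationship is satisfied between the current event and each candidate successor event in that set, and determine the dependent event pairs according to the judgment result.
If the first preset dependency relationship is found to be satisfied between the current event and a candidate successor event, the current event and that candidate successor event are determined to be a dependent event pair.
Optionally, the first events that exist between every two adjacent occurrences of the current event in the preset event stream are added to an initial candidate successor event set, the conditional probability of the current event and each first event is computed, and the second events, whose conditional probability is less than a preset conditional probability threshold, are removed from the initial candidate successor event set to obtain the candidate successor event set corresponding to the current event.
Optionally, for each candidate successor event in the candidate successor event set corresponding to the current event, the unconditional distribution of the waiting time of the current candidate successor event is computed, and the conditional distribution of the waiting time of the current candidate successor event relative to the current event is computed. When the unconditional distribution and the conditional distribution conform to a normal distribution, the first preset dependency relationship is determined to be satisfied between the current event and the current candidate successor event, that is, the two form a dependent event pair.
Step 203: for each dependent event pair, acquire the time difference sequence corresponding to the current dependent event pair in the preset event stream, cluster the time difference sequence, and determine the transition interval range corresponding to the current dependent event pair according to the maximum and minimum values in a cluster.
After the time difference sequence is clustered, the cluster corresponding to the current dependent event pair may further be tested with the chi-square test; if the test is passed, the transition interval range corresponding to the current dependent event pair is determined according to the maximum and minimum values in the cluster corresponding to that pair. Optionally, if the test is not passed, the current dependent event pair may be considered to have no dependency in the time dimension and is deleted from the set of dependent event pairs.
Step 204: generate a maximum spanning tree with the events contained in the dependent event pairs as nodes and the transition probabilities corresponding to the dependent event pairs as the weights of the edges between nodes.
For example, each node in the tree represents an event, and the weight on an edge represents the transition probability between the predecessor event and the successor event that the edge connects; the spanning tree serves as the skeleton of the whole workflow. Usable spanning tree algorithms include Prim's algorithm and Kruskal's algorithm. These algorithms can be adapted, for example by taking the maximum transition probability along paths as the objective function, to generate the maximum spanning tree.
Step 205: for the two nodes corresponding to any dependent event pair in the maximum spanning tree, compute the detour probability that a detour path exists between the current two nodes; if the detour probability is greater than the preset detour probability threshold, complete the edge by taking the sum of the weights on all edges traversed by the target detour path between the current two nodes as the weight of the edge between them.
Step 206: add the corresponding transition interval ranges to the edges of the maximum spanning tree after edge completion to obtain the event topology diagram used for log anomaly detection.
In the topology diagram generation method provided by this embodiment of the present application, a preset log stream corresponding to a normal log execution path is acquired and converted into a preset event stream; the dependent event pairs in the preset event stream are mined, and the transition interval ranges corresponding to the dependent event pairs are determined by clustering; a maximum spanning tree is generated with the events contained in the dependent event pairs as nodes and the corresponding transition probabilities as edge weights; and, after missing paths are completed, the corresponding transition interval ranges are added to the edges of the tree to obtain the event topology diagram used for log anomaly detection. With this technical solution, the generated tree-shaped event topology diagram contains the standard information of a normal log event stream in two dimensions, the conditional probability of event occurrence and the time interval of occurrence, so that when it is used for log anomaly detection both the accuracy and the efficiency of detection can be improved.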
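Steps 204 and 206 above can be sketched with a Kruskal variant that processes edges in descending order of weight. This is an illustrative sketch, not the application's implementation: edges are treated as undirected for cycle detection, and the sample transition probabilities and interval ranges are made up.

```python
def maximum_spanning_tree(edges):
    """Kruskal's algorithm with edges sorted by descending weight.

    `edges` is an iterable of (u, v, weight) tuples. Returns the list of
    edges kept in the tree, in the order they were selected.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    tree = []
    for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:                # the edge joins two components
            parent[root_u] = root_v
            tree.append((u, v, w))
    return tree

# Hypothetical dependent event pairs with transition probabilities.
pairs = [
    ("A", "B", 0.9), ("B", "C", 0.8), ("A", "C", 0.3),
    ("C", "D", 0.7), ("B", "D", 0.2),
]
# Hypothetical transition interval ranges (milliseconds) per dependent pair.
intervals = {("A", "B"): (90, 120), ("B", "C"): (10, 50), ("C", "D"): (200, 260)}

tree = maximum_spanning_tree(pairs)

# Step 206: annotate each tree edge with its transition interval range.
topology = {
    (u, v): {"probability": w, "interval": intervals.get((u, v))}
    for u, v, w in tree
}
```

The low-probability pairs (A, C) and (B, D) are dropped because they would close a cycle; under step 205 such pairs are the candidates for edge completion.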
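FIG. 3 is a schematic flowchart of an anomaly detection method provided by an embodiment of the present application. The method may be executed by an anomaly detection apparatus, which may be implemented in software and/or hardware and may generally be integrated in a computer device. As shown in FIG. 3, the method includes the following steps.
Step 301: acquire an event stream to be detected, where the event stream to be detected corresponds to a log execution path to be detected.
For example, the event stream to be detected may be converted from a log stream newly produced by the system, or from a historical log stream that requires anomaly detection. The log stream to be detected may be converted into the corresponding event stream to be detected with the same log templates that were used when the event topology diagram was generated.
Step 302: compare the event stream to be detected with the event topology diagram.
The event topology diagram is generated with the topology diagram generation method provided by the embodiments of the present application. Generation of the event topology diagram can be regarded as the offline stage of anomaly detection. Once a high-quality event topology diagram has been generated, it can represent the normal execution paths of the system; in the online stage, the event stream to be detected is compared with the event topology diagram to analyze anomalies.
Step 303: determine, according to the comparison result, whether the event stream to be detected is abnormal.
For example, by comparing the event stream to be detected with the event topology diagram, it can be found whether the log execution path currently being detected differs from the normal log execution path, and hence whether an anomaly exists.
In the anomaly detection method provided by this embodiment of the present application, the event stream to be detected, which corresponds to the log execution path to be detected, is compared with the event topology diagram generated with the topology diagram generation method provided by the embodiments of the present application. Whether the event stream to be detected is abnormal can be detected quickly and accurately from the comparison result, which improves the accuracy and efficiency of log anomaly detection.
In an exemplary implementation, the event topology diagram contains a plurality of nodes; each node represents an event in a dependent event pair, and the connection between two nodes carries the transition probability and the transition interval range of the dependent event pair that the two nodes represent. Comparing the event stream to be detected with the event topology diagram and determining from the comparison result whether the event stream to be detected is abnormal includes: for a current event in the event stream to be detected, looking up the corresponding target event in the event topology diagram; and, when the next event after the current event does not correspond to a child node of the target event, determining that the event stream to be detected is abnormal. The target event can be understood as the event in the event topology diagram that has the same type as the current event. If the next event after the current event has the same event type as any child node of the target event, the next event is considered to correspond to a child node of the target event. A child node of a node can be understood as a node that is connected to that node and comes after it. Taking a tree structure as an example, the child nodes of a node are its branch nodes. The advantage of this arrangement is that whether the event stream to be detected is abnormal at the level of event order can first be verified on the basis of the conditional probability of two consecutive events, so that anomaly detection is fast and accurate.
In an exemplary implementation, the method further includes: when the next event after the current event corresponds to a first child node of the target event, acquiring a first time interval between the current event and the next event; acquiring the transition interval range corresponding to the target event and the first child node; and, when the first time interval is not within the transition interval range, determining that the event stream to be detected is abnormal. The advantage of this arrangement is that, after the verification based on the conditional probability of two consecutive events has passed, whether the time interval between the two events is within a reasonable range is checked, which improves the accuracy of anomaly detection.
FIG. 4 is a schematic flowchart of an anomaly detection method provided by an embodiment of the present application. As shown in FIG. 4, in the online stage of log detection, a transaction stream $e_1, e_2, \ldots, e_n$ is read in, that is, the event stream to be detected is acquired. For the current event $e_i$, it is judged whether $e_i$ exists in the event topology diagram. If $e_i$ is not in the event topology diagram, the next event in the transaction stream is taken as the new current event and the judgment is repeated. If $e_i$ is in the event topology diagram, it is further judged whether the next event $e_{i+1}$ is among the child nodes of $e_i$ in the event topology diagram. If $e_{i+1}$ is not among the child nodes of $e_i$, an execution path anomaly is output. If $e_{i+1}$ is among the child nodes of $e_i$, it is further judged whether the interval between $e_i$ and $e_{i+1}$ is within the specified time interval, that is, within the corresponding transition interval range. If the interval between $e_i$ and $e_{i+1}$ is not within the specified time interval, an execution path anomaly is output; if it is, it is judged whether other events remain in the transaction stream. If other events remain, $e_{i+1}$ is taken as the new $e_i$ and the judgment is repeated; if no other events remain, the flow ends.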
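The flow of FIG. 4 can be sketched as a single pass over the event stream. The graph layout (a mapping from each event to its child events and their transition interval ranges), the return value, and the sample data are assumptions made for illustration.

```python
def detect_anomaly(stream, graph):
    """Check an event stream against an event topology diagram.

    `stream` is a list of (timestamp, event_type) tuples in time order.
    `graph` maps event_type -> {child_event_type: (low, high)}, where
    (low, high) is the transition interval range of that edge.
    Returns None if no anomaly is found, otherwise (index, reason), where
    index points at the offending event in the stream.
    """
    for i in range(len(stream) - 1):
        t_cur, e_cur = stream[i]
        if e_cur not in graph:              # unknown event: move on to the next one
            continue
        t_next, e_next = stream[i + 1]
        children = graph[e_cur]
        if e_next not in children:          # order check
            return (i + 1, "unexpected successor")
        low, high = children[e_next]
        if not low <= t_next - t_cur <= high:   # timing check
            return (i + 1, "interval out of range")
    return None

# Hypothetical topology: A -> B within 90..120 ms, B -> C within 10..50 ms.
graph = {"A": {"B": (90, 120)}, "B": {"C": (10, 50)}, "C": {}}

normal = [(0, "A"), (100, "B"), (130, "C")]
wrong_order = [(0, "A"), (100, "C")]
too_slow = [(0, "A"), (300, "B"), (330, "C")]
```

Returning at the first anomaly mirrors the "output execution path anomaly" branches of the flow; a variant that collects all anomalies would be an equally reasonable design.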
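FIG. 5 is a structural block diagram of a topology diagram generation apparatus provided by an embodiment of the present application. The apparatus may be implemented in software and/or hardware, may generally be integrated in a computer device, and can generate an event topology diagram by executing the topology diagram generation method. As shown in FIG. 5, the apparatus includes: a preset event stream acquisition module 501, configured to acquire a preset event stream, where the preset event stream corresponds to a normal log execution path; a dependent event pair determination module 502, configured to determine the dependent event pairs in the preset event stream; a transition interval range determination module 503, configured to determine the transition interval range corresponding to the dependent event pairs, where the transition interval represents the time difference between the occurrence times of the two events of a dependent event pair; and a topology diagram generation module 504, configured to generate an event topology diagram according to the transition probability and the transition interval range corresponding to the dependent event pairs, where the transition probability represents the conditional probability between the two events of a dependent event pair.
The topology diagram generation apparatus provided by this embodiment of the present application acquires a preset event stream corresponding to a normal log execution path, determines the dependent event pairs in the preset event stream, determines the transition interval ranges corresponding to the dependent event pairs, and generates an event topology diagram according to the transition probabilities and transition interval ranges corresponding to the dependent event pairs, where the transition probability represents the conditional probability between two events. With this technical solution, the generated event topology diagram contains the standard information of a normal log event stream in two dimensions, the conditional probability of event occurrence and the time interval of occurrence, so that log anomaly detection can be performed in both dimensions and the accuracy of detection is improved.
In an exemplary implementation, the event topology diagram contains a plurality of nodes; each node represents an event in a dependent event pair, and the connection between two nodes carries the transition probability and the transition interval range of the dependent event pair that the two nodes represent.
In an exemplary implementation, generating the event topology diagram according to the transition probability and the transition interval range corresponding to the dependent event pairs includes: generating a maximum spanning tree with the events contained in the dependent event pairs as nodes and the transition probabilities corresponding to the dependent event pairs as the weights of the edges between nodes; and adding the transition interval ranges corresponding to the dependent event pairs to the edges of the maximum spanning tree to obtain the event topology diagram.
In an exemplary implementation, acquiring the preset event stream includes: acquiring a preset log stream, where the preset log stream corresponds to a normal log execution path; parsing the logs in the preset log stream with a preset log parsing algorithm to obtain a plurality of log templates, where each log template corresponds to one event; and converting the preset log stream into the corresponding preset event stream according to the plurality of log templates.
In an exemplary implementation, determining the dependent event pairs in the preset event stream includes: for each event occurring in the preset event stream, determining the candidate successor event set corresponding to the current event, judging whether the first preset dependency relationship is satisfied between the current event and each candidate successor event in that set, determining each candidate successor event that satisfies the first preset dependency relationship to be a successor event, and adding the successor event to a successor event set, where the current event and one successor event form one dependent event pair.
In an exemplary implementation, determining the candidate successor event set corresponding to the current event includes: adding the first events that exist between every two adjacent occurrences of the current event in the preset event stream to an initial candidate successor event set; computing the conditional probability of the current event and each first event; and removing the second events from the initial candidate successor event set to obtain the candidate successor event set corresponding to the current event, where the conditional probability of the current event and a second event is less than a preset conditional probability threshold.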
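The candidate successor selection described above can be sketched as follows. The text does not define how the conditional probability is estimated; this sketch assumes it is the fraction of windows between two adjacent occurrences of the current event in which the other event appears. The function name and the sample stream are illustrative.

```python
from collections import Counter

def candidate_successors(events, current, min_prob):
    """Candidate successor events of `current` with their estimated probabilities.

    `events` is the preset event stream as a list of event types in time order.
    An event is kept when it appears in at least `min_prob` of the windows
    between two adjacent occurrences of `current`.
    """
    positions = [i for i, e in enumerate(events) if e == current]
    windows = [set(events[a + 1:b]) for a, b in zip(positions, positions[1:])]
    if not windows:
        return {}
    counts = Counter(e for window in windows for e in window)
    probs = {e: c / len(windows) for e, c in counts.items()}
    return {e: p for e, p in probs.items() if p >= min_prob}

# Hypothetical preset event stream.
events = list("ABCABDABCA")
candidates = candidate_successors(events, "A", min_prob=0.5)
```

In this sample, B appears in every window and C in two of three, so both are kept, while D appears in only one window and is removed as a "second event".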
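In an exemplary implementation, judging whether the first preset dependency relationship is satisfied between the current event and each candidate successor event in the candidate successor event set corresponding to the current event includes: for each candidate successor event in that set, computing the unconditional distribution of the waiting time of the current candidate successor event, computing the conditional distribution of the waiting time of the current candidate successor event relative to the current event, and determining, according to the unconditional distribution and the conditional distribution, whether the first preset dependency relationship is satisfied between the current event and the current candidate successor event, where the waiting time represents the time difference between the occurrence time of the current event and the occurrence time of the current candidate successor event.
In an exemplary implementation, determining, according to the unconditional distribution and the conditional distribution, whether the first preset dependency relationship is satisfied between the current event and the current candidate successor event includes: when the unconditional distribution and the conditional distribution conform to a normal distribution, determining that the first preset dependency relationship is satisfied between the current event and the current candidate successor event.
In an exemplary implementation, determining the transition interval range corresponding to the dependent event pairs includes: for each dependent event pair, acquiring the time difference sequence corresponding to the current dependent event pair in the preset event stream, clustering the time difference sequence, and determining the transition interval range corresponding to the current dependent event pair according to the time distribution within the clusters, where the time difference sequence contains the time differences between adjacent occurrences of the two events of the current dependent event pair.
In an exemplary implementation, determining the transition interval range corresponding to the current dependent event pair according to the time distribution within the clusters includes: determining the transition interval range according to the maximum and minimum values in a cluster; or determining the transition interval range according to a confidence interval of the time distribution in a cluster.
In an exemplary implementation, determining the transition interval range corresponding to the current dependent event pair according to the time distribution within the clusters includes: testing the current dependent event pair with a preset statistical test method and, if the test is passed, determining the transition interval range corresponding to the current dependent event pair according to the time distribution within the clusters.
In an exemplary implementation, the apparatus further includes an edge completion module, configured to, after the maximum spanning tree is generated, compute, for the two nodes corresponding to any dependent event pair in the maximum spanning tree, the detour probability that a detour path exists between the current two nodes and, if the detour probability is greater than a preset detour probability threshold, complete the edge between the current two nodes.
In an exemplary implementation, completing the edge between the current two nodes includes: taking the sum of the weights on all edges traversed by a target detour path between the current two nodes as the weight of the edge between the current two nodes, where the target detour path is the path whose traversed edges have the largest weight sum.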
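FIG. 6 is a structural block diagram of an anomaly detection apparatus provided by an embodiment of the present application. The apparatus may be implemented in software and/or hardware, may generally be integrated in a server, and can perform log anomaly detection by executing the anomaly detection method. As shown in FIG. 6, the apparatus includes: an event stream acquisition module 601, configured to acquire an event stream to be detected, where the event stream to be detected corresponds to a log execution path to be detected; a comparison module 602, configured to compare the event stream to be detected with an event topology diagram, where the event topology diagram is generated with the topology diagram generation method provided by the embodiments of the present application; and an anomaly detection module 603, configured to determine, according to the comparison result, whether the event stream to be detected is abnormal.
The anomaly detection apparatus provided by this embodiment of the present application compares the event stream to be detected, which corresponds to the log execution path to be detected, with the event topology diagram generated with the topology diagram generation method provided by the embodiments of the present application. Whether the event stream to be detected is abnormal can be detected quickly and accurately from the comparison result, which improves the accuracy and efficiency of log anomaly detection.
In an exemplary implementation, the event topology diagram contains a plurality of nodes; each node represents an event in a dependent event pair, and the connection between two nodes carries the transition probability and the transition interval range of the dependent event pair that the two nodes represent. Comparing the event stream to be detected with the event topology diagram and determining from the comparison result whether the event stream to be detected is abnormal includes: for a current event in the event stream to be detected, looking up the corresponding target event in the event topology diagram; and, when the next event after the current event does not correspond to a child node of the target event, determining that the event stream to be detected is abnormal.
In an exemplary implementation, the anomaly detection module 603 is further configured to: when the next event after the current event corresponds to a first child node of the target event, acquire a first time interval between the current event and the next event; acquire the transition interval range corresponding to the target event and the first child node; and, when the first time interval is not within the transition interval range, determine that the event stream to be detected is abnormal.
An embodiment of the present application provides a computer device in which the topology diagram generation apparatus and/or the anomaly detection apparatus provided by the embodiments of the present application may be integrated. FIG. 7 is a structural block diagram of a computer device provided by an embodiment of the present application. The computer device 700 may include a memory 701, a processor 702, and a computer program stored in the memory 701 and runnable on the processor 702; when executing the computer program, the processor 702 implements the topology diagram generation method and/or the anomaly detection method described in the embodiments of the present application.
An embodiment of the present application further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform the topology diagram generation method and/or the anomaly detection method provided by any embodiment of the present application.
The topology diagram generation apparatus, anomaly detection apparatus, computer device, and storage medium provided in the above embodiments can execute the methods provided by the corresponding embodiments of the present application and have the functional modules corresponding to executing those methods. For technical details not described in the above embodiments, refer to the methods provided by the corresponding embodiments of the present application.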
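Those skilled in the art will understand that the term computer device covers any suitable type of device capable of executing computer programs, such as a mobile phone, a portable data processing device, a portable web browser, or a vehicle-mounted mobile station.
In general, the various embodiments of the present application may be implemented in hardware or special-purpose circuits, software, logic, or any combination thereof. For example, some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software executable by a controller, microprocessor, or other computing device, although the present application is not limited thereto.
The embodiments of the present application may be implemented by a data processor of a mobile device executing computer program instructions, for example in a processor entity, or by hardware, or by a combination of software and hardware. The computer program instructions may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages.
The block diagrams of any logic flows in the drawings of the present application may represent program steps, or interconnected logic circuits, modules, and functions, or a combination of program steps with logic circuits, modules, and functions. The computer program may be stored in a memory. The memory may be of any type suitable for the local technical environment and may be implemented with any suitable data storage technology, such as, but not limited to, read-only memory (ROM), random access memory (RAM), and optical storage devices and systems (digital versatile disc (DVD) or compact disc (CD)). Computer-readable media may include non-transitory storage media. The data processor may be of any type suitable for the local technical environment, such as, but not limited to, a general-purpose computer, a special-purpose computer, a microprocessor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a programmable logic device (field-programmable gate array, FPGA), and a processor based on a multi-core processor architecture.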

Claims (20)

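  1. A topology diagram generation method, comprising:
    acquiring a preset event stream, wherein the preset event stream corresponds to a normal log execution path;
    determining dependent event pairs in the preset event stream;
    determining a transition interval range corresponding to the dependent event pairs, wherein the transition interval represents the time difference between adjacent occurrences of the two events of a dependent event pair;
    generating an event topology diagram according to a transition probability corresponding to the dependent event pairs and the transition interval range, wherein the transition probability represents the conditional probability between the two events of a dependent event pair.
  2. The method according to claim 1, wherein the event topology diagram contains a plurality of nodes, a node of the plurality of nodes represents an event in the dependent event pairs, and the connection between two nodes of the plurality of nodes carries the transition probability and the transition interval range of the dependent event pair represented by the two nodes.
  3. The method according to claim 2, wherein generating the event topology diagram according to the transition probability corresponding to the dependent event pairs and the transition interval range comprises:
    generating a maximum spanning tree with the events contained in the dependent event pairs as nodes and the transition probabilities corresponding to the dependent event pairs as the weights of the edges between nodes;
    adding the transition interval ranges corresponding to the dependent event pairs to the edges of the maximum spanning tree to obtain the event topology diagram.
  4. The method according to claim 1, wherein acquiring the preset event stream comprises:
    acquiring a preset log stream, wherein the preset log stream corresponds to the normal log execution path;
    parsing the logs in the preset log stream with a preset log parsing algorithm to obtain a plurality of log templates, wherein each log template corresponds to one event;
    converting the preset log stream into the preset event stream corresponding to the preset log stream according to the plurality of log templates.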
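  5. The method according to claim 1, wherein determining the dependent event pairs in the preset event stream comprises:
    determining a candidate successor event set corresponding to each event in the preset event stream, judging whether a first preset dependency relationship is satisfied between each event and each candidate successor event in the candidate successor event set corresponding to that event, determining each candidate successor event that satisfies the first preset dependency relationship to be a successor event, and adding the successor event to a successor event set, wherein each event and one successor event form one dependent event pair.
  6. The method according to claim 5, wherein determining the candidate successor event set corresponding to each event in the preset event stream comprises:
    adding first events that exist between every two adjacent occurrences of each event in the preset event stream to an initial candidate successor event set;
    computing the conditional probability of each event and each first event;
    removing second events from the initial candidate successor event set to obtain the candidate successor event set corresponding to each event, wherein the conditional probability of each event and a second event is less than a preset conditional probability threshold.
  7. The method according to claim 5, wherein judging whether the first preset dependency relationship is satisfied between each event and each candidate successor event in the candidate successor event set corresponding to that event comprises:
    computing the unconditional distribution of the waiting time of each candidate successor event in the candidate successor event set corresponding to each event, computing the conditional distribution of the waiting time of each candidate successor event relative to each event, and determining, according to the unconditional distribution and the conditional distribution, whether the first preset dependency relationship is satisfied between each event and each candidate successor event, wherein the waiting time represents the time difference between the occurrence time of each event and the occurrence time of each candidate successor event.
  8. The method according to claim 7, wherein determining, according to the unconditional distribution and the conditional distribution, whether the first preset dependency relationship is satisfied between each event and each candidate successor event comprises:
    when the unconditional distribution and the conditional distribution conform to a normal distribution, determining that the first preset dependency relationship is satisfied between each event and each candidate successor event.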
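  9. The method according to claim 1, wherein there are a plurality of dependent event pairs, and determining the transition interval range corresponding to the dependent event pairs comprises:
    acquiring the time difference sequence corresponding to each dependent event pair in the preset event stream, clustering the time difference sequence, and determining the transition interval range corresponding to each dependent event pair according to the time distribution within the clusters, wherein the time difference sequence contains the time differences between adjacent occurrences of the two events of each dependent event pair.
  10. The method according to claim 9, wherein determining the transition interval range corresponding to each dependent event pair according to the time distribution within the clusters comprises:
    determining the transition interval range corresponding to each dependent event pair according to the maximum and minimum values in the cluster; or
    determining the transition interval range corresponding to each dependent event pair according to a confidence interval of the time distribution in the cluster.
  11. The method according to claim 9, wherein determining the transition interval range corresponding to each dependent event pair according to the time distribution within the clusters comprises:
    testing each dependent event pair with a preset statistical test method and, when the test is passed, determining the transition interval range corresponding to each dependent event pair according to the time distribution within the cluster.
  12. The method according to claim 3, further comprising, after the maximum spanning tree is generated:
    computing the detour probability that a detour path exists between the two nodes corresponding to any dependent event pair in the maximum spanning tree and, when the detour probability is greater than a preset detour probability threshold, completing the edge between the two nodes.
  13. The method according to claim 12, wherein completing the edge between the two nodes comprises:
    taking the sum of the weights on all edges traversed by a target detour path between the two nodes as the weight of the edge between the two nodes, wherein the target detour path is the path whose traversed edges have the largest weight sum.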
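  14. An anomaly detection method, comprising:
    acquiring an event stream to be detected, wherein the event stream to be detected corresponds to a log execution path to be detected;
    comparing the event stream to be detected with an event topology diagram, wherein the event topology diagram is generated with the topology diagram generation method according to any one of claims 1-13;
    determining, according to the comparison result, whether the event stream to be detected is abnormal.
  15. The method according to claim 14, wherein the event topology diagram contains a plurality of nodes, a node of the plurality of nodes represents an event in the dependent event pairs, and the connection between two nodes of the plurality of nodes carries the transition probability and the transition interval range of the dependent event pair represented by the two nodes;
    comparing the event stream to be detected with the event topology diagram and determining, according to the comparison result, whether the event stream to be detected is abnormal comprises:
    looking up, in the event topology diagram, the target event corresponding to each event in the event stream to be detected;
    when the next event after each event does not correspond to a child node of the target event, determining that the event stream to be detected is abnormal.
  16. The method according to claim 15, further comprising:
    when the next event after each event corresponds to a first child node of the target event, acquiring a first time interval between each event and the next event;
    acquiring the transition interval range corresponding to the target event and the first child node;
    when the first time interval is not within the transition interval range, determining that the event stream to be detected is abnormal.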
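  17. A topology diagram generation apparatus, comprising:
    a preset event stream acquisition module, configured to acquire a preset event stream, wherein the preset event stream corresponds to a normal log execution path;
    a dependent event pair determination module, configured to determine dependent event pairs in the preset event stream;
    a transition interval range determination module, configured to determine a transition interval range corresponding to the dependent event pairs, wherein the transition interval represents the time difference between the occurrence times of the two events of a dependent event pair;
    a topology diagram generation module, configured to generate an event topology diagram according to a transition probability corresponding to the dependent event pairs and the transition interval range, wherein the transition probability represents the conditional probability between the two events of a dependent event pair.
  18. An anomaly detection apparatus, comprising:
    an event stream acquisition module, configured to acquire an event stream to be detected, wherein the event stream to be detected corresponds to a log execution path to be detected;
    a comparison module, configured to compare the event stream to be detected with an event topology diagram, wherein the event topology diagram is generated with the topology diagram generation method according to any one of claims 1-13;
    an anomaly detection module, configured to determine, according to the comparison result, whether the event stream to be detected is abnormal.
  19. A computer device, comprising a processor and a memory;
    the processor being configured to execute a program stored in the memory to implement the method according to any one of claims 1-16.
  20. A storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1-16.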
PCT/CN2020/130033 2019-12-03 2020-11-19 Method for generating topology diagram, anomaly detection method, device, apparatus, and storage medium WO2021109874A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US17/782,519 US11797360B2 (en) 2019-12-03 2020-11-19 Method for generating topology diagram, anomaly detection method, device, apparatus, and storage medium
EP20896927.9A EP4071616A4 (en) 2019-12-03 2020-11-19 METHOD FOR GENERATING A TOPOLOGY DIAGRAM, ANOMALY DETECTION METHOD, APPARATUS, APPARATUS AND STORAGE MEDIUM

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911222482.4A CN112905370A (zh) 2019-12-03 2019-12-03 Method for generating topology diagram, anomaly detection method, device, apparatus, and storage medium
CN201911222482.4 2019-12-03

Publications (1)

Publication Number Publication Date
WO2021109874A1 true WO2021109874A1 (zh) 2021-06-10

Family

ID=76104106

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/130033 WO2021109874A1 (zh) 2019-12-03 2020-11-19 Method for generating topology diagram, anomaly detection method, device, apparatus, and storage medium

Country Status (4)

Country Link
US (1) US11797360B2 (zh)
EP (1) EP4071616A4 (zh)
CN (1) CN112905370A (zh)
WO (1) WO2021109874A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113312239B (zh) * 2021-06-11 2024-03-15 腾讯云计算(北京)有限责任公司 一种数据检测方法、装置、电子设备及介质
CN113609631B (zh) * 2021-08-16 2023-11-14 傲林科技有限公司 基于事件网络拓扑图的创建方法、装置及电子设备

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105630800A (zh) * 2014-10-29 2016-06-01 杭州师范大学 一种节点重要性排序的方法和系统
US20190057138A1 (en) * 2017-08-18 2019-02-21 Vmware, Inc. Presenting a temporal topology graph of a computing environment at a graphical user interface
CN110427299A (zh) * 2019-07-19 2019-11-08 腾讯科技(深圳)有限公司 微服务系统应用的日志处理方法、相关设备及系统
CN110515758A (zh) * 2019-08-27 2019-11-29 北京博睿宏远数据科技股份有限公司 一种故障定位方法、装置、计算机设备及存储介质

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9367809B2 (en) * 2013-10-11 2016-06-14 Accenture Global Services Limited Contextual graph matching based anomaly detection
US10417063B2 (en) * 2017-06-28 2019-09-17 Microsoft Technology Licensing, Llc Artificial creation of dominant sequences that are representative of logged events
US10678610B2 (en) * 2018-04-11 2020-06-09 Oracle International Corporation Using and updating topological relationships amongst a set of nodes in event clustering
US20200160189A1 (en) * 2018-11-20 2020-05-21 International Business Machines Corporation System and Method of Discovering Causal Associations Between Events
CN110147387B (zh) * 2019-05-08 2023-06-09 腾讯科技(上海)有限公司 一种根因分析方法、装置、设备及存储介质
US11683237B2 (en) * 2019-06-20 2023-06-20 Koninklijke Philips N.V. Method to enhance system analysis

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4071616A4

Also Published As

Publication number Publication date
US11797360B2 (en) 2023-10-24
EP4071616A4 (en) 2024-01-24
EP4071616A1 (en) 2022-10-12
US20230004451A1 (en) 2023-01-05
CN112905370A (zh) 2021-06-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20896927

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2020896927

Country of ref document: EP

Effective date: 20220704