CN117828136A - Causal weight graph generation method and device and root cause analysis method and device

Info

Publication number: CN117828136A
Application number: CN202410006396.4A
Authority: CN
Inventors: 邱伟航; 张铁华; 赵子铭
Original assignee: Advanced Nova Technology Singapore Holdings Ltd
Current assignee: Advanced Nova Technology Singapore Holdings Ltd
Priority date: 2024-01-03
Filing date: 2024-01-03
Publication date: 2024-04-05

Abstract

The embodiments of this specification provide a causal weight graph generation method and device and a root cause analysis method and device. In the causal weight graph generation method, multimodal data on calls between services is acquired; a service call graph is constructed from the multimodal data; implicit causal relationships between nodes are mined based on the service call graph; an edge weight is obtained for each directed edge from the directed edges in the service call graph and the point information of each node; and a causal weight graph is constructed based on the service call graph, the implicit causal relationships, and the edge weights.

Description

Causal weight graph generation method and device and root cause analysis method and device

Technical Field

The embodiments of this specification relate to the technical field of artificial intelligence, and in particular to a causal weight graph generation method and device and a root cause analysis method and device.

Background

Computer processing commonly involves calls among a large number of services, which may even be scattered across different architectures. As a result, a service typically faces many services to invoke, many configurations, and long call links. How to locate anomalies through root cause analysis is therefore a problem to be solved.

Disclosure of Invention

In view of the foregoing, the embodiments of this specification provide a causal weight graph generation method and apparatus and a root cause analysis method and apparatus. With the technical solution of these embodiments, implicit causal relationships between services can be mined to generate a causal weight graph that represents the causal relationships between services, and a root cause analysis method based on that causal weight graph can locate anomalies more accurately.

According to one aspect of the embodiments of this specification, there is provided a method for generating a causal weight graph, comprising: acquiring multimodal data on calls between services, wherein the multimodal data comprises link data; constructing a service call graph from the multimodal data, wherein the service call graph is composed of nodes representing the respective services and directed edges representing call relationships, and each node has point information obtained from the multimodal data; mining implicit causal relationships between nodes based on the service call graph; obtaining the edge weight of each directed edge from the directed edges in the service call graph and the point information of each node; and constructing a causal weight graph based on the service call graph, the implicit causal relationships, and the edge weights.

According to another aspect of the embodiments of this specification, there is also provided a method for root cause analysis, wherein the causal weight graph used is obtained by any of the methods for generating a causal weight graph described above, the method comprising: screening out a subgraph from the causal weight graph according to the data to be analyzed; and obtaining a root cause analysis result based on the subgraph by using the PageRank algorithm.

According to another aspect of the embodiments of this specification, there is also provided an apparatus for generating a causal weight graph, comprising: a data acquisition unit that acquires multimodal data on calls between services, wherein the multimodal data comprises link data; a service call graph construction unit that constructs a service call graph from the multimodal data, wherein the service call graph is composed of nodes representing the respective services and directed edges representing call relationships, and each node has point information obtained from the multimodal data; a relationship mining unit that mines implicit causal relationships between nodes based on the service call graph; an edge weight obtaining unit that obtains the edge weight of each directed edge from the directed edges in the service call graph and the point information of each node; and a causal weight graph construction unit that constructs a causal weight graph based on the service call graph, the implicit causal relationships, and the edge weights.

According to another aspect of the embodiments of this specification, there is also provided an apparatus for root cause analysis, wherein the causal weight graph used is obtained by any of the methods for generating a causal weight graph described above, the apparatus comprising: a subgraph screening unit that screens out a subgraph from the causal weight graph according to the data to be analyzed; and a root cause analysis unit that obtains a root cause analysis result based on the subgraph by using the PageRank algorithm.

According to another aspect of the embodiments of this specification, there is also provided an electronic device including: at least one processor, a memory coupled with the at least one processor, and a computer program stored on the memory, the at least one processor executing the computer program to implement any of the above methods for generating a causal weight graph or for root cause analysis.

Drawings

A further understanding of the nature and advantages of the embodiments herein may be realized by reference to the following drawings. In the drawings, similar components or features may have the same reference numerals.

FIG. 1 illustrates a flowchart of one example of a causal weight graph generation method provided in accordance with one embodiment of the present description.

FIG. 2 illustrates a schematic diagram of one example of a service invocation graph, according to one embodiment of the present description.

FIG. 3 illustrates a flowchart of one example for mining implicit causal relationships according to one embodiment of the present description.

FIG. 4 illustrates a schematic diagram of one example of determining the direction of an undirected causal edge using a D separation method according to one embodiment of the present description.

FIG. 5 illustrates a schematic diagram of one example of an implicit causal graph according to one embodiment of the present description.

FIG. 6 illustrates a schematic diagram of one example of a causal weight graph according to one embodiment of the present description.

FIG. 7 shows a flowchart of an example of a root cause analysis method provided according to another embodiment of the present specification.

FIG. 8 shows a schematic diagram of an example of a subgraph according to another embodiment of the present description.

FIG. 9 shows a block diagram of one example of a causal weight graph generating apparatus according to an embodiment of the present specification.

FIG. 10 shows a block diagram of one example of a root cause analysis apparatus according to an embodiment of the present specification.

FIG. 11 shows a block diagram of an electronic device for implementing the causal weight graph generation method of an embodiment of the present specification.

FIG. 12 shows a block diagram of an electronic device for implementing the root cause analysis method according to an embodiment of the present specification.

Detailed Description

The subject matter described herein will be discussed below with reference to example embodiments. It should be appreciated that these embodiments are discussed only to enable a person skilled in the art to better understand and thereby practice the subject matter described herein, and are not limiting of the scope, applicability, or examples set forth in the claims. Changes may be made in the function and arrangement of elements discussed without departing from the scope of the embodiments herein. Various examples may omit, replace, or add various procedures or components as desired. In addition, features described with respect to some examples may be combined in other examples as well.

As used herein, the term "comprising" and variations thereof are open-ended terms meaning "including, but not limited to". The term "based on" means "based at least in part on". The terms "one embodiment" and "an embodiment" mean "at least one embodiment". The term "another embodiment" means "at least one other embodiment". The terms "first," "second," and the like may refer to different or the same objects. Other definitions, whether explicit or implicit, may be included below. Unless the context clearly indicates otherwise, the definition of a term is consistent throughout this specification.

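In view of the foregoing, the embodiments of this specification provide a causal weight graph generation method and apparatus and a root cause analysis method and apparatus. In the causal weight graph generation method, multimodal data on calls between services is acquired; a service call graph is constructed from the multimodal data; implicit causal relationships between nodes are mined based on the service call graph; an edge weight is obtained for each directed edge from the directed edges in the service call graph and the point information of each node; and a causal weight graph is constructed based on the service call graph, the implicit causal relationships, and the edge weights. With the technical solution of these embodiments, implicit causal relationships between services can be mined to generate a causal weight graph that represents the causal relationships between services, and a root cause analysis method based on that causal weight graph can locate anomalies more accurately.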

The following provides a detailed description of a causal weight graph generating method and device, and a root cause analyzing method and device according to the embodiments of the present specification with reference to the accompanying drawings.

The causal weight graph generation method and device and the root cause analysis method and device provided by the embodiments of this specification can be applied to scenarios involving calls among multiple services, for example, a test environment (Testing Environment). The test environment is a description of the software and hardware environment in which tests are run, together with any other software that interacts with the software under test, including drivers and stubs. In this specification, an invoked service may also be understood as an instance, an application, and so on.

In addition, the causal weight graph generation method and apparatus and the root cause analysis method and apparatus provided in the embodiments of this specification may be applied in a microservice architecture, in which each function is split out into a discrete service so that the solution is decoupled.

FIG. 1 illustrates a flowchart of one example 100 of a causal weight graph generation method provided in accordance with one embodiment of the present description.

As shown in FIG. 1, at 110, multimodal data for calls between services may be obtained.

In embodiments of the present description, multimodal data may be generated as a result of calls between services, and may include link data (Trace). In addition, the multimodal data may also include log data (Log) and/or metric data (Metric), and the like. The link data may form a trace link, which may be used to reflect the call relationships between services.

At 120, a service call graph may be constructed from the multimodal data.

The service call graph may be composed of nodes and directed edges. Each node may represent one service, and each directed edge may represent the call relationship between the two services characterized by the two nodes it connects. Because such a call relationship has actually occurred in the multimodal data, the call relationship represented by a directed edge in the service call graph may be regarded as a strong causal relationship between the two nodes.

The directed edges in the service call graph may include unidirectional edges and bidirectional edges. For example, a unidirectional edge pointing from one node to another may indicate that the service of the first node can invoke the service of the other node. The two services represented by two nodes connected by a bidirectional edge can invoke each other. FIG. 2 illustrates a schematic diagram of one example of a service call graph, according to one embodiment of the present description.

In the service call graph, each node may have corresponding point information, which may serve as the node features of that node and may be obtained from the multimodal data. In one example, each service may generate its own log data and/or metric data within the log data and/or metric data included in the multimodal data, so that this log data and/or metric data may serve as the point information of the node corresponding to the service.

In one manner of construction, the service call graph may be constructed based on the link data in the multimodal data. In the link data, the services form a link through their mutual call relationships. Accordingly, each service can be represented by a node, each call relationship between services by a directed edge, and the service call graph is constructed from these nodes and directed edges. After the service call graph is constructed, the log data and/or metric data included in the multimodal data can be used to enrich the point information of each node and thereby add to its node features.

In one example, the constructed service call graph may be updated. When new link data is acquired, the new link data may include different services and/or different call relationships, so that the service call graph may be updated according to the new link data to add new services and/or call relationships.
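The construction and update steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each piece of link data has already been reduced to `(caller, callee)` service pairs, and the function names and the dictionary layout for point information are hypothetical.

```python
def build_service_call_graph(spans, point_info=None):
    """Build a service call graph from link data.

    spans: iterable of (caller_service, callee_service) pairs (assumed format).
    point_info: optional dict mapping service -> dict of log/metric features.
    Returns (nodes, edges): nodes maps service -> point information,
    edges is a set of directed (caller, callee) pairs.
    """
    nodes = {}
    edges = set()
    for caller, callee in spans:
        nodes.setdefault(caller, {})
        nodes.setdefault(callee, {})
        edges.add((caller, callee))
    # Enrich node features with log and/or metric data.
    for service, info in (point_info or {}).items():
        if service in nodes:
            nodes[service].update(info)
    return nodes, edges


def update_service_call_graph(nodes, edges, new_spans):
    """Add services and call relationships found in newly acquired link data."""
    for caller, callee in new_spans:
        nodes.setdefault(caller, {})
        nodes.setdefault(callee, {})
        edges.add((caller, callee))
    return nodes, edges
```

A bidirectional edge is simply stored as two directed pairs, one in each direction.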

At 130, implicit causal relationships between nodes may be mined based on the service call graph.

The call relationships represented by the directed edges in the service call graph may be regarded as explicit causal relationships, whereas implicit causal relationships are not explicitly shown by directed edges in the service call graph. An implicit causal relationship may be a callable relationship between two services: because no actual call between the two services has occurred, the relationship cannot be represented in the service call graph and is therefore implicit.

In the embodiment of the specification, the association between the nodes can be determined based on the point information of the nodes from the data level, so that the implicit causal relationship is mined.

FIG. 3 illustrates a flowchart of one example 300 for mining implicit causal relationships according to one embodiment of the present description.

As shown in FIG. 3, at 131, undirected causal edges between the nodes in the service call graph may be determined from the point information of the nodes.

In this example, the association between nodes may be calculated from the point information of the respective nodes. The association may represent the dependency between services and may be represented by an undirected causal edge. An association between two services may indicate that one of the services can invoke the other, or that the two services can invoke each other.

In one example, the PC algorithm may be used to determine the undirected causal edges between the nodes based on the point information of the nodes. The PC algorithm is applied under the assumption that the causal Markov condition holds; this condition makes it possible to distinguish causation from correlation, so the PC algorithm can be used to determine strong and weak dependencies between services.

In one example, the undirected causal edges between the nodes may be mined from the point information of the nodes by means of conditional cross entropy. The conditional cross entropy metric can be used to determine the dependency between services: the metric may be calculated for any two nodes to determine whether a dependency exists between them. The conditional cross entropy metric G² can be calculated with the following formula:

where x and y represent the two nodes for which the conditional cross entropy metric is to be computed; z represents a node acting as a hidden variable, which may be any other node in the service call graph and may have explicit causal relationships with nodes x and y respectively, so that it can affect the dependency between x and y. P(x, y|z) represents the joint distribution of x and y given the participation of the service characterized by z, and can be derived from the point information of nodes x, y, and z. For example, if x is a front-end service, y is a back-end service, and z is an intermediate service, P(x, y|z) may represent the probability that service x invokes service y with service z participating. P(x|z) represents the distribution of x given the participation of the service characterized by z, and can be obtained from the point information of nodes x and z; P(y|z) represents the distribution of y given the participation of the service characterized by z, and can be obtained from the point information of nodes y and z; P(z) represents the probability distribution of z, and can be obtained from the point information of node z. Computing these conditional probabilities exposes the dependencies among microservices, which supports more effective root cause analysis.
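For reference, a standard form of the G² statistic used for conditional independence testing in the PC algorithm can be written with the symbols defined above. This is a sketch under assumptions: the sample count N and the summation over discretized values of x, y, and z are not stated in the source, and the formula in the original filing may differ.

```latex
G^2 = 2N \sum_{x,\,y,\,z} P(z)\, P(x, y \mid z)\,
      \ln \frac{P(x, y \mid z)}{P(x \mid z)\, P(y \mid z)}
```

When x and y are conditionally independent given z, the ratio inside the logarithm is 1 and G² is 0; larger values indicate a stronger dependency.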

After the conditional cross entropy metric is calculated from the point information, whether a dependency exists between the two corresponding nodes can be determined from the metric. In one manner of determination, a metric threshold may be preset: when the calculated metric is greater than the metric threshold, it may be determined that a dependency exists between the two nodes concerned; otherwise, it may be determined that no dependency exists between the two nodes.

For two nodes, when the dependency relationship between the two nodes is determined, the existence of an undirected causal edge between the two nodes can be further determined.
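The threshold test can be illustrated with a short sketch. It assumes the point information of x, y, and z has been discretized into joint observations `(x, y, z)`, and it uses the count-based form of the G² statistic; the function names and the threshold value are hypothetical.

```python
import math
from collections import Counter


def g_squared(samples):
    """G^2 statistic for testing x independent of y given z.

    samples: iterable of discrete (x, y, z) observations (assumed format).
    """
    samples = list(samples)
    n_xyz = Counter(samples)
    n_xz = Counter((x, z) for x, _, z in samples)
    n_yz = Counter((y, z) for _, y, z in samples)
    n_z = Counter(z for _, _, z in samples)
    total = 0.0
    for (x, y, z), observed in n_xyz.items():
        # Count expected under conditional independence of x and y given z.
        expected = n_xz[(x, z)] * n_yz[(y, z)] / n_z[z]
        total += observed * math.log(observed / expected)
    return 2.0 * total


def has_undirected_causal_edge(samples, metric_threshold):
    """An undirected causal edge exists when the metric exceeds the threshold."""
    return g_squared(samples) > metric_threshold
```

In practice the threshold would typically be chosen from a chi-square distribution at a given significance level; the source only states that a threshold is preset.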

At 133, a direction may be determined for each undirected causal edge based on the directed edges in the service call graph. The direction of an undirected causal edge may be unidirectional or bidirectional.

In one example, the D-separation method may be used to determine the direction of each undirected causal edge from the directed edges in the service call graph. D-separation is a graphical method for determining whether variables are conditionally independent.

FIG. 4 illustrates a schematic diagram of one example of determining the direction of an undirected causal edge using the D-separation method according to one embodiment of the present description. As shown in FIG. 4, solid lines represent explicit causal relationships and the dashed line represents an implicit undirected causal edge. Since node X can call node Y and node Y can call node Z, it can be inferred that the undirected causal edge between node X and node Z points from X to Z.
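The inference in FIG. 4 can be sketched as a reachability check over the directed edges. This is a deliberately simplified stand-in, not a full D-separation test: it only orients an undirected causal edge along an existing directed call path, and the function names are hypothetical. Edges with no call path in either direction are left unoriented, which is an assumption of this sketch.

```python
from collections import deque


def reachable(directed_edges, source, target):
    """Return True if target can be reached from source along directed edges."""
    adjacency = {}
    for u, v in directed_edges:
        adjacency.setdefault(u, []).append(v)
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt == target:
                return True
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def orient(undirected_edge, directed_edges):
    """Return the directed implicit edges inferred for one undirected causal edge.

    One pair means a unidirectional edge, two pairs a bidirectional edge,
    and an empty list means no direction could be inferred.
    """
    a, b = undirected_edge
    oriented = []
    if reachable(directed_edges, a, b):
        oriented.append((a, b))
    if reachable(directed_edges, b, a):
        oriented.append((b, a))
    return oriented
```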

At 135, implicit causal relationships may be derived based on the undirected causal edges and the corresponding directions.

A direction can be determined for each undirected causal edge, so each undirected causal edge together with its direction yields an implicit causal relationship. Each implicit causal relationship can be represented by an implicit causal edge, and each implicit causal edge connects two nodes to represent the implicit causal relationship between them.

After all the implicit causal relationships are obtained, an implicit causal relationship graph can be constructed from the implicit causal relationships, serving as implicit causal edges, and the nodes in the service call graph. The implicit causal relationship graph consists of the nodes and the implicit causal edges. FIG. 5 illustrates a schematic diagram of one example of an implicit causal graph according to one embodiment of the present description. As shown in FIG. 5, the dashed lines represent the implicit causal edges.

Returning to FIG. 1, at 140, the edge weight of each directed edge may be obtained from the directed edges in the service call graph and the point information of each node.

Each directed edge may obtain an edge weight, and the edge weight of each directed edge may represent a degree of association or dependency between two services characterized by two nodes to which the directed edge is connected, or a degree of importance of one of the services to the other. The greater the edge weight, the higher the degree of dependence.

For each directed edge, the edge weight may be derived from the node features of the two nodes that the directed edge connects, and the node features may include the point information of the nodes.

In one example, a graph attention network (GAT, Graph Attention Network) model may be used to derive the edge weight of each directed edge from the directed edges in the service call graph and the point information of each node.

The graph attention network is a graph neural network that uses a self-attention mechanism. In a manner similar to self-attention in a Transformer, the network computes the attention of a node in the graph relative to each of its adjacent nodes, fuses the node's own features with the attention features to form the node features of that node, and computes the edge weights on this basis.

In this example, the edge weights may be calculated using the following attention score formula:

$$e_{ij} = a\left(W\vec{h}_i,\ W\vec{h}_j\right)$$

where $e_{ij}$ represents the attention coefficient, $a(\cdot)$ represents the self-attention calculation, $\vec{h}_i$ and $\vec{h}_j$ represent the node features of the two nodes, and $W$ represents a matrix that maps the node features into a new space. The attention coefficient $e_{ij}$ can be used as an edge weight to characterize the degree of association between the two nodes.
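A minimal sketch of one attention coefficient follows. The source only describes a self-attention calculation over the mapped features of the two nodes; realizing it as a single linear layer followed by LeakyReLU with slope 0.2, as in the commonly used GAT formulation, is an assumption here. In a trained model `W` and `a` would be learned parameters, whereas the values in the usage example are hand-picked for illustration.

```python
def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def attention_coefficient(W, a, h_i, h_j):
    """Compute e_ij = LeakyReLU(a . [W h_i || W h_j]).

    W maps node features into a new space; a is the attention vector
    applied to the concatenation of the two mapped feature vectors.
    """
    concatenated = matvec(W, h_i) + matvec(W, h_j)  # list concatenation
    return leaky_relu(sum(ak * zk for ak, zk in zip(a, concatenated)))


# Hand-picked illustrative values (assumptions, not learned parameters).
W = [[1.0, 0.0], [0.0, 1.0]]
e_12 = attention_coefficient(W, [1.0, 1.0, 1.0, 1.0], [1.0, 2.0], [3.0, 4.0])
```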

After the edge weights of the directed edges are obtained, the outgoing edges of each node (i.e., the directed edges pointing from that node to other nodes) may be determined, and the edge weights of those outgoing edges may be normalized. The normalized values may then be used as the edge weights.

Note that the order in which operations 130 and 140 are executed is not limited: 130 may be performed before 140, 140 may be performed before 130, or the two may be performed simultaneously.
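The per-node normalization of outgoing edges can be sketched as below. The source does not name the normalization function; softmax is assumed here because it is the usual choice for attention coefficients and tolerates negative raw values. The function name and the edge dictionary layout are hypothetical.

```python
import math


def normalize_out_edges(edge_weights):
    """Softmax-normalize raw edge weights over each node's outgoing edges.

    edge_weights: dict mapping (source, destination) -> raw weight.
    Returns a dict with the same keys whose values sum to 1 per source node.
    """
    by_source = {}
    for (src, dst), weight in edge_weights.items():
        by_source.setdefault(src, []).append((dst, weight))
    normalized = {}
    for src, outgoing in by_source.items():
        peak = max(weight for _, weight in outgoing)  # for numerical stability
        exps = [(dst, math.exp(weight - peak)) for dst, weight in outgoing]
        total = sum(value for _, value in exps)
        for dst, value in exps:
            normalized[(src, dst)] = value / total
    return normalized
```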

At 150, a causal weight graph may be constructed based on the service call graph, implicit causal relationships, and edge weights.

In one example, the implicit causal relationships and the edge weights may be added to the service call graph to obtain the causal weight graph. The causal weight graph includes the nodes, the directed edges and their edge weights, and the implicit causal edges representing implicit causal relationships. FIG. 6 illustrates a schematic diagram of one example of a causal weight graph according to one embodiment of the present description. As shown in FIG. 6, solid lines represent directed edges, w₁₂ and w₁₃ represent exemplary edge weights, and dashed lines represent implicit causal edges.

In one example, before the implicit causal relationships are added to the service call graph, they may be screened against service-based calling rules.

The calling rule may be a business-level rule, and may be obtained from expert experience. The calling rules may be stored in the form of a rule base. In the calling rule, a calling direction allowed by each service may be included, for example, other services or other service types allowed to be called, and the like.

In this example, each implicit causal relationship may be compared with the service-based calling rules. When the call direction of an implicit causal relationship conflicts with the call direction specified by the calling rules, that implicit causal relationship may be determined to violate the rules and may be deleted. Implicit causal relationships that conform to the calling rules may be retained as the screened implicit causal relationships.
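The screening step can be sketched as a lookup against a rule base. The rule base format (a mapping from each caller to the set of services it is allowed to call) and the service names are hypothetical, and treating a service with no rule entry as not allowed to call anything is an assumption of this sketch.

```python
def screen_implicit_edges(implicit_edges, allowed_callees):
    """Keep only implicit causal edges whose call direction the rules allow.

    implicit_edges: iterable of (caller, callee) pairs.
    allowed_callees: dict mapping caller -> set of services it may call.
    """
    kept = []
    for caller, callee in implicit_edges:
        if callee in allowed_callees.get(caller, set()):
            kept.append((caller, callee))
    return kept


# Hypothetical rule base: the front end may call back-end services,
# but back-end services may not call the front end.
rules = {"frontend": {"orders", "payments"}, "orders": {"payments"}}
screened = screen_implicit_edges(
    [("frontend", "payments"), ("payments", "frontend")], rules
)
```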

After screening the implicit causal relationship, a causal weight graph can be constructed based on the service call graph, the screened implicit causal relationship and the edge weight.

In one example, each implicit causal relationship may be assigned a value that may be used as an assigned edge weight for each implicit causal relationship, such that each implicit causal relationship may correspond to an assigned edge weight. The assigned edge weights of the implicit causal relationships can be the same or different.

In one manner of assignment, the assigned edge weight of each implicit causal relationship may be determined empirically. In another manner of assignment, a domain range may be determined that includes a number of edges representing causal relationships, such as edges formed by calls between two services, each edge having an edge weight. The average of the edge weights of the edges in the domain range can then be calculated and used as the assigned edge weight of each implicit causal relationship. In one way of determining the domain range, multiple pieces of link data may be selected as the domain range. In one example, the selected link data belong to the same business scope. Further, that business scope is the same as the business scope of the multimodal data.

In this example, the implicit causal relationships and the edge weights may first be added to the service call graph. Because adding the implicit causal relationships changes the number of edges (including directed edges and implicit causal edges) associated with each node in the service call graph, the edge weights need to be updated.

For each node, the edge weights of its outgoing edges and of its target implicit causal relationships may be updated based on the edge weights of the outgoing edges among the directed edges connected to the node and the assigned edge weights of the target implicit causal relationships, that is, the implicit causal relationships in which the node is the caller.

Specifically, for each node, the outgoing edges (i.e., the directed edges starting from the node) among the directed edges connected to the node, and the target implicit causal relationships in which the node is the caller, can be screened from the service call graph. A target implicit causal relationship represents a call relationship in which the service characterized by the node calls another service.

Then, the edge weights can be updated based on the edge weights corresponding to the screened out edges and assigned edge weights corresponding to the target implicit causal relationship. In one example, the edge weights of the screened out edges and the assigned edge weights of the target implicit causal relationship may be normalized, and the normalized weights may be used as the edge weights corresponding to each out edge and each target implicit causal relationship.
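The assignment and update steps can be sketched together. The sketch assumes the mean-over-domain-range manner of assignment, non-negative weights with a positive total per node, and simple sum normalization; the source says only that the weights are normalized, so the choice of normalization is an assumption, as are the function names.

```python
def mean_edge_weight(domain_edge_weights):
    """Assigned edge weight: average of the edge weights in the domain range."""
    weights = list(domain_edge_weights)
    return sum(weights) / len(weights)


def update_edge_weights(directed_weights, implicit_edges, assigned_weight):
    """Merge implicit edges into the graph and renormalize per source node.

    directed_weights: dict mapping (source, destination) -> edge weight.
    implicit_edges: iterable of (caller, callee) implicit causal edges.
    Returns a dict in which each node's outgoing weights sum to 1.
    Assumes every node's outgoing weights have a positive total.
    """
    merged = dict(directed_weights)
    for edge in implicit_edges:
        merged.setdefault(edge, assigned_weight)
    totals = {}
    for (src, _), weight in merged.items():
        totals[src] = totals.get(src, 0.0) + weight
    return {edge: weight / totals[edge[0]] for edge, weight in merged.items()}
```

Nodes that gain no implicit edge keep their relative weights unchanged, since only the outgoing edges of the same source node are renormalized together.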

After the updated edge weights are obtained, a causal weight graph can be constructed based on the service call graph, the implicit causal relationships, and the updated edge weights.

With the above embodiments of causal weight graph generation, the generated causal weight graph includes not only the explicit call relationships but also the mined implicit causal relationships. Therefore, when root cause analysis is performed based on the causal weight graph, anomalies can be located more accurately, which improves the accuracy of root cause analysis.

FIG. 7 shows a flowchart of an example 700 of a root cause analysis method provided according to another embodiment of the present disclosure. The causal weight graph used in this embodiment may be obtained by any of the causal weight graph generation methods provided in this specification.

As shown in FIG. 7, at 710, subgraphs may be screened from the causal weight graph based on the data to be analyzed.

In this embodiment, the data to be analyzed may be any data that needs root cause analysis, such as link data to be analyzed. The data to be analyzed may include a number of calling or called services, and may also include the call relationships among those services. Thus, a subgraph that involves only the services in the data to be analyzed can be screened from the causal weight graph.
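Subgraph screening can be sketched as a filter over the weighted edges. The representation of the causal weight graph as a dictionary from `(source, destination)` to weight, and the function name, are assumptions made for illustration.

```python
def screen_subgraph(edge_weights, services_in_data):
    """Keep only edges whose two endpoints both appear in the data to be analyzed.

    edge_weights: dict mapping (source, destination) -> weight, covering both
    directed edges and implicit causal edges of the causal weight graph.
    services_in_data: iterable of services found in the data to be analyzed.
    """
    keep = set(services_in_data)
    return {
        (src, dst): weight
        for (src, dst), weight in edge_weights.items()
        if src in keep and dst in keep
    }
```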

At 720, root cause analysis results can be obtained based on the subgraph using the PageRank algorithm.

In one example, applying the PageRank algorithm to the subgraph yields a PageRank value for each node in the subgraph, which may be used to characterize the anomaly probability of the node. The PageRank algorithm proceeds iteratively until the PageRank values of the nodes are stable. The nodes can then be sorted by PageRank value, one or more nodes with the largest PageRank values are selected as abnormal nodes, and the abnormal nodes are output as the root cause analysis result.

In another example, the root cause analysis result may be obtained using the PageRank algorithm based on the subgraph and the initial PageRank values of the respective nodes in the subgraph.
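The iteration can be sketched in a few lines of plain Python. This is a generic weighted PageRank, not necessarily the variant used in the embodiments: the damping factor 0.85, the uniform starting values, the convergence tolerance, and the even redistribution of rank from nodes without outgoing edges are all assumptions. Every edge endpoint is assumed to be listed in `nodes`.

```python
def pagerank(edge_weights, nodes, damping=0.85, tol=1e-9, max_iter=1000):
    """Weighted PageRank over a subgraph.

    edge_weights: dict mapping (source, destination) -> non-negative weight.
    nodes: iterable of all node identifiers in the subgraph.
    Returns a dict mapping node -> PageRank value (values sum to 1).
    """
    nodes = list(nodes)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    out_total = {v: 0.0 for v in nodes}
    for (src, _), weight in edge_weights.items():
        out_total[src] += weight
    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[v] for v in nodes if out_total[v] == 0.0)
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for (src, dst), weight in edge_weights.items():
            new_rank[dst] += damping * rank[src] * weight / out_total[src]
        change = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if change < tol:  # values are stable
            break
    return rank


def top_suspects(rank, k=1):
    """Return the k nodes with the largest PageRank values."""
    return sorted(rank, key=rank.get, reverse=True)[:k]
```

Because edges point from caller to callee, rank accumulates on the services that many call paths lead to.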

In this example, an initial PageRank value may be preset for each node in the subgraph, and the initial PageRank value corresponding to each node may be derived from at least one of the following factors: the distance between the node and the root node, the order in which the service corresponding to the node is called by the service corresponding to the parent node of the node, whether the node is an abnormal node, and the like.

In one example, for each node, the greater the distance between the node and the root node, the greater the initial PageRank value the node corresponds to. When the initial PageRank value is affected by a number of factors, the distance between the node and the root node may affect the initial PageRank value as one factor. In one example, the distance between the node and the root node may be factored into the initial PageRank value. For example, when the distance between the node and the root node is 2, then 2 may be taken as part of the initial PageRank value.

In one example, a parent node may be associated with several child nodes, so that the parent node may call each child node separately, and the order in which each child node is called by the parent node (hereinafter, the calling order) may differ. The earlier the service represented by a node is called by the service corresponding to its parent node, the smaller the initial PageRank value of the node; accordingly, the later the service is called, the greater the initial PageRank value of the node.

In this example, the order of calls may be represented by a numerical value, the smaller the numerical value, the earlier the order of calls is represented; the larger the value, the later the call order. The value used to represent the order of calls may be part of the initial PageRank value. For example, if the parent node calls node 3 and then calls node 4, the value of the calling sequence corresponding to node 3 may be 1, and the value of the calling sequence corresponding to node 4 may be 2.

The calling order for each node can be obtained from the log data of each node. Because the log data can record the operation data of the corresponding service together with the corresponding timestamps, the calling order can be determined from the order of the timestamps: the earlier the timestamp, the earlier the position in the calling order.
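The timestamp-based ordering described above can be sketched as follows. This is a minimal illustration rather than the method of the specification: the `(parent, child, timestamp)` record layout and the function name `call_order_from_logs` are hypothetical, and real log data would first need to be parsed into such records.

```python
from collections import defaultdict

def call_order_from_logs(log_records):
    """Rank each child by the timestamp of its first call from its parent.

    log_records: iterable of (parent, child, timestamp) tuples
    (a hypothetical layout, not taken from the specification).
    Returns {(parent, child): order}, where order 1 is the earliest call.
    """
    first_call = defaultdict(dict)  # parent -> {child: earliest timestamp}
    for parent, child, ts in log_records:
        seen = first_call[parent]
        if child not in seen or ts < seen[child]:
            seen[child] = ts

    order = {}
    for parent, children in first_call.items():
        # Earlier timestamp means earlier position in the calling order.
        ranked = sorted(children, key=children.get)
        for position, child in enumerate(ranked, start=1):
            order[(parent, child)] = position
    return order

records = [
    (2, 4, 1050),  # node 2 calls node 4 at t=1050
    (2, 3, 1000),  # node 2 calls node 3 at t=1000 (earlier)
]
call_order = call_order_from_logs(records)
```

With these records, node 3 receives calling-order value 1 and node 4 receives value 2, matching the example given earlier for a parent that calls node 3 before node 4.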

In one example, the status of each node may be classified as normal or abnormal. The status may be known from a prior label, or from the operational status of each service during execution. Different values may be assigned to abnormal nodes and normal nodes. The value assigned to an abnormal node may be larger than that assigned to a normal node, so that root cause analysis tends to look for the cause of an anomaly among the known abnormal nodes, which can improve the efficiency of root cause analysis. The values assigned to abnormal nodes and normal nodes may be used as part of the initial PageRank value. For example, an abnormal node may be assigned a value of 2 and a normal node a value of 1.

In one example, when the initial PageRank value is determined together according to the distance between the node and the root node, the calling order corresponding to the node, and whether the node is abnormal, the initial PageRank value may be formed by the distance between the node and the root node, the value corresponding to the calling order, and the assignment of whether the node is abnormal.

Taking fig. 8 as an example, fig. 8 shows a schematic diagram of an example of a sub-graph according to another embodiment of the present specification. As shown in fig. 8, nodes 6 and 8 are known abnormal nodes, and the other nodes are normal nodes. When the subgraph is processed by the PageRank algorithm, initial PageRank values can be respectively assigned to the node 3, the node 4, the node 6 and the node 8.

For node 3, its distance from root node 1 is 2. Parent node 2 of node 3 calls its child nodes (node 3 and node 4) in the order node 3 first, then node 4. Therefore, the calling-order value of node 3 is 1. Since node 3 is a normal node, it is assigned the normal-node value of 1. Adding the three gives 4, which is the initial PageRank value of node 3.

For node 4, its distance from root node 1 is 2. The calling-order value of node 4 is 2. Since node 4 is a normal node, it is assigned the normal-node value of 1. Adding the three gives 5, which is the initial PageRank value of node 4.

For node 6, its distance from root node 1 is 3. Parent node 3 of node 6 calls its child nodes (node 5 and node 6) in the order node 5 first, then node 6. Therefore, the calling-order value of node 6 is 2. Since node 6 is an abnormal node, it is assigned the abnormal-node value of 2. Adding the three gives 7, which is the initial PageRank value of node 6.

For node 8, its distance from root node 1 is 3. Parent node 4 of node 8 calls its child nodes (node 7 and node 8) in the order node 7 first, then node 8. Therefore, the calling-order value of node 8 is 2. Since node 8 is an abnormal node, it is assigned the abnormal-node value of 2. Adding the three gives 7, which is the initial PageRank value of node 8.
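The arithmetic of the Fig. 8 example can be reproduced with a short sketch. It assumes the topology implied by the text (root node 1 calls node 2, node 2 calls nodes 3 and 4, node 3 calls nodes 5 and 6, node 4 calls nodes 7 and 8) and simply adds the three factors. The function name `initial_pagerank_values` and its parameters are illustrative only.

```python
from collections import deque

def initial_pagerank_values(children, root, call_order, abnormal,
                            abnormal_score=2, normal_score=1):
    """Sum distance-to-root, calling-order value, and status value per node.

    children:   {parent: [child, ...]} adjacency of the subgraph
    call_order: {node: calling-order value}; only these nodes are scored
    abnormal:   set of nodes known to be abnormal
    """
    # Breadth-first search gives each node's distance from the root.
    distance = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in distance:
                distance[child] = distance[node] + 1
                queue.append(child)

    values = {}
    for node, order in call_order.items():
        status = abnormal_score if node in abnormal else normal_score
        values[node] = distance[node] + order + status
    return values

# Topology inferred from the description of Fig. 8 (an assumption).
children = {1: [2], 2: [3, 4], 3: [5, 6], 4: [7, 8]}
call_order = {3: 1, 4: 2, 6: 2, 8: 2}
abnormal = {6, 8}

values = initial_pagerank_values(children, 1, call_order, abnormal)
print(values)  # {3: 4, 4: 5, 6: 7, 8: 7}
```

Plain addition is only one way of combining the factors; the text leaves room for other combinations.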

Fig. 9 shows a block diagram of one example of a causal weight graph generation device 900 according to an embodiment of the present specification.

As shown in fig. 9, the causal weight graph generation device 900 includes: a data acquisition unit 910, a service call graph construction unit 920, a relationship mining unit 930, an edge weight obtaining unit 940, and a causal weight graph construction unit 950.

The data acquisition unit 910 may be configured to acquire multi-modal data for calls between services, the multi-modal data including link data.

The service call graph construction unit 920 may be configured to construct a service call graph according to multi-modal data, where the service call graph is composed of nodes for representing respective services and directed edges for representing call relations, and each node corresponds to point information obtained from the multi-modal data.

The relationship mining unit 930 may be configured to mine implicit causal relationships between nodes based on the service call graph.

The edge weight obtaining unit 940 may be configured to obtain the edge weight corresponding to each directed edge according to each directed edge in the service call graph and the point information of each node.

The causal weight graph construction unit 950 may be configured to construct a causal weight graph based on the service call graph, the implicit causal relationship, and the edge weight.

In one example, the relationship mining unit 930 may include an undirected causal edge determination module, a direction determination module, and a relationship obtaining unit. The undirected causal edge determination module may be configured to determine undirected causal edges between the nodes in the service call graph based on the point information of the nodes. The direction determination module may be configured to determine a direction for each undirected causal edge based on the directed edges in the service call graph. The relationship obtaining unit may be configured to obtain the implicit causal relationship based on the undirected causal edges and the corresponding directions.

In one example, the undirected causal edge determination module may be further configured to determine the undirected causal edges among the nodes according to the point information of the nodes by using conditional cross entropy.

In one example, the direction determination module may be further configured to determine the direction for each undirected causal edge according to the directed edges in the service call graph by using the D-separation method.

In one example, the edge weight obtaining unit 940 may be further configured to obtain the edge weights corresponding to the directed edges according to the directed edges in the service call graph and the point information of the nodes by using a graph attention network model.
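To illustrate how an attention mechanism can turn node features into edge weights, the following sketch applies the standard graph attention scoring step (a LeakyReLU over a linear score of the two endpoint feature vectors, followed by a softmax) to each directed edge. It is not the model of the specification. It assumes the point information has already been encoded as numeric vectors, omits the learned feature projection, uses untrained hypothetical attention vectors `attn_src` and `attn_dst`, and normalizes over each node's outgoing edges.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attention_edge_weights(features, edges, attn_src, attn_dst):
    """Return {(src, dst): weight} for the given directed edges.

    features: {node: [float, ...]} numeric encoding of point information
    edges:    list of (src, dst) directed edges
    attn_src, attn_dst: the two halves of the attention vector, applied
              to the source and destination features respectively
    """
    scores = {}
    for src, dst in edges:
        raw = dot(attn_src, features[src]) + dot(attn_dst, features[dst])
        scores.setdefault(src, {})[dst] = leaky_relu(raw)

    weights = {}
    for src, by_dst in scores.items():
        # Softmax over the outgoing edges of src (shifted for stability).
        peak = max(by_dst.values())
        exps = {dst: math.exp(s - peak) for dst, s in by_dst.items()}
        total = sum(exps.values())
        for dst, value in exps.items():
            weights[(src, dst)] = value / total
    return weights

# Hypothetical two-dimensional features and attention vectors.
features = {2: [1.0, 0.0], 3: [0.5, 0.5], 4: [0.0, 1.0]}
edges = [(2, 3), (2, 4)]
edge_weights = attention_edge_weights(features, edges, [0.1, 0.2], [0.3, -0.4])
```

In a trained model the attention vectors and the feature projection would be learned; here they are fixed only so that the normalization step can be seen in isolation.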

In one example, the causal weight graph construction unit 950 may be further configured to: screen, according to service-based call rules, the implicit causal relationships that conform to the call rules from the implicit causal relationships; and construct the causal weight graph based on the service call graph, the screened implicit causal relationships, and the edge weights.

In one example, the causal weight graph construction unit 950 may include an edge weight updating module and a causal weight graph construction module. The edge weight updating module may be configured to, for each node, update the edge weights corresponding to each outgoing edge and each target implicit causal relationship, based on the edge weights corresponding to the outgoing edges among the directed edges connected to the node and the assigned edge weights corresponding to the target implicit causal relationships, the target implicit causal relationships being those implicit causal relationships that arise from calls made by the node. The causal weight graph construction module may be configured to construct the causal weight graph based on the service call graph, the implicit causal relationship, and the updated edge weights.
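The passage does not spell out the update rule here. One plausible reading, shown purely as an assumption, is a per-node renormalization: the weights of a node's outgoing call edges and the weights assigned to its implicit causal edges are pooled and rescaled so that they sum to 1. The function name `renormalize_outgoing_weights` and the sample weights are hypothetical.

```python
def renormalize_outgoing_weights(call_edge_weights, implicit_edge_weights):
    """Pool call edges and implicit causal edges, then rescale per source node.

    Both arguments map (src, dst) to a non-negative weight. If the same
    (src, dst) pair appears in both, the implicit weight takes precedence
    (an arbitrary choice made for this sketch).
    """
    combined = dict(call_edge_weights)
    combined.update(implicit_edge_weights)

    # Total weight leaving each source node.
    totals = {}
    for (src, _dst), weight in combined.items():
        totals[src] = totals.get(src, 0.0) + weight

    return {
        edge: weight / totals[edge[0]]
        for edge, weight in combined.items()
        if totals[edge[0]] > 0
    }

call_edges = {(2, 3): 0.6, (2, 4): 0.4}   # e.g. from an attention model
implicit_edges = {(2, 6): 0.5}            # assigned weight, hypothetical
updated = renormalize_outgoing_weights(call_edges, implicit_edges)
```

After the update, the three edges leaving node 2 carry weights proportional to 0.6, 0.4, and 0.5, and those weights sum to 1.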

Fig. 10 shows a block diagram of one example of a root cause analysis device 1000 according to an embodiment of the present specification. The causal weight graph used in the root cause analysis device 1000 may be obtained according to any of the causal weight graph generation methods provided in the present specification.

As shown in fig. 10, the root cause analysis device 1000 includes: a subgraph screening unit 1010 and a root cause analysis unit 1020.

The subgraph screening unit 1010 may be configured to screen subgraphs from the causal weight graph according to the data to be analyzed.

Root cause analysis unit 1020 may be configured to obtain root cause analysis results based on the subgraph using the PageRank algorithm.

In one example, root cause analysis unit 1020 may be further configured to: obtaining a root cause analysis result based on the subgraph and initial PageRank values corresponding to all nodes in the subgraph by using a PageRank algorithm, wherein the initial PageRank value corresponding to each node is obtained according to at least one of the following factors: the distance between the node and the root node, the order in which the service corresponding to the node is called by the service corresponding to the parent node of the node, and whether the node is abnormal.
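A weighted PageRank pass that takes per-node initial values can be sketched as below. The text does not state whether the initial values serve only as the starting vector or also as the teleport distribution; this sketch uses the normalized initial values for both, which is an assumption. The edge weights, damping factor, and function name `pagerank` are likewise illustrative.

```python
def pagerank(edges, initial, damping=0.85, iterations=100, tol=1e-10):
    """Weighted PageRank over a subgraph.

    edges:   {(src, dst): weight}; every endpoint must be a key of `initial`
    initial: {node: positive initial PageRank value}
    Returns {node: rank}, with ranks summing to 1.
    """
    nodes = sorted(initial)
    total_initial = sum(initial.values())
    prior = {n: initial[n] / total_initial for n in nodes}

    out_total = {n: 0.0 for n in nodes}
    for (src, _dst), weight in edges.items():
        out_total[src] += weight

    rank = dict(prior)
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is redistributed
        # according to the prior, as is the teleport mass.
        dangling = sum(rank[n] for n in nodes if out_total[n] == 0)
        new_rank = {n: (1 - damping + damping * dangling) * prior[n]
                    for n in nodes}
        for (src, dst), weight in edges.items():
            new_rank[dst] += damping * rank[src] * weight / out_total[src]
        delta = sum(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank

# Initial values from the Fig. 8 example; edge weights are hypothetical.
initial = {3: 4, 4: 5, 6: 7, 8: 7}
edges = {(3, 6): 1.0, (4, 8): 1.0}
ranks = pagerank(edges, initial)
top_candidate = max(ranks, key=ranks.get)
```

In this toy subgraph the highest-ranked node is one of the two abnormal nodes, which is the behavior the larger abnormal-node value is meant to encourage.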

Embodiments of the causal weight graph generation method and device, and of the root cause analysis method and device, according to the present specification are described above with reference to fig. 1 to 10.

The causal weight graph generation device and the root cause analysis device in the embodiments of the present specification may be implemented by hardware, by software, or by a combination of hardware and software. Taking software implementation as an example, the device in a logical sense is formed by the processor of the equipment in which it is located reading the corresponding computer program instructions from storage into memory. In the embodiments of the present specification, the causal weight graph generation device and the root cause analysis device may be implemented by, for example, an electronic device.

Fig. 11 shows a block diagram of an electronic device 1100 for implementing the causal weight graph generation method according to an embodiment of the present specification.

As shown in fig. 11, the electronic device 1100 may include at least one processor 1110, storage (e.g., non-volatile memory) 1120, memory 1130, and a communication interface 1140, and the at least one processor 1110, storage 1120, memory 1130, and communication interface 1140 are connected together via a bus 1150. The at least one processor 1110 executes at least one computer-readable instruction (i.e., the elements described above as being implemented in software) stored or encoded in memory.

In one embodiment, computer-executable instructions are stored in memory that, when executed, cause the at least one processor 1110 to: acquire multi-modal data for calls between services; construct a service call graph according to the multi-modal data; mine implicit causal relationships among nodes based on the service call graph; obtain the edge weight corresponding to each directed edge according to each directed edge in the service call graph and the point information of each node; and construct a causal weight graph based on the service call graph, the implicit causal relationship, and the edge weights.

It should be appreciated that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 1110 to perform the various operations and functions described above in connection with fig. 1-10 in various embodiments of the present specification.

Fig. 12 shows a block diagram of an electronic device 1200 for implementing the root cause analysis method according to an embodiment of the present disclosure.

As shown in fig. 12, the electronic device 1200 may include at least one processor 1210, storage (e.g., non-volatile memory) 1220, memory 1230, and a communication interface 1240, with the at least one processor 1210, storage 1220, memory 1230, and communication interface 1240 being connected together via a bus 1250. The at least one processor 1210 executes at least one computer-readable instruction (i.e., the elements described above as being implemented in software) stored or encoded in memory.

In one embodiment, computer-executable instructions are stored in memory that, when executed, cause the at least one processor 1210 to: screening out subgraphs from the causal weight graph according to the data to be analyzed; and obtaining root cause analysis results based on the subgraph by using PageRank algorithm.

It should be appreciated that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 1210 to perform the various operations and functions described above in connection with fig. 1-10 in various embodiments of the present description.

According to one embodiment, a program product, such as a machine-readable medium, is provided. The machine-readable medium may have instructions (i.e., elements described above implemented in software) that, when executed by a machine, cause the machine to perform the various operations and functions described above in connection with fig. 1-10 in various embodiments of the specification.

In particular, a system or apparatus provided with a readable storage medium having stored thereon software program code implementing the functions of any of the above embodiments may be provided, and a computer or processor of the system or apparatus may be caused to read out and execute instructions stored in the readable storage medium.

In this case, the program code itself read from the readable medium may implement the functions of any of the above-described embodiments, and thus the machine-readable code and the readable storage medium storing the machine-readable code form part of the present invention.

Computer program code required for the operation of portions of the present specification may be written in any one or more programming languages, including object-oriented programming languages such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, and Python; conventional programming languages such as C, Visual Basic 2003, Perl, COBOL 2002, PHP, and ABAP; dynamic programming languages such as Python, Ruby, and Groovy; or other programming languages. The program code may execute on the user's computer or as a stand-alone software package, or it may execute partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any form of network, such as a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet), or to a cloud computing environment, or the code may be used as a service, such as software as a service (SaaS).

Examples of readable storage media include floppy disks, hard disks, magneto-optical disks, optical disks (e.g., CD-ROMs, CD-Rs, CD-RWs, DVD-ROMs, DVD-RAMs, DVD-RWs), magnetic tapes, nonvolatile memory cards, and ROMs. Alternatively, the program code may be downloaded from a server computer or the cloud over a communications network.

The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

Not all steps or units in the above-mentioned flowcharts and system configuration diagrams are necessary, and some steps or units may be omitted according to actual needs. The order of execution of the steps is not fixed and may be determined as desired. The apparatus structures described in the above embodiments may be physical structures or logical structures, that is, some units may be implemented by the same physical entity, or some units may be implemented by multiple physical entities, or may be implemented jointly by some components in multiple independent devices.

The term "exemplary" used throughout this specification means "serving as an example, instance, or illustration," and does not mean "preferred" or "advantageous over other embodiments." The detailed description includes specific details for the purpose of providing an understanding of the described technology. However, the techniques may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts of the described embodiments.

Optional implementations of the embodiments of the present specification have been described in detail above with reference to the accompanying drawings. However, the embodiments of the present specification are not limited to the specific details of the foregoing implementations. Within the scope of the technical concept of the embodiments of the present specification, various simple modifications may be made to the technical solutions, and all such simple modifications fall within the protection scope of the embodiments of the present specification.

The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A method for generating a causal weight graph, comprising:

acquiring multi-modal data for calls between services, wherein the multi-modal data comprises link data;

constructing a service call graph according to the multi-modal data, wherein the service call graph is composed of nodes representing respective services and directed edges representing call relations, and each node corresponds to point information obtained from the multi-modal data;

mining implicit causal relations among nodes based on the service call graph;

obtaining edge weights corresponding to the directed edges according to the directed edges in the service call graph and the point information of the nodes; and

constructing a causal weight graph based on the service call graph, the implicit causal relationship, and the edge weights.

2. The method of claim 1, wherein the multimodal data further comprises log data and/or index data.

3. The method of claim 1 or 2, wherein mining implicit causal relationships between nodes based on the service call graph comprises:

determining undirected causal edges among all nodes according to the point information of all nodes in the service call graph;

determining a direction for each undirected causal edge according to the directed edges in the service call graph; and

obtaining the implicit causal relationship based on the undirected causal edges and the corresponding directions.

4. The method of claim 3, wherein determining the undirected causal edges between the nodes in the service invocation graph based on the point information for the nodes comprises:

determining the undirected causal edges among the nodes according to the point information of the nodes by using conditional cross entropy.

5. The method of claim 3, wherein determining directions for each undirected causal edge based on directed edges in the service invocation graph comprises:

determining the direction for each undirected causal edge according to the directed edges in the service call graph by using the D-separation method.

6. The method of claim 1 or 2, wherein obtaining the edge weights corresponding to the directed edges according to the directed edges in the service call graph and the point information of the nodes comprises:

obtaining the edge weights corresponding to the directed edges according to the directed edges in the service call graph and the point information of the nodes by using a graph attention network model.

7. The method of claim 1 or 2, wherein constructing a causal weight graph based on the service call graph, the implicit causal relationship, and the edge weight comprises:

screening, according to service-based call rules, the implicit causal relationships that conform to the call rules from the implicit causal relationships; and

constructing the causal weight graph based on the service call graph, the screened implicit causal relationships, and the edge weights.

8. The method of claim 1 or 2, wherein constructing a causal weight graph based on the service call graph, the implicit causal relationship, and the edge weight comprises:

for each node, updating the edge weights corresponding to each outgoing edge and each target implicit causal relationship, based on the edge weights corresponding to the outgoing edges among the directed edges connected to the node and the assigned edge weights corresponding to the target implicit causal relationships, the target implicit causal relationships being those implicit causal relationships that arise from calls made by the node; and

constructing the causal weight graph based on the service call graph, the implicit causal relationship, and the updated edge weights.

9. A method for root cause analysis, wherein the causal weight graph used is obtained according to the method of any of claims 1-8,

the method comprising:

screening out a subgraph from the causal weight graph according to the data to be analyzed; and

obtaining a root cause analysis result based on the subgraph by using a PageRank algorithm.

10. The method of claim 9, wherein deriving root cause analysis results based on the subgraph using a PageRank algorithm comprises:

obtaining a root cause analysis result based on the subgraph and initial PageRank values corresponding to all nodes in the subgraph by using a PageRank algorithm, wherein the initial PageRank value corresponding to each node is obtained according to at least one of the following factors: the distance between the node and the root node, the order in which the service corresponding to the node is called by the service corresponding to the parent node of the node, and whether the node is abnormal.

11. An apparatus for generating a causal weight graph, comprising:

a data acquisition unit configured to acquire multi-modal data for calls between services, wherein the multi-modal data comprises link data;

a service call graph construction unit configured to construct a service call graph according to the multi-modal data, wherein the service call graph is composed of nodes representing respective services and directed edges representing call relations, and each node corresponds to point information obtained from the multi-modal data;

a relationship mining unit configured to mine implicit causal relationships between nodes based on the service call graph;

an edge weight obtaining unit configured to obtain the edge weight corresponding to each directed edge according to the directed edges in the service call graph and the point information of each node; and

a causal weight graph construction unit configured to construct a causal weight graph based on the service call graph, the implicit causal relationship, and the edge weights.

12. The apparatus of claim 11, wherein the relationship mining unit comprises:

an undirected causal edge determination module configured to determine undirected causal edges among the nodes according to the point information of the nodes in the service call graph;

a direction determination module configured to determine a direction for each undirected causal edge according to the directed edges in the service call graph; and

a relationship obtaining unit configured to obtain the implicit causal relationship based on the undirected causal edges and the corresponding directions.

13. The apparatus of claim 11, wherein the causal weight graph construction unit comprises:

an edge weight updating module configured to, for each node, update the edge weights corresponding to each outgoing edge and each target implicit causal relationship, based on the edge weights corresponding to the outgoing edges among the directed edges connected to the node and the assigned edge weights corresponding to the target implicit causal relationships, the target implicit causal relationships being those implicit causal relationships that arise from calls made by the node; and

a causal weight graph construction module configured to construct the causal weight graph based on the service call graph, the implicit causal relationship, and the updated edge weights.

14. An apparatus for root cause analysis, wherein the causal weight graph used is obtained according to the method of any of claims 1-8, the apparatus comprising:

a subgraph screening unit configured to screen out a subgraph from the causal weight graph according to the data to be analyzed; and

a root cause analysis unit configured to obtain a root cause analysis result based on the subgraph by using a PageRank algorithm.

15. An electronic device, comprising: at least one processor, a memory coupled with the at least one processor, and a computer program stored on the memory, the at least one processor executing the computer program to implement the method of any of claims 1-10.