WO2021056197A1 - Root cause analysis method, apparatus, electronic device, medium, and program product - Google Patents

Root cause analysis method, apparatus, electronic device, medium, and program product

Info

Publication number
WO2021056197A1
WO2021056197A1 PCT/CN2019/107571 CN2019107571W WO2021056197A1 WO 2021056197 A1 WO2021056197 A1 WO 2021056197A1 CN 2019107571 W CN2019107571 W CN 2019107571W WO 2021056197 A1 WO2021056197 A1 WO 2021056197A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
abnormal
propagation
nodes
graph
Prior art date
Application number
PCT/CN2019/107571
Other languages
English (en)
French (fr)
Inventor
王冬
Original Assignee
西门子(中国)有限公司
Application filed by 西门子(中国)有限公司
Priority to PCT/CN2019/107571 (WO2021056197A1)
Priority to CN201980100087.0A (CN114341877A)
Publication of WO2021056197A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models

Definitions

  • The present disclosure generally relates to the field of industrial technology, and more specifically, to root cause analysis methods, devices, electronic equipment, media, and program products.
  • KPIs: key performance indicators.
  • Root cause analysis of abnormal KPI or performance degradation is a common scenario for KPI analysis. Especially for the manufacturing process, abnormal KPIs usually indicate the potential risk of economic loss in the event of failure.
  • Root cause analysis is a very challenging task for the following reasons: fault propagation and cumulative effects; complex combinations of rules; and huge data volumes and numbers of data types.
  • A cloud platform can collect data from different silos and visualize it.
  • However, the cloud itself cannot link data semantically, and a large number of extract, transform, load (ETL) tasks need to be performed before analysis. A solution is desired that can embed domain knowledge in the propagation chain and integrate data semantically.
  • ETL: extract, transform, load.
  • Root cause analysis can be divided into two main types: data-based and derivation-based. It has a wide range of applications, including but not limited to factory process control, computer system performance, program debugging, and wireless networks.
  • Reference 1 (WO2017118380A1, "Fingerprint Recognition Root Cause Analysis in Cellular System") involves learning rules from historical performance data to characterize associations between indicators, monitoring for abnormalities, and matching them against the rules.
  • Reference 2 (US20150074035A1, "Using Causal Bayesian Networks to Detect the Root Cause of Transaction Degradation") associates states with application transactions and components, uses the determined states as input to build a Bayesian network, and derives a set of root causes by traversing the Bayesian network.
  • Reference 3 (US10210189B2, "Analysis of the Root Cause of Performance Problems") calculates a database performance value based on monitored KPIs and database performance output. Upon determining that the database performance value is lower than a threshold, KPI correlation coefficients and a correlation matrix are generated and used to determine an objective function.
  • According to one aspect of the present disclosure, a root cause analysis method is provided, including: extracting a propagation graph from an annotated knowledge graph, wherein the propagation graph includes an abnormal node in which an abnormal situation has occurred and nodes that have a propagation relationship with the abnormal node; and analyzing the root cause of the abnormal situation of the abnormal node based on the attributes of the nodes in the propagation graph.
  • Optionally, before extracting the propagation graph from the annotated knowledge graph, the method further includes: constructing a knowledge graph based on an ontology model, and adding a propagation attribute to the relationship between at least one pair of nodes in the constructed knowledge graph, wherein the propagation attribute includes two sub-attributes, direction and context; the direction indicates the propagation direction between the two nodes, and the context indicates the scenario in which root cause analysis is to be performed.
  • Extracting a propagation graph from the annotated knowledge graph includes: extracting a propagation graph related to the abnormal node from the annotated knowledge graph by using a query statement.
  • the propagation graph further includes a propagation direction between nodes.
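  • Analyzing the root cause of the abnormal situation of the abnormal node based on the attributes of the nodes in the propagation graph includes: sorting all nodes in the propagation graph in descending order of node weight; and, starting from the first node in the sorted order, performing the following operations: calculating the abnormal probability of the node based on at least one factor that affects the state of the node; if the abnormal probability is greater than a predetermined threshold, determining that the node is the root cause of the abnormal situation and stopping; otherwise, performing the operations on the next node, until the operations have been performed on all nodes.
  • The weight of a node is determined based on at least one of the following: the number of paths from the node to the abnormal node, the distance from the node to the abnormal node, and whether the node is an intermediary node.
  • Calculating the abnormal probability of the node based on at least one factor that affects the state of the node includes: calculating the abnormal probability of the node based on the respective degrees of influence, on the abnormal situation, of the dominant factors and the recessive factors among the at least one factor, and on the moving average difference of each factor.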
  • According to another aspect of the present disclosure, a root cause analysis device is provided, including: a propagation graph extraction unit configured to extract a propagation graph from an annotated knowledge graph, wherein the propagation graph includes an abnormal node in which an abnormal situation has occurred and nodes that have a propagation relationship with the abnormal node; and an analysis unit configured to analyze the root cause of the abnormal situation of the abnormal node based on the attributes of the nodes in the propagation graph.
  • The device further includes: a knowledge graph annotation unit configured to construct a knowledge graph based on an ontology model and to add an annotation attribute to the relationship between at least one pair of nodes in the constructed knowledge graph, wherein the annotation attribute includes two sub-attributes, direction and context; the direction represents the propagation direction between the two nodes, and the context represents the scenario in which root cause analysis is to be performed.
  • The propagation graph extraction unit is further configured to extract a propagation graph related to the abnormal node from the annotated knowledge graph by using a query statement.
  • the propagation graph further includes a propagation direction between nodes.
  • The analysis unit is further configured to: sort all nodes in the propagation graph in descending order of node weight; and, starting from the first node in the sorted order, perform the following operations: calculate the abnormal probability of the node based on at least one factor that affects the state of the node; if the abnormal probability is greater than the predetermined threshold, consider the node to be the root cause of the abnormal situation and stop; otherwise, perform the operations on the next node, until the operations have been performed on all nodes.
  • The weight of the node is determined based on at least one of the following: the number of paths from the node to the abnormal node, the distance from the node to the abnormal node, and whether the node is an intermediary node.
  • The analysis unit is further configured to calculate the abnormal probability of the node based on the respective degrees of influence, on the abnormal situation, of the dominant factors and the recessive factors among the at least one factor that affects the state of the node, and on the moving average difference of each factor.
  • According to another aspect of the present disclosure, an electronic device is provided, including: at least one processor; and a memory coupled with the at least one processor, the memory being used to store instructions that, when executed by the at least one processor, cause the processor to execute the method as described above.
  • a non-transitory machine-readable storage medium which stores executable instructions that, when executed, cause the machine to perform the method as described above.
  • a computer program including computer-executable instructions that, when executed, cause at least one processor to perform the method as described above.
  • A computer program product is provided that is tangibly stored on a computer-readable medium and includes computer-executable instructions that, when executed, cause at least one processor to execute the method described above.
  • The knowledge of fault propagation can be embedded into the ontology, making full use of the advantages of the knowledge graph for large-scale data integration, and it takes only a small amount of work to construct an annotated propagation graph.
  • data can be semantically integrated and linked across systems, formats, and locations. This can break through the limits of data volume and type, and achieve more complex data applications and analysis tasks.
  • a unified integrated template based on the ontology model can be used to ensure the quality of data and solve the problem of data islands.
  • FIG. 1 is a flowchart showing an exemplary process of a root cause analysis method according to an embodiment of the present disclosure.
  • FIGS. 2A-2C are schematic diagrams showing different propagation relationships.
  • FIG. 3 is a schematic diagram of a specific example of a propagation graph.
  • FIG. 4 is a schematic diagram of another specific example of a propagation graph.
  • FIG. 5 is a schematic diagram of yet another specific example of a propagation graph.
  • FIG. 6 is a flowchart showing an exemplary process of the operation of block S104 in FIG. 1.
  • FIG. 7 is a block diagram showing an exemplary configuration of a root cause analysis apparatus 700 according to an embodiment of the present disclosure.
  • FIG. 8 shows a block diagram of an electronic device 800 for root cause analysis according to an embodiment of the present disclosure.
  • The term “including” and its variations are open-ended terms meaning “including but not limited to.”
  • the term “based on” means “based at least in part on.”
  • the terms “one embodiment” and “an embodiment” mean “at least one embodiment.”
  • the term “another embodiment” means “at least one other embodiment.”
  • the terms “first”, “second”, etc. may refer to different or the same objects. Other definitions can be included below, whether explicit or implicit. Unless clearly indicated in the context, the definition of a term is consistent throughout the specification.
  • the present disclosure provides a solution for embedding domain knowledge in the propagation chain and integrating data semantically.
  • the method according to the embodiment of the present disclosure can use the assistance of the knowledge base to reduce the dependence on manpower.
  • FIG. 1 is a flowchart showing an exemplary process of a root cause analysis method 100 according to an embodiment of the present disclosure.
  • In block S102, a propagation graph is extracted from an annotated knowledge graph, where the propagation graph includes an abnormal node in which an abnormal situation has occurred and nodes that have a propagation relationship with the abnormal node.
  • The propagation graph is a subgraph extracted from the annotated knowledge graph.
  • Extraction of the propagation graph is assisted by the annotations added to the knowledge graph.
  • a propagation attribute is defined, and a propagation attribute is added to the relationship between at least a pair of nodes in the knowledge graph.
  • The propagation attribute can include two sub-attributes, direction and context, where direction represents the propagation direction between two nodes and context defines the specific scenario involved in the knowledge graph.
  • Propagation refers to a relationship between two events: the second event occurs after the first event, and the first event is, with some probability, the cause of the second event.
  • the propagation relationship between two events can include the following three types.
  • the first is inclusion propagation.
  • Inclusion propagation flows from child to parent. As shown in FIG. 2A, it flows from the child object 202 to the parent object 201; that is, a failure of the child causes a failure of the parent.
  • For example, a bad tool may have a 90% probability of making a CNC (computerized numerical control) machine tool unusable.
  • the inclusion propagation model may be suitable for most hierarchical system structures, such as the ISA-95 enterprise model.
  • The second type is upstream and downstream propagation, which flows from upstream to downstream. As shown in FIG. 2B, it flows from object 1 203 to object 2 204; that is, an upstream failure causes a downstream failure. For example, a longer processing time on machine 1 causes a longer waiting time on machine 2, or a supplier delay causes a delay in the product plan.
  • This upstream and downstream propagation mode can be used to describe the relationship between independent events.
  • The third type is the equivalence relationship.
  • As shown in FIG. 2C, object 3 205 and object 4 206 are equivalent.
  • Equivalence means that no propagation takes place (or is needed): if object 3 has a fault, object 4 must also have a fault.
  • This model can be used to describe bound events. For example, product quality and quality testing are in an equivalence relationship (leaving aside errors in the test itself).
  • The knowledge graph constructed from the ontology model is a set of triples describing objects and the relationships between them, and these relationships are directional. However, the propagation direction may differ from the general relationship direction defined in the ontology. In other words, propagation is a specific type of relationship that is determined by objective facts rather than by subjective description. For example, the relationship from A to B, "A has part B", is equivalent to the relationship from B to A, "B is a part of A". But for both "has part" and "is a part of", the propagation direction is always from B to A (the inclusion propagation mode).
  • the added propagation attribute includes the sub-attribute of direction to indicate whether the propagation direction of the relationship between the two nodes is consistent with the direction defined in the ontology.
  • the attribute "propagation” has two sub-attributes: “context” and “direction”. Taking “hasPart (with parts)” as an example, this annotation means that in the context of tracking the abnormality of "CycleTime", the propagation direction is opposite to the direction of the attribute "hasPart”.
  • The added propagation attribute also includes the context sub-attribute.
  • The context can define the specific scenario involved in the knowledge graph.
  • The context can also define the boundaries of the propagation graph.
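  • As a rough illustration of the propagation annotation described above, the following Python sketch models an annotated relation and resolves its propagation direction for a given context. The class `Relation`, the function `propagation_edge`, and the direction values "same" and "inverse" are hypothetical names chosen for this sketch; the disclosure does not specify a concrete encoding.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Relation:
    """One triple of the knowledge graph plus an optional propagation annotation."""
    subject: str
    predicate: str
    obj: str
    context: Optional[str] = None    # scenario in which the annotation applies
    direction: Optional[str] = None  # "same" or "inverse" relative to the triple


def propagation_edge(rel: Relation, context: str) -> Optional[Tuple[str, str]]:
    """Return (source, target) of fault propagation, or None if the
    relation carries no propagation annotation for this context."""
    if rel.direction is None or rel.context != context:
        return None
    if rel.direction == "same":
        return (rel.subject, rel.obj)
    return (rel.obj, rel.subject)


# "MachineTool hasPart Tool": in the CycleTime context a fault propagates
# against the hasPart direction, from the part to the whole.
has_part = Relation("MachineTool", "hasPart", "Tool",
                    context="CycleTime", direction="inverse")
print(propagation_edge(has_part, "CycleTime"))
```

  • Keeping the context on the relation itself means the same triple can propagate in one scenario and be ignored in another, which is how the context bounds the propagation graph.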
  • A query statement (for example, one written in SPARQL) can be used to extract the propagation graph from the knowledge graph.
  • Those skilled in the art will understand how to extract the propagation graph related to the abnormal node from the knowledge graph by writing a query statement in a query language, so this is not described in detail here.
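  • The disclosure performs this extraction with a query statement. As a stand-in that avoids any particular triple store, the following Python sketch does the equivalent selection with a plain graph traversal: starting from the abnormal node, it walks propagation edges backwards and keeps only the edges that can lead to that node. The function name `extract_propagation_graph` and the node names are assumptions made for illustration.

```python
from collections import defaultdict, deque


def extract_propagation_graph(edges, abnormal):
    """edges: iterable of (source, target) propagation edges.
    Returns the edges whose endpoints can both reach `abnormal`."""
    edges = list(edges)
    incoming = defaultdict(list)
    for src, dst in edges:
        incoming[dst].append(src)

    # Breadth-first search against the propagation direction.
    reached = {abnormal}
    queue = deque([abnormal])
    while queue:
        node = queue.popleft()
        for src in incoming[node]:
            if src not in reached:
                reached.add(src)
                queue.append(src)

    return [(s, d) for s, d in edges if s in reached and d in reached]


edges = [
    ("Tool", "Machine1"), ("Machine1", "Unit1"), ("Machine2", "Unit1"),
    ("Unit1", "CycleTime"), ("Sensor1", "Display"),
]
for edge in extract_propagation_graph(edges, "CycleTime"):
    print(edge)
```

  • Nodes with no path to the abnormal node, such as the hypothetical Sensor1 here, drop out of the result and are never considered as candidate causes.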
  • FIG. 3 shows a schematic diagram of a specific example 300 of the propagation graph of a production line.
  • FIG. 3 includes production line PL, unit 1 U1 to unit n Un, machine tool 1 to machine tool k, and machine tool M.
  • The propagation graph shown in FIG. 3 is modeled as a tree with vertical layers and horizontal connections, which is a combination of inclusion relationships and upstream and downstream relationships.
  • FIG. 4 is a schematic diagram of a propagation graph 400 of production management.
  • The operation in block S101 may be performed first: constructing a knowledge graph based on the ontology model, and adding a propagation attribute to the relationship between at least one pair of nodes in the constructed knowledge graph.
  • The propagation attribute may include two sub-attributes, direction and context, where the direction represents the direction of propagation between two nodes, and the context represents the scenario in which root cause analysis is to be performed.
  • A knowledge graph can be constructed in advance, a propagation attribute can be added to the relationship between at least one pair of nodes in it, and the annotated knowledge graph can then be stored in a storage medium for querying, so the operation in block S101 does not need to be executed every time.
  • the operation in block S104 may be performed: based on the attributes of the nodes in the propagation graph, the root cause of the abnormality of the abnormal node is analyzed.
  • The key to the analysis is evaluating how likely a candidate cause is to be the root cause.
  • two indicators are proposed to evaluate the node: the weight of the node and the abnormal probability of the node.
  • the weight of a node is used to indicate the importance of a node in the entire propagation graph.
  • the importance of a node can be determined based on, for example, the number of paths from the node to the starting node where the abnormal situation occurs, the distance from the node to the starting node, and whether the node is an intermediate node or not.
  • the following formula (1) can be used to calculate the weight of a node.
  • path_num represents the number of paths from node n to m; dist(n, m) represents the shortest distance from node n to m.
  • the node m represents the starting node (also referred to as the abnormal node) where the abnormal situation begins to occur.
  • For example, in FIG. 4, the abnormal node is the "cycle time CT" represented by the dashed box.
  • Node n represents any node along the propagation stream.
  • The number of paths is the number of paths between the node and the abnormal node; the shortest distance is the number of nodes passed through on the way from the node to the abnormal node.
  • The calculation of the node weight is not limited to formula (1) above. Any function of the number of paths from the node to the abnormal node, the distance from the node to the abnormal node, and whether the node is an intermediate node can be used. The factors considered are also not limited to these; any other factor related to the importance of the node may be included, for example the conditional probability of the node mentioned below. This will not be detailed here.
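  • Formula (1) itself is not reproduced in this text, so the following Python sketch only illustrates the two quantities it is built from, path_num and dist(n, m). The way they are combined here (path_num divided by dist) is an assumption made for this sketch, as are the function name `node_weights`, the choice to count distance in edges, and the requirement that the propagation graph be acyclic.

```python
from collections import deque
from functools import lru_cache


def node_weights(edges, abnormal):
    """Return {node: weight} for every node that can reach `abnormal`.
    Assumes an acyclic graph; weight = path_num / dist is illustrative only."""
    succ = {}
    for src, dst in edges:
        succ.setdefault(src, []).append(dst)
        succ.setdefault(dst, [])

    @lru_cache(maxsize=None)
    def path_num(node):
        # Number of distinct paths from `node` to the abnormal node.
        if node == abnormal:
            return 1
        return sum(path_num(child) for child in succ[node])

    def dist(node):
        # Shortest distance (in edges) from `node` to the abnormal node.
        seen, queue = {node}, deque([(node, 0)])
        while queue:
            current, steps = queue.popleft()
            if current == abnormal:
                return steps
            for child in succ[current]:
                if child not in seen:
                    seen.add(child)
                    queue.append((child, steps + 1))
        return None  # no path: the node is outside the scope of the analysis

    weights = {}
    for node in succ:
        if node == abnormal:
            continue
        d = dist(node)
        if d is not None:
            weights[node] = path_num(node) / d
    return weights


edges = [("Tool", "M1"), ("Tool", "M2"), ("M1", "U1"), ("M2", "U1"),
         ("U1", "CT"), ("S1", "Display")]
print(node_weights(edges, "CT"))
```

  • Under this assumed combination, a node close to the abnormal node or connected to it by many paths is ranked higher, and a node with no path to it receives no weight at all.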
  • FIG. 5 shows a propagation graph related to the node "Cycle Time CT", where the node "Cycle Time CT" in the dashed frame represents the abnormal node in which the abnormal situation has occurred.
  • Table 1 below shows the weight of each node in Figure 5.
  • Production Line A LA is defined as equivalent to “Cycle Time CT”, so they are both starting points.
  • Sensor 1 S1 has no path to "Production Line A LA" or "Cycle Time CT", so it cannot be a cause of the abnormal situation; accordingly, Sensor 1 S1 has no weight in Table 1.
  • The abnormal probability of a node indicates how likely the node is to be in an abnormal state.
  • The factors that determine how likely a node is to be in an abnormal state can be divided into dominant factors and hidden (recessive) factors.
  • Dominant factors are those clearly specified in the attributes of the node, such as the inputs of a KPI formula; the remaining factors are called hidden factors, such as the temperature and humidity of the environment.
  • a moving average difference can be calculated for each factor.
  • The moving average difference measures the deviation between a factor's short-term behavior and its long-term behavior. For example, it can be the degree to which a factor's value has changed relative to its normal state; the length of the moving window can be chosen to suit the situation.
  • the abnormal probability of the node can be calculated by the following formula (2), for example.
  • Here, p_ab is the abnormal probability of the node, w_i is the weight of factor i, and diff(f_i) is the moving average difference of factor f_i.
  • the weight of a factor represents the contribution of the factor to the estimation of the possibility of an abnormal state. For example, dominant factors are generally more important than recessive factors. If there is no prior knowledge about this factor, a default value can be set.
  • the weighted sum of the moving average differences of all factors of a node can be used to calculate the abnormal probability of the node. If the calculated abnormal probability is greater than the predetermined threshold, the node can be considered abnormal, that is, the node may be the root cause node of the abnormal situation.
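  • The weighted sum described above can be sketched in Python as follows. Formula (2) is not reproduced in this text, so the details are assumptions made for illustration: the moving average difference is taken as the relative gap between a short-window mean and a long-window mean, clipped to [0, 1], and the window lengths, function names, and sample values are all hypothetical.

```python
from statistics import fmean


def moving_average_diff(series, short=3, long=10):
    """Relative deviation of short-term from long-term behavior, in [0, 1]."""
    short_avg = fmean(series[-short:])
    long_avg = fmean(series[-long:])
    if long_avg == 0:
        return 0.0 if short_avg == 0 else 1.0
    return min(abs(short_avg - long_avg) / abs(long_avg), 1.0)


def abnormal_probability(factors):
    """factors: list of (weight, series) pairs, one per factor of the node.
    Returns the weighted sum of the moving average differences."""
    return sum(weight * moving_average_diff(series) for weight, series in factors)


# A dominant factor that has recently drifted, and a stable hidden factor.
processing_time = [10] * 7 + [20] * 3
ambient_temperature = [22] * 10
p_ab = abnormal_probability([(0.7, processing_time), (0.3, ambient_temperature)])
print(round(p_ab, 3))
```

  • Giving the dominant factor the larger weight reflects the statement that dominant factors are generally more important than hidden ones; with weights that sum to 1 and clipped differences, the result stays between 0 and 1 and can be compared directly with a threshold.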
  • the method of calculating the abnormal probability of a node is not limited to the above formula (2).
  • The following method can also be used: data on the various factors of each node is used as sample data for machine learning, and the model obtained by machine learning is then used to calculate the abnormal probability of a node. Therefore, the method according to the embodiment of the present disclosure places no limitation on the specific manner of calculating the abnormal probability of a node from the factors that affect its state.
  • the annotation added to the node may also include probability information of the node, and this probability information may indicate the uncertainty of the causal relationship between the node and its parent node.
  • The propagation graph constructed in this way is similar to a Bayesian network.
  • Each node in the propagation graph has a probability information table, which includes the conditional probability between the node and its immediate parent node. It can be understood that in different contexts, a node may have different probability information tables.
  • the probability information of the node can be used as a factor when calculating the weight of the node. If the probability information of the node is not included in the annotation, it is equivalent to a probability of 1 or 0, that is, there is a propagation relationship or no propagation relationship between the two nodes.
  • the weight of the node and the abnormal probability of the node can be determined for each node, and then the root cause analysis can be started.
  • The node weight is mainly used to determine which node to start from and in what order to analyze the nodes, which improves the speed and efficiency of the analysis; the abnormal probability of a node is used to determine whether that node is the root cause of the abnormal situation.
  • FIG. 6 is a flowchart showing an exemplary process of the operation in block S104.
  • In decision block S1046, it is determined whether the calculated abnormal probability is greater than the predetermined threshold. If it is (indicated as Y in the figure), the process proceeds to block S1048; otherwise (indicated as N in the figure), the operation in block S1044 is performed on the next node.
  • the predetermined threshold here can be preset by those skilled in the art based on experience.
  • the root cause of the abnormal situation can be determined.
  • the analysis method according to the embodiment of the present disclosure may not find the root cause of the abnormal situation in the propagation graph. In this case, the operation ends after the operation is performed on all nodes.
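  • The search procedure described above can be summarized in a short Python sketch: nodes are visited in descending order of weight, and the first one whose abnormal probability exceeds the threshold is reported. The function name `find_root_cause` and the sample numbers are assumptions made for illustration; the weights and probabilities would come from whatever calculations are actually used.

```python
def find_root_cause(weights, abnormal_probability, threshold):
    """weights: {node: weight}; abnormal_probability: callable node -> probability.
    Returns the first node, in descending weight order, whose abnormal
    probability exceeds `threshold`, or None if no node qualifies."""
    for node in sorted(weights, key=weights.get, reverse=True):
        if abnormal_probability(node) > threshold:
            return node  # root cause found: stop the search
    return None          # all nodes checked without finding a root cause


weights = {"U1": 1.0, "Tool": 0.67, "M1": 0.5}
probabilities = {"U1": 0.2, "Tool": 0.9, "M1": 0.95}
print(find_root_cause(weights, probabilities.get, threshold=0.8))
```

  • Note that M1 has the highest abnormal probability in this made-up data, but Tool is reported because it is reached first in weight order; this is the sense in which the weight controls the order of analysis while the probability decides the outcome.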
  • Annotation includes two aspects: direction and context.
  • the domain knowledge should first be decomposed into the smallest elements, which means a very specific and definite description of propagation.
  • Context can limit the boundaries of the propagation graph to reduce complexity.
  • Quantitative indicators make a rule easier to evaluate. For example, the distance from device A to device B can be used to evaluate the level of impact of device A on device B.
  • data can be semantically integrated and linked across systems, formats, and locations. This can break through the limits of data volume and type, and achieve more complex data applications and analysis tasks.
  • the knowledge of fault propagation can be embedded into the ontology, making full use of the advantages of the knowledge graph for large-scale data integration, and only a small amount of work is required to construct an annotated propagation graph.
  • the ontology model is a unified template for integration, which can ensure the quality of the data.
  • the construction of the propagation graph is transformed into a query task.
  • the query result can be dynamically adjusted according to the input of the context included in the annotation.
  • Digitization is based on accessible data and system knowledge. Thanks to the Internet of Things technology, data acquisition has broken through the bottleneck, allowing more events to be interconnected.
  • the purpose of digitization is not only to "digitize" itself, but to use data and knowledge to obtain added value.
  • the method according to the present invention provides a good technical solution for obtaining business insights from data.
  • the present invention is not limited to the KPI tracking problem, but proposes a method for evaluating the influence of individuals on the entire network. In the context of the manufacturing industry, it can be: the importance of equipment/materials/people; quality tracking; order delivery analysis, etc. Most of the above problems are very challenging due to complexity and reliance on analytical skills.
  • Reasoning assisted by the graph can save manpower and help people use the graph to make more accurate judgments.
  • FIG. 7 is a block diagram showing an exemplary configuration of a root cause analysis apparatus 700 according to an embodiment of the present disclosure.
  • The root cause analysis device 700 includes a propagation graph extraction unit 702 and an analysis unit 704.
  • the propagation graph extraction unit 702 is configured to extract a propagation graph from the annotated knowledge graph, where the propagation graph includes abnormal nodes that have abnormal conditions and nodes that have a propagation relationship with the abnormal nodes.
  • the analysis unit 704 is configured to analyze the root cause of the abnormal situation of the abnormal node based on the attribute of the node in the propagation graph.
  • The root cause analysis device 700 may further include a knowledge graph annotation unit 701, configured to construct a knowledge graph based on the ontology model and to add an annotation attribute to the relationship between at least one pair of nodes in the constructed knowledge graph. The annotation attribute includes two sub-attributes, direction and context: the direction represents the propagation direction between two nodes, and the context defines the specific scenario involved in the knowledge graph.
  • The propagation graph extraction unit 702 is further configured to extract a propagation graph related to the abnormal node from the annotated knowledge graph by using a query statement.
  • the propagation graph also includes the propagation direction between nodes.
  • The analysis unit 704 is further configured to: sort all nodes in the propagation graph in descending order of node weight; and, starting from the first node in the sorted order, perform the following operations: calculate the abnormal probability of the node based on at least one factor that affects the state of the node; if the abnormal probability is greater than the predetermined threshold, consider the node to be the root cause of the abnormal situation and stop; otherwise, perform the operations on the next node, until the operations have been performed on all nodes.
  • The weight of the node is determined based on at least one of the following: the number of paths from the node to the abnormal node, the distance from the node to the abnormal node, and whether the node is an intermediate node.
  • The analysis unit 704 is further configured to calculate the abnormal probability of the node based on the respective degrees of influence, on the abnormal situation, of the dominant factors and the recessive factors among the at least one factor that affects the state of the node, and on the moving average difference of each factor.
  • The operation and function of each part of the root cause analysis device 700 may be the same as or similar to the relevant parts of the embodiment of the root cause analysis method 100 of the present disclosure described with reference to FIGS. 1-6, and will not be described in detail here.
  • The structure of the root cause analysis device 700 and its constituent units shown in FIG. 7 is only exemplary, and those skilled in the art can modify the structural block diagram shown in FIG. 7 as needed.
  • the above-mentioned root cause analysis device can be implemented by hardware, or by software or a combination of hardware and software.
  • FIG. 8 shows a block diagram of an electronic device 800 for performing root cause analysis according to an embodiment of the present disclosure.
  • The electronic device 800 may include at least one processor 802, which executes at least one computer-readable instruction (that is, the above-mentioned elements implemented in the form of software) stored or encoded in a computer-readable storage medium (that is, the memory 804).
  • Computer-executable instructions are stored in the memory 804 which, when executed, cause the at least one processor 802 to complete the following actions: extracting a propagation graph from an annotated knowledge graph, where the propagation graph includes an abnormal node in which an abnormal situation has occurred and nodes that have a propagation relationship with the abnormal node; and analyzing the root cause of the abnormal situation of the abnormal node based on the attributes of the nodes in the propagation graph.
  • A non-transitory machine-readable medium may have machine-executable instructions (that is, the above-mentioned elements implemented in the form of software) which, when executed by a machine, cause the machine to perform the operations and functions described in conjunction with FIGS. 1-7 in the various embodiments of the present disclosure.
  • A computer program is provided, including computer-executable instructions which, when executed, cause at least one processor to perform the operations and functions described in conjunction with FIGS. 1-7 in the various embodiments of the present disclosure.
  • A computer program product is provided, including computer-executable instructions which, when executed, cause at least one processor to perform the operations and functions described in conjunction with FIGS. 1-7 in the various embodiments of the present disclosure.

Abstract

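A root cause analysis method, apparatus, electronic device, medium, and program product. The root cause analysis method includes: extracting a propagation graph from an annotated knowledge graph, wherein the propagation graph includes an abnormal node in which an abnormal situation has occurred and nodes that have a propagation relationship with the abnormal node (S102); and analyzing the root cause of the abnormal situation of the abnormal node based on the attributes of the nodes in the propagation graph (S104).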

Description

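Root cause analysis method, apparatus, electronic device, medium, and program product
Technical Field
The present disclosure relates generally to the field of industrial technology and, more specifically, to a root cause analysis method, apparatus, electronic device, medium, and program product.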
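Background
In the manufacturing industry, key performance indicators (KPIs) can be used as an important tool for evaluating and controlling the production process in many related areas, such as machinery and equipment, production scheduling and execution, products, and inventory.
However, without effective analysis of these areas, KPIs provide only very limited insight. The value of an analysis depends heavily on the quality of the data, the data/scenario context, the tools used, and the analysis skills.
Root cause analysis of abnormal KPIs or performance degradation is a common KPI analysis scenario. For manufacturing processes in particular, an abnormal KPI usually signals a potential risk of economic loss in the event of failure. Root cause analysis is a very challenging task for the following reasons:
1. Fault propagation and cumulative effects
2. Complex combinations of rules
3. Huge data volumes and numbers of data types
Domain knowledge is the basis for such analysis. In practice, however, the bottleneck lies in relying on individual human ability for knowledge fusion, processing, and reasoning. For example, it is easy to judge whether a single OR/AND gate is working properly, but if two gates are connected in series or in parallel, more work is needed to find the fault. One can imagine that when hundreds of such "simple" elements are combined, the task becomes almost impossible for a human.
In particular, a cloud platform can collect data from different information silos and visualize it. However, the cloud itself cannot link data semantically, and a large number of extract, transform, load (ETL) tasks must be performed before analysis. A solution is desired that can embed domain knowledge in the propagation chain and integrate data semantically.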
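Root cause analysis can be divided into two main types: data-based and derivation-based. It has a wide range of applications, including but not limited to factory process control, computer system performance, program debugging, and wireless networks.
Reference 1 (WO2017118380A1, "Fingerprint Recognition Root Cause Analysis in Cellular System") involves learning rules from historical performance data to characterize associations between indicators, monitoring for abnormalities, and matching them against the rules.
Reference 2 (US20150074035A1, "Using Causal Bayesian Networks to Detect the Root Cause of Transaction Degradation") associates states with application transactions and components, uses the determined states as input to build a Bayesian network, and derives a set of root causes by traversing the Bayesian network.
Reference 3 (US10210189B2, "Analysis of the Root Cause of Performance Problems") calculates a database performance value based on monitored KPIs and database performance output. Upon determining that the database performance value is lower than a threshold, KPI correlation coefficients and a correlation matrix are generated and used to determine an objective function.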
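Summary
A brief summary of the present invention is given below in order to provide a basic understanding of certain aspects of the invention. It should be understood that this summary is not an exhaustive overview of the invention. It is not intended to identify key or important parts of the invention, nor to limit its scope. Its purpose is merely to present some concepts in simplified form as a prelude to the more detailed description that follows.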
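According to one aspect of the present disclosure, a root cause analysis method is provided, including: extracting a propagation graph from an annotated knowledge graph, wherein the propagation graph includes an abnormal node in which an abnormal situation has occurred and nodes that have a propagation relationship with the abnormal node; and analyzing the root cause of the abnormal situation of the abnormal node based on the attributes of the nodes in the propagation graph.
Optionally, in one example of the above aspect, before extracting the propagation graph from the annotated knowledge graph, the method further includes: constructing a knowledge graph based on an ontology model, and adding a propagation attribute to the relationship between at least one pair of nodes in the constructed knowledge graph, wherein the propagation attribute includes two sub-attributes, direction and context; the direction indicates the propagation direction between the two nodes, and the context indicates the scenario in which root cause analysis is to be performed.
Optionally, in one example of the above aspect, extracting the propagation graph from the annotated knowledge graph includes: extracting a propagation graph related to the abnormal node from the annotated knowledge graph by using a query statement.
Optionally, in one example of the above aspect, the propagation graph further includes the propagation direction between nodes.
Optionally, in one example of the above aspect, analyzing the root cause of the abnormal situation of the abnormal node based on the attributes of the nodes in the propagation graph includes:
sorting all nodes in the propagation graph in descending order of node weight;
starting from the first node in the sorted order, performing the following operations:
calculating the abnormal probability of the node based on at least one factor that affects the state of the node;
if the abnormal probability is greater than a predetermined threshold, determining that the node is the root cause of the abnormal situation and stopping;
otherwise, performing the operations on the next node, until the operations have been performed on all nodes.
Optionally, in one example of the above aspect, the weight of the node is determined based on at least one of the following: the number of paths from the node to the abnormal node, the distance from the node to the abnormal node, and whether the node is an intermediary node.
Optionally, in one example of the above aspect, calculating the abnormal probability of the node based on at least one factor that affects the state of the node includes: calculating the abnormal probability of the node based on the respective degrees of influence, on the abnormal situation, of the dominant factors and the recessive factors among the at least one factor, and on the moving average difference of each factor.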
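According to another aspect of the present disclosure, a root cause analysis apparatus is provided, comprising: a propagation graph extraction unit configured to extract a propagation graph from an annotated knowledge graph, wherein the propagation graph comprises an abnormal node in which an abnormal condition occurs and nodes that have a propagation relationship with the abnormal node; and an analysis unit configured to analyze, based on attributes of the nodes in the propagation graph, the root cause of the abnormal condition at the abnormal node.
Optionally, in one example of the above aspect, the apparatus further comprises: a knowledge graph annotation unit configured to construct a knowledge graph based on an ontology model and to add an annotation attribute to the relationship between at least one pair of nodes in the constructed knowledge graph, wherein the annotation attribute comprises two sub-attributes, direction and context, the direction indicating the propagation direction between the two nodes and the context indicating the scenario in which root cause analysis is to be performed.
Optionally, in one example of the above aspect, the propagation graph extraction unit is further configured to: extract the propagation graph related to the abnormal node from the annotated knowledge graph by means of a query statement.
Optionally, in one example of the above aspect, the propagation graph further comprises the propagation directions between nodes.
Optionally, in one example of the above aspect, the analysis unit is further configured to:
sort all nodes in the propagation graph in descending order of node weight;
perform the following operations, starting from the first node in the sorted order:
calculate an anomaly probability of the node based on at least one factor affecting the state of the node;
if the anomaly probability is greater than a predetermined threshold, regard the node as the root cause of the abnormal condition, and stop;
otherwise, perform the operations on the next node, until the operations have been performed on all nodes.
Optionally, in one example of the above aspect, the weight of a node is determined based on at least one of the following: the number of paths from the node to the abnormal node, the distance from the node to the abnormal node, and whether the node is an intermediary node.
Optionally, in one example of the above aspect, the analysis unit is further configured to: calculate the anomaly probability of the node based on the respective degrees to which the explicit factors and the implicit factors among the at least one factor affecting the node's state influence the abnormal condition, and on the moving average difference of each factor.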
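According to another aspect of the present disclosure, an electronic device is provided, comprising: at least one processor; and a memory coupled to the at least one processor, the memory being used to store instructions which, when executed by the at least one processor, cause the processor to perform the method described above.
According to another aspect of the present disclosure, a non-transitory machine-readable storage medium is provided, which stores executable instructions that, when executed, cause the machine to perform the method described above.
According to another aspect of the present disclosure, a computer program is provided, comprising computer-executable instructions which, when executed, cause at least one processor to perform the method described above.
According to another aspect of the present disclosure, a computer program product is provided, the computer program product being tangibly stored on a computer-readable medium and comprising computer-executable instructions which, when executed, cause at least one processor to perform the method described above.
According to the method and apparatus of the present disclosure, knowledge of fault propagation can be embedded in the ontology, taking full advantage of the strengths of knowledge graphs for large-scale data integration, and only a small amount of effort is needed to construct an annotated propagation graph.
According to the method and apparatus of the present disclosure, data can be semantically integrated and linked across systems, formats, and locations through a unified architecture. This makes it possible to overcome limits on data volume and data type and to carry out more complex data applications and analysis tasks.
According to the method and apparatus of the present disclosure, a unified integration template based on the ontology model ensures data quality and solves the data silo problem.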
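Brief Description of the Drawings
The above and other objects, features, and advantages of the invention will be more readily understood from the following description of embodiments of the invention taken in conjunction with the accompanying drawings. The components in the drawings serve only to illustrate the principles of the invention. In the drawings, identical or similar technical features or components are denoted by identical or similar reference signs.
FIG. 1 is a flowchart showing an exemplary process of a root cause analysis method according to an embodiment of the present disclosure;
FIGS. 2A-2C are schematic diagrams each showing a different propagation relationship;
FIG. 3 is a schematic diagram of a specific example of a propagation graph;
FIG. 4 is a schematic diagram of another specific example of a propagation graph;
FIG. 5 is a schematic diagram of yet another specific example of a propagation graph;
FIG. 6 is a flowchart showing an exemplary process of the operation in block S104 of FIG. 1;
FIG. 7 is a block diagram showing an exemplary configuration of a root cause analysis apparatus 700 according to an embodiment of the present disclosure; and
FIG. 8 is a block diagram of an electronic device 800 for root cause analysis according to an embodiment of the present disclosure.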
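Reference signs
100: root cause analysis method           S101, S102, S104, S1042, S1044, S1046, S1048, S1049: steps
201: parent object                        202: child object
203: object 1                             204: object 2
205: object 3                             206: object 4
300, 400, 500: propagation graphs         PL: production line
U1…Un: unit 1…unit n                      M1…Mk: machine tool 1…machine tool k
M: machine tool                           LA: production line A
U1: unit 1                                M1: machine tool 1
S1: sensor 1                              U2: unit 2
M2: machine tool 2                        S2: sensor 2
CT: cycle time                            S3: sensor 3
W: worker                                 WO: work order
WP: work plan                             P: product
M: material                               PO: purchase order
SL: supplier                              WH: warehouse
U3: unit 3                                U4: unit 4
LB: production line B                     700: root cause analysis apparatus
702: propagation graph extraction unit    704: analysis unit
701: knowledge graph annotation unit      800: electronic device
802: processor                            804: memory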
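Detailed Description
The subject matter described herein will now be discussed with reference to example embodiments. It should be understood that these embodiments are discussed only to enable those skilled in the art to better understand and thereby implement the subject matter described herein, and not to limit the scope of protection, applicability, or examples set forth in the claims. Changes may be made to the function and arrangement of the elements discussed without departing from the scope of protection of the present disclosure. Various examples may omit, substitute, or add various procedures or components as needed. For example, the described methods may be performed in an order different from that described, and steps may be added, omitted, or combined. In addition, features described with respect to some examples may also be combined in other examples.
As used herein, the term "comprising" and its variants are open-ended terms meaning "including but not limited to". The term "based on" means "based at least in part on". The terms "one embodiment" and "an embodiment" mean "at least one embodiment". The term "another embodiment" means "at least one other embodiment". The terms "first", "second", and so on may refer to different or identical objects. Other definitions, whether explicit or implicit, may be included below. Unless the context clearly indicates otherwise, the definition of a term is consistent throughout the specification.
The present disclosure provides a solution that embeds domain knowledge in the propagation chain and integrates data semantically. The method according to embodiments of the present disclosure can use the support of a knowledge base to reduce dependence on human effort.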
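The root cause analysis method according to embodiments of the present disclosure will now be described with reference to the drawings.
FIG. 1 is a flowchart showing an exemplary process of a root cause analysis method 100 according to an embodiment of the present disclosure.
First, in block S102, a propagation graph is extracted from an annotated knowledge graph, wherein the propagation graph comprises an abnormal node in which an abnormal condition occurs and nodes that have a propagation relationship with the abnormal node.
The present disclosure introduces the concept of a propagation graph. A propagation graph is a subgraph extracted from an annotated knowledge graph.
In the method according to embodiments of the present disclosure, annotations added to the knowledge graph assist in generating the propagation graph. Specifically, a propagation attribute is defined and added to the relationship between at least one pair of nodes in the knowledge graph. The propagation attribute may comprise two sub-attributes, direction and context, where the direction indicates the propagation direction between the two nodes and the context delimits the specific scenario to which the knowledge graph relates.
Propagation denotes a relationship between two events: the second event occurs after the first, and the first event is, with some probability, the cause of the second.
In general, the propagation relationship between two events can take one of the following three forms.
The first is inclusion propagation, a flow from child to parent. As shown in FIG. 2A, the flow runs from child object 202 to parent object 201; that is, a fault in the child causes a fault in the parent. For example, a bad cutting tool may, with 90% probability, make a CNC (Computerised Numerical Control) machine tool unavailable. In general, the inclusion propagation pattern is likely to apply to most hierarchical system structures, such as the ISA-95 enterprise model.
The second is upstream-downstream propagation, a flow from upstream to downstream. As shown in FIG. 2B, the flow runs from object 1 (203) to object 2 (204); that is, an upstream fault causes a downstream fault. For example, a longer processing time on machine 1 causes a longer waiting time on machine 2, or a delay by a supplier causes a delay in the product plan. This pattern can be used to describe relationships between independent events.
The third is the equivalence relationship. As shown in FIG. 2C, object 3 (205) and object 4 (206) are equivalent. Equivalence means that no propagation takes place or is needed: if object 3 is faulty, object 4 is necessarily faulty as well. This pattern can be used to describe bound events; for example, product quality and quality testing are equivalent (leaving aside errors in the test itself).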
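The three propagation patterns can be sketched as a small mapping from a relation between two objects to directed (cause, effect) edges. This is only an illustrative sketch: the names `Pattern` and `propagation_edges` are hypothetical, and modeling equivalence as a pair of edges in both directions is an assumption rather than something the text prescribes.

```python
from enum import Enum


class Pattern(Enum):
    INCLUSION = "inclusion"                      # child fault causes parent fault
    UPSTREAM_DOWNSTREAM = "upstream_downstream"  # upstream fault causes downstream fault
    EQUIVALENCE = "equivalence"                  # bound events, faulty together


def propagation_edges(pattern, a, b):
    """Return directed (cause, effect) edges for a relation between a and b.

    INCLUSION: a is the parent, b is the child.
    UPSTREAM_DOWNSTREAM: a is upstream of b.
    EQUIVALENCE: a and b are bound (assumption: modeled as edges both ways).
    """
    if pattern is Pattern.INCLUSION:
        return [(b, a)]
    if pattern is Pattern.UPSTREAM_DOWNSTREAM:
        return [(a, b)]
    if pattern is Pattern.EQUIVALENCE:
        return [(a, b), (b, a)]
    raise ValueError(f"unknown pattern: {pattern!r}")
```

For instance, `propagation_edges(Pattern.INCLUSION, "CNC machine", "cutting tool")` yields an edge from the cutting tool to the machine, matching the child-to-parent flow of FIG. 2A.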
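A knowledge graph constructed on the basis of an ontology model is a set of triples comprising objects and the relationships between them, and these relationships are also directional. However, the propagation direction may differ from the direction of the general relationship defined in the ontology. In other words, propagation is a specific type of relationship that is determined by objective fact rather than by subjective description. For example, the relationship "A has part B" from A to B is equivalent to the relationship "B is part of A" from B to A. Yet for both "has part" and "is part of", the propagation direction is always from B to A (a propagation relationship of the inclusion pattern).
Therefore, in embodiments according to the present disclosure, the added propagation attribute includes the sub-attribute direction, which indicates whether the propagation direction of the relationship between two nodes is the same as the direction defined in the ontology.
The following code shows an example of an added annotation.
Figure PCTCN2019107571-appb-000001
As this code shows, the attribute "propagation" has two sub-attributes: "context" and "direction". Taking "hasPart" as an example, the annotation means that, in the context of tracing an abnormal condition of "CycleTime", the propagation direction is opposite to the direction of the attribute "hasPart".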
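Since the annotation listing itself is only available as an image, the following is a minimal sketch of how such an annotation might be represented and resolved. The dictionary layout, the direction values `"same"` and `"inverse"`, the `isPartOf` entry, and the function name `propagation_edge` are all assumptions made for illustration; only the names `propagation`, `context`, `direction`, `hasPart`, and `CycleTime` come from the text.

```python
# Hypothetical annotation table: ontology property -> propagation annotations.
ANNOTATIONS = {
    "hasPart": [{"context": "CycleTime", "direction": "inverse"}],
    "isPartOf": [{"context": "CycleTime", "direction": "same"}],
}


def propagation_edge(subject, prop, obj, context):
    """Return the (cause, effect) pair for a triple in the given context.

    Returns None when the property carries no propagation annotation for
    that context, i.e. the triple does not belong to the propagation graph.
    """
    for annotation in ANNOTATIONS.get(prop, []):
        if annotation["context"] != context:
            continue
        if annotation["direction"] == "same":
            return (subject, obj)   # propagation follows the ontology direction
        return (obj, subject)       # propagation runs against it
    return None
```

Under these assumptions, both "A hasPart B" and "B isPartOf A" resolve to the same edge from B to A, which is the behavior the text describes for the inclusion pattern.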
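In addition, every propagation relationship should be defined for a dedicated context or scenario, because the relationship between two events may differ from one context to another. For example, when test errors are taken into account, the relationship between product quality and quality testing described above should be an upstream-downstream relationship, because even a perfect product may have a failed test result.
Therefore, in embodiments according to the present disclosure, the added propagation attribute also includes a context, which can delimit the specific scenario to which the knowledge graph relates. The context can also delimit the boundary of the propagation graph.
In summary, adding a propagation attribute to the relationships between nodes in a knowledge graph constructed on the basis of an ontology model yields an annotated knowledge graph, from which the propagation graph can then be extracted.
In one example, the propagation graph can be extracted from the knowledge graph by means of a query statement (for example, in SPARQL). Those skilled in the art will understand how to write a query in a query language to extract, from the knowledge graph, the propagation graph related to the node in which the abnormal condition occurs, so the details are not described here.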
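The text leaves the query itself to the reader. As a rough stand-in for a SPARQL query, the sketch below performs the same kind of extraction in plain Python: a backward breadth-first traversal from the abnormal node over (cause, effect) edges. It assumes the edges have already been filtered by context; the function name and the edge representation are hypothetical and are not the patent's own implementation.

```python
from collections import deque


def extract_propagation_graph(edges, abnormal_node):
    """Return the set of (cause, effect) edges from which abnormal_node is reachable.

    edges: iterable of (cause, effect) pairs, already restricted to one context.
    """
    incoming = {}
    for cause, effect in edges:
        incoming.setdefault(effect, []).append(cause)

    seen = {abnormal_node}
    queue = deque([abnormal_node])
    kept = set()
    while queue:
        node = queue.popleft()
        for cause in incoming.get(node, []):
            kept.add((cause, node))      # this edge can carry a fault towards the anomaly
            if cause not in seen:
                seen.add(cause)
                queue.append(cause)
    return kept
```

Edges that lead away from the abnormal node, or that belong to unrelated parts of the knowledge graph, are dropped, which is what turns fault propagation into a subgraph extraction problem.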
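FIG. 3 is a schematic diagram of a specific example 300 of a propagation graph for a production line. FIG. 3 includes the production line PL, unit 1 (U1) to unit n (Un), machine tool 1 to machine tool k, and machine tool M.
The propagation graph shown in FIG. 3 is modeled as a tree that is layered vertically and connected horizontally, a combination of inclusion relationships and upstream-downstream relationships.
FIG. 4 is a schematic diagram of a propagation graph 400 for production management.
This is a more specific example, showing the propagation graph in a context where the KPI is "cycle time of production line A". It includes production line A (LA), unit 1 (U1), machine tool 1 (M1), sensor 1 (S1), unit 2 (U2), machine tool 2 (M2), sensor 2 (S2), cycle time (CT), sensor 3 (S3), worker (W), work order (WO), work plan (WP), product (P), material (M), purchase order (PO), supplier (SL), warehouse (WH), unit 3 (U3), unit 4 (U4), and production line B (LB). The left part of FIG. 4 shows the relevant physical objects/facilities, and the right part shows the relevant information objects. In FIG. 4, a dashed box indicates that an abnormal condition has occurred in the cycle time CT of production line A (LA). Although the KPI is dominated by the physical object "production line A", the factors that may affect the KPI are not limited to physical objects. In fact, the whole propagation graph is built by enumerating the propagation relationship of each pair of object nodes.
In one example, before the propagation graph is extracted in block S102, the operation in block S101 may be performed first: constructing a knowledge graph based on an ontology model and adding a propagation attribute to the relationship between at least one pair of nodes in the constructed knowledge graph. The propagation attribute may comprise two sub-attributes, direction and context, the direction indicating the propagation direction between the two nodes and the context indicating the scenario in which root cause analysis is to be performed.
Those skilled in the art will understand that, in another example, the knowledge graph can be constructed in advance, with a propagation attribute added to the relationship between at least one pair of its nodes, and the annotated knowledge graph can then be stored in a medium for querying, so that the operation in block S101 need not be performed every time.
After the propagation graph has been generated, the operation in block S104 can be performed: analyzing, based on attributes of the nodes in the propagation graph, the root cause of the abnormal condition at the abnormal node.
The key to root cause analysis is to assess how likely a candidate cause is to be the root cause. The method according to embodiments of the present disclosure proposes two indicators for evaluating nodes: the weight of a node and the anomaly probability of a node.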
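1. Node weight
The weight of a node indicates how important the node is within the propagation graph as a whole. The importance of a node can be determined, for example, from the number of paths from the node to the starting node in which the abnormal condition occurs, the distance from the node to the starting node, whether the node is an intermediary node, and so on.
In one example, the weight of a node can be calculated using equation (1) below.
Figure PCTCN2019107571-appb-000002
where path_num denotes the number of paths from node n to node m, and dist(n,m) denotes the shortest distance from node n to node m.
Here node m is the starting node at which the abnormal condition first appears (also called the abnormal node); in FIG. 4, for example, it is the "cycle time CT" shown in the dashed box. Node n is any node along the propagation flow. The number of paths indicates how many paths there are between the abnormal node and the node; the shortest distance indicates the number of nodes traversed from the node to the abnormal node.
Those skilled in the art will understand that the calculation of the node weight is not limited to equation (1). Any function of factors such as the number of paths from the node to the abnormal node, the distance from the node to the abnormal node, and whether the node is an intermediary node may be used, and the factors considered are not limited to these; any other factor related to the importance of the node may be included, for example the conditional probability of the node mentioned below. The details are not described here.
FIG. 5 shows the propagation graph related to the node "cycle time CT", in which the node "cycle time CT" in the dashed box is the abnormal node in which the abnormal condition occurs. Table 1 below lists the weight of each node in FIG. 5.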
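| Node | Number of paths | Shortest distance from n to m | Weight |
| Production line LA | - | 0 | - |
| Unit 1 U1 | 1 | 1 | 1 |
| Unit 2 U2 | 1 | 1 | 1 |
| Machine tool 1 M1 | 3 | 2 | 1.5 |
| Machine tool 2 M2 | 2 | 2 | 1 |
| Sensor 1 S1 | 0 | - | - |
| Sensor 2 S2 | 1 | 1 | 1 |
| Sensor 3 S3 | 1 | 1 | 1 |
Table 1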
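In FIG. 5, "production line A LA" is defined as equivalent to "cycle time CT", so both are starting points. "Sensor 1 S1" has no path to "production line A LA" or "cycle time CT" and should therefore lie outside the scope of possible causes of the abnormal condition, which is why sensor 1 (S1) has no weight in Table 1.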
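The weight calculation can be illustrated with a short sketch. Two assumptions are made here and should be read as such: first, the weight is taken to be path_num divided by dist, which agrees with every row of Table 1 (for example 3 / 2 = 1.5 for machine tool 1) but is not confirmed by the equation image; second, the edge list `EDGES` is a hypothetical topology chosen only to reproduce the counts in Table 1, since the actual structure of FIG. 5 is not given in the text. The graph is assumed to be acyclic.

```python
from collections import deque
from functools import lru_cache

# Hypothetical (cause, effect) edges; "OTHER" is a made-up node outside the graph.
EDGES = [
    ("U1", "CT"), ("U2", "CT"), ("S2", "CT"), ("S3", "CT"),
    ("M1", "U1"), ("M1", "U2"), ("M1", "S2"),
    ("M2", "U2"), ("M2", "S3"),
    ("S1", "OTHER"),  # S1 has no path to CT
]


def node_weights(edges, m):
    """Return {node: path_num / dist} for every node that can reach m."""
    successors, predecessors = {}, {}
    for cause, effect in edges:
        successors.setdefault(cause, []).append(effect)
        predecessors.setdefault(effect, []).append(cause)

    @lru_cache(maxsize=None)
    def path_num(n):
        # Number of distinct directed paths from n to m (acyclic graph assumed).
        if n == m:
            return 1
        return sum(path_num(s) for s in successors.get(n, []))

    # Backward breadth-first search from m gives the shortest hop distance.
    dist = {m: 0}
    queue = deque([m])
    while queue:
        node = queue.popleft()
        for p in predecessors.get(node, []):
            if p not in dist:
                dist[p] = dist[node] + 1
                queue.append(p)

    return {n: path_num(n) / d for n, d in dist.items() if d > 0}
```

With this edge list, `node_weights(EDGES, "CT")` gives 1.5 for M1 and 1.0 for M2, U1, U2, S2, and S3, while S1 is absent because it cannot reach CT.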
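2. Anomaly probability of a node
The anomaly probability of a node expresses how likely it is that the node is in an abnormal state. In the method according to embodiments of the present disclosure, the factors that determine this likelihood include explicit factors and implicit factors. Explicit factors are those specified explicitly in the node's attributes, such as the inputs of a KPI formula; the remaining factors can be called implicit factors, such as ambient temperature and humidity.
In one example, to assess whether a node is abnormal, a moving average difference can be calculated for each of its factors. The moving average difference measures the deviation of a factor's short-term behavior from its long-term behavior; it can be, for example, the degree to which a factor's value has changed relative to its normal value. The length of the moving window can be specified as appropriate. Calculating the moving average difference quantifies each factor, and the anomaly probability of the node is then calculated from the quantified factors.
In one example, the anomaly probability of a node can be calculated using equation (2) below.
Figure PCTCN2019107571-appb-000003
where p_ab is the anomaly probability of the node, w_i is the weight of a factor, and diff(f_i) is its moving average difference.
The weight of a factor expresses that factor's contribution to the estimate of the likelihood of an abnormal state. For example, explicit factors are generally more important than implicit factors. If there is no prior knowledge about a factor, a default value can be set.
Equation (2) calculates the anomaly probability of a node as the weighted sum of the moving average differences of all the node's factors. If the calculated anomaly probability is greater than a predetermined threshold, the node can be considered abnormal; that is, the node can be taken to be the root cause node of the abnormal condition.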
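A minimal sketch of the weighted-sum calculation follows. The text only says that the moving average difference compares short-term with long-term behavior, so the concrete definition used here (relative deviation of a short-window mean from a long-window mean) is an assumption, as are the clipping of the result to the range 0 to 1 and the function names.

```python
def moving_average_diff(values, short_window, long_window):
    """Relative deviation of the short-term mean from the long-term mean."""
    if not 0 < short_window <= long_window <= len(values):
        raise ValueError("need 0 < short_window <= long_window <= len(values)")
    short_mean = sum(values[-short_window:]) / short_window
    long_mean = sum(values[-long_window:]) / long_window
    if long_mean == 0:
        return abs(short_mean)
    return abs(short_mean - long_mean) / abs(long_mean)


def anomaly_probability(factors):
    """Weighted sum of moving average differences, clipped to [0, 1].

    factors: list of (weight, diff) pairs, one per explicit or implicit factor.
    """
    score = sum(weight * diff for weight, diff in factors)
    return max(0.0, min(1.0, score))
```

For example, an explicit factor with weight 0.7 whose recent values jumped from 10 to 20 and an implicit factor with weight 0.3 that stayed flat would be combined by passing their two `(weight, diff)` pairs to `anomaly_probability`.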
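Those skilled in the art will understand that the anomaly probability of a node need not be calculated by equation (2). For example, different data for each factor of each node can be used as sample data for machine learning, and the resulting model can then be used to calculate the anomaly probability of a node. The method according to embodiments of the present disclosure therefore places no restriction on the specific way in which the anomaly probability of a node is calculated from the factors affecting its state.
In one example, the annotation added to a node may also include probability information for the node, which can express the uncertainty of the causal relationship between the node and its parent node. The resulting propagation graph then resembles a Bayesian network. Each node in the propagation graph has a probability information table containing the conditional probabilities between the node and its direct parent nodes. It will be understood that a node may have different probability information tables in different contexts.
When the added annotation includes probability information for a node, that information can be used as a factor in calculating the node's weight. If the annotation does not include probability information, this is equivalent to a probability of 1 or 0; that is, a propagation relationship between two nodes either exists or does not exist.
As described above, the weight and the anomaly probability can be determined for every node, after which root cause analysis can begin. During root cause analysis, the node weights mainly determine which node to start from and the order in which nodes are analyzed, improving the speed and efficiency of the analysis; the anomaly probability is used to judge whether a node is the root cause of the abnormal condition. An exemplary process for analyzing the root cause of the abnormal condition at the abnormal node based on the attributes of the nodes in the propagation graph is described below with reference to FIG. 6.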
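FIG. 6 is a flowchart showing an exemplary process of the operation in block S104.
First, in block S1042, all nodes in the propagation graph are sorted in descending order of node weight.
Next, starting from the first node in the sorted order, the following operations are performed:
In block S1044: the anomaly probability of the node is calculated based on at least one factor affecting the state of the node;
In decision block S1046: it is determined whether the calculated anomaly probability is greater than the predetermined threshold. If so (shown as Y in the figure), the process proceeds to block S1048; otherwise (shown as N in the figure), the operation in block S1044 is performed on the next node.
In block S1048: the node is determined to be the root cause of the abnormal condition.
Finally, the process ends in S1049.
The predetermined threshold here can be set in advance by those skilled in the art on the basis of experience.
The exemplary process shown in FIG. 6 thus makes it possible to determine the root cause of the abnormal condition.
Those skilled in the art will understand that the analysis method according to embodiments of the present disclosure may fail to find the root cause of the abnormal condition in the propagation graph; in that case, the process ends after the operations have been performed on all nodes.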
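The procedure of FIG. 6 can be sketched in a few lines. The function name, the argument shapes, and the way ties between equal weights are broken (insertion order of the dictionary) are assumptions for illustration; the step labels in the comments refer to the blocks named in the text.

```python
def find_root_cause(weights, anomaly_prob, threshold):
    """Return the first node whose anomaly probability exceeds the threshold.

    weights: dict mapping node -> weight.
    anomaly_prob: callable mapping node -> anomaly probability.
    Returns None when no node in the propagation graph qualifies.
    """
    ordered = sorted(weights, key=weights.get, reverse=True)  # S1042
    for node in ordered:
        probability = anomaly_prob(node)                      # S1044
        if probability > threshold:                           # S1046
            return node                                       # S1048
    return None                                               # S1049, nothing found
```

Because the loop stops at the first qualifying node, the ordering by weight determines which candidate is reported when several nodes exceed the threshold.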
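The method according to the present disclosure improves on the problems of the prior art in the following respects.
Compared with fault inference, propagation between two directly linked "simple" elements is more deterministic and can be clarified more easily. Annotation allows such domain knowledge to be embedded in the knowledge base, which turns fault propagation within a large network into a subgraph extraction problem.
An annotation has two aspects: direction and context. To define annotations, the domain knowledge should first be decomposed into its smallest elements, which means giving very specific and definite descriptions of propagation. The context can delimit the boundary of the propagation graph and thereby reduce complexity. With the help of graph algorithms, quantitative indicators help in evaluating whether rules are satisfied. For example, the distance from device A to device B can be used to evaluate how strongly device A affects device B.
With ontologies and knowledge graphs, data can be semantically integrated and linked across systems, formats, and locations through a unified architecture. This makes it possible to overcome limits on data volume and data type and to carry out more complex data applications and analysis tasks.
According to the method of the present disclosure, knowledge of fault propagation can be embedded in the ontology, taking full advantage of the strengths of knowledge graphs for large-scale data integration, and only a small amount of effort is needed to construct an annotated propagation graph.
For the data silo problem, many mature technical solutions based on knowledge graphs exist. The ontology model serves as a unified template for integration, which ensures data quality.
Adding annotations to the knowledge graph turns the construction of the propagation graph into a query task, and the query result can be adjusted dynamically according to the context supplied in the annotations.
Digitalization rests on accessible data and knowledge of the system. Thanks to Internet of Things technology, data acquisition is no longer a bottleneck, so more events can be interconnected. The aim of digitalization is not "digitalization" itself but to gain added value from data and knowledge. The method according to the invention provides a good technical solution for obtaining business insight from data. In particular, the invention is not limited to the problem of tracing KPIs; it proposes a method for evaluating the influence of an individual on the whole network. In the context of the manufacturing industry, this may be: the importance of equipment, materials, or people; quality tracing; order delivery analysis; and so on. Most of these problems are very challenging because of their complexity and their dependence on analytical skill. Graph-assisted reasoning can save labor and help people make more accurate judgments.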
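FIG. 7 is a block diagram showing an exemplary configuration of a root cause analysis apparatus 700 according to an embodiment of the present disclosure.
As shown in FIG. 7, the root cause analysis apparatus 700 comprises a propagation graph extraction unit 702 and an analysis unit 704.
The propagation graph extraction unit 702 is configured to extract a propagation graph from an annotated knowledge graph, wherein the propagation graph comprises an abnormal node in which an abnormal condition occurs and nodes that have a propagation relationship with the abnormal node.
The analysis unit 704 is configured to analyze, based on attributes of the nodes in the propagation graph, the root cause of the abnormal condition at the abnormal node.
In one example, the root cause analysis apparatus 700 may further comprise a knowledge graph annotation unit 701 configured to construct a knowledge graph based on an ontology model and to add an annotation attribute to the relationship between at least one pair of nodes in the constructed knowledge graph, wherein the annotation attribute comprises two sub-attributes, direction and context, the direction indicating the propagation direction between the two nodes and the context delimiting the specific scenario to which the knowledge graph relates.
The propagation graph extraction unit 702 is further configured to: extract the propagation graph related to the abnormal node from the annotated knowledge graph by means of a query statement.
The propagation graph further comprises the propagation directions between nodes.
The analysis unit 704 is further configured to:
sort all nodes in the propagation graph in descending order of node weight;
perform the following operations, starting from the first node in the sorted order:
calculate an anomaly probability of the node based on at least one factor affecting the state of the node;
if the anomaly probability is greater than a predetermined threshold, regard the node as the root cause of the abnormal condition, and stop;
otherwise, perform the operations on the next node, until the operations have been performed on all nodes.
The weight of a node is determined based on at least one of the following:
the number of paths from the node to the abnormal node, the distance from the node to the abnormal node, and whether the node is an intermediary node.
The analysis unit 704 is further configured to: calculate the anomaly probability of the node based on the respective degrees to which the explicit factors and the implicit factors among the at least one factor affecting the node's state influence the abnormal condition, and on the moving average difference of each factor.
Details of the operation and function of the parts of the root cause analysis apparatus 700 may, for example, be the same as or similar to the corresponding parts of the embodiments of the root cause analysis method 100 of the present disclosure described with reference to FIGS. 1-6, and are not described in detail here.
It should be noted that the structure of the root cause analysis apparatus 700 and its constituent units shown in FIG. 7 is merely exemplary, and those skilled in the art may modify the block diagram shown in FIG. 7 as needed.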
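Embodiments of the root cause analysis method and apparatus according to the present disclosure have been described above with reference to FIGS. 1 to 7. The root cause analysis apparatus described above may be implemented in hardware, in software, or in a combination of hardware and software.
FIG. 8 is a block diagram of an electronic device 800 for root cause analysis according to an embodiment of the present disclosure. According to one embodiment, the electronic device 800 may comprise at least one processor 802, which executes at least one computer-readable instruction (that is, the above-mentioned elements implemented in software form) stored or encoded in a computer-readable storage medium (that is, the memory 804).
In one embodiment, computer-executable instructions are stored in the memory 804 which, when executed, cause the at least one processor 802 to: extract a propagation graph from an annotated knowledge graph, wherein the propagation graph comprises an abnormal node in which an abnormal condition occurs and nodes that have a propagation relationship with the abnormal node; and analyze, based on attributes of the nodes in the propagation graph, the root cause of the abnormal condition at the abnormal node.
It should be understood that the computer-executable instructions stored in the memory 804, when executed, cause the at least one processor 802 to perform the various operations and functions described above in conjunction with FIGS. 1-7 in the various embodiments of the present disclosure.
According to one embodiment, a non-transitory machine-readable medium is provided. The non-transitory machine-readable medium may have machine-executable instructions (that is, the above-mentioned elements implemented in software form) which, when executed by a machine, cause the machine to perform the various operations and functions described above in conjunction with FIGS. 1-7 in the various embodiments of the present disclosure.
According to one embodiment, a computer program is provided, comprising computer-executable instructions which, when executed, cause at least one processor to perform the various operations and functions described above in conjunction with FIGS. 1-7 in the various embodiments of the present disclosure.
According to one embodiment, a computer program product is provided, comprising computer-executable instructions which, when executed, cause at least one processor to perform the various operations and functions described above in conjunction with FIGS. 1-7 in the various embodiments of the present disclosure.
The detailed description set forth above in conjunction with the drawings describes exemplary embodiments and does not represent all embodiments that can be implemented or that fall within the scope of protection of the claims. The term "exemplary" as used throughout this specification means "serving as an example, instance, or illustration" and does not mean "preferred" or "advantageous" over other embodiments. The detailed description includes specific details for the purpose of providing an understanding of the described techniques. These techniques may, however, be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form to avoid obscuring the concepts of the described embodiments.
The above description of the present disclosure is provided to enable any person of ordinary skill in the art to make or use the present disclosure. Various modifications to the present disclosure will be apparent to those of ordinary skill in the art, and the general principles defined herein may be applied to other variations without departing from the scope of protection of the present disclosure. The present disclosure is therefore not limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.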

Claims (18)

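  1. A root cause analysis method, comprising:
    extracting a propagation graph from an annotated knowledge graph, wherein the propagation graph comprises an abnormal node in which an abnormal condition occurs and nodes that have a propagation relationship with the abnormal node; and
    analyzing, based on attributes of the nodes in the propagation graph, the root cause of the abnormal condition at the abnormal node.
  2. The method of claim 1, wherein, before the propagation graph is extracted from the annotated knowledge graph, the method further comprises:
    constructing a knowledge graph based on an ontology model, and adding a propagation attribute to the relationship between at least one pair of nodes in the constructed knowledge graph, wherein the propagation attribute comprises two sub-attributes, direction and context, the direction indicating the propagation direction between the two nodes and the context indicating the scenario in which root cause analysis is to be performed.
  3. The method of claim 1, wherein extracting the propagation graph from the annotated knowledge graph comprises:
    extracting the propagation graph related to the abnormal node from the annotated knowledge graph by means of a query statement.
  4. The method of claim 2, wherein the propagation graph further comprises the propagation directions between nodes.
  5. The method of any one of claims 1-4, wherein analyzing the root cause of the abnormal condition at the abnormal node based on the attributes of the nodes in the propagation graph comprises:
    sorting all nodes in the propagation graph in descending order of node weight;
    performing the following operations, starting from the first node in the sorted order:
    calculating an anomaly probability of the node based on at least one factor affecting the state of the node;
    if the anomaly probability is greater than a predetermined threshold, determining that the node is the root cause of the abnormal condition, and stopping;
    otherwise, performing the operations on the next node, until the operations have been performed on all nodes.
  6. The method of claim 5, wherein the weight of the node is determined based on at least one of the following:
    the number of paths from the node to the abnormal node, the distance from the node to the abnormal node, and whether the node is an intermediary node.
  7. The method of claim 5, wherein calculating the anomaly probability of the node based on at least one factor affecting the state of the node comprises:
    calculating the anomaly probability of the node based on the respective degrees to which the explicit factors and the implicit factors among the at least one factor affecting the node's state influence the abnormal condition, and on the moving average difference of each factor.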
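  8. A root cause analysis apparatus (700), comprising:
    a propagation graph extraction unit (702) configured to extract a propagation graph from an annotated knowledge graph, wherein the propagation graph comprises an abnormal node in which an abnormal condition occurs and nodes that have a propagation relationship with the abnormal node; and
    an analysis unit (704) configured to analyze, based on attributes of the nodes in the propagation graph, the root cause of the abnormal condition at the abnormal node.
  9. The apparatus (700) of claim 8, further comprising:
    a knowledge graph annotation unit (701) configured to construct a knowledge graph based on an ontology model and to add an annotation attribute to the relationship between at least one pair of nodes in the constructed knowledge graph, wherein the annotation attribute comprises two sub-attributes, direction and context, the direction indicating the propagation direction between the two nodes and the context indicating the scenario in which root cause analysis is to be performed.
  10. The apparatus (700) of claim 8, wherein the propagation graph extraction unit (702) is further configured to:
    extract the propagation graph related to the abnormal node from the annotated knowledge graph by means of a query statement.
  11. The apparatus (700) of claim 9, wherein the propagation graph further comprises the propagation directions between nodes.
  12. The apparatus (700) of any one of claims 8-11, wherein the analysis unit (704) is further configured to:
    sort all nodes in the propagation graph in descending order of node weight;
    perform the following operations, starting from the first node in the sorted order:
    calculate an anomaly probability of the node based on at least one factor affecting the state of the node;
    if the anomaly probability is greater than a predetermined threshold, regard the node as the root cause of the abnormal condition, and stop;
    otherwise, perform the operations on the next node, until the operations have been performed on all nodes.
  13. The apparatus (700) of claim 12, wherein the weight of the node is determined based on at least one of the following:
    the number of paths from the node to the abnormal node, the distance from the node to the abnormal node, and whether the node is an intermediary node.
  14. The apparatus of claim 12, wherein the analysis unit (704) is further configured to:
    calculate the anomaly probability of the node based on the respective degrees to which the explicit factors and the implicit factors among the at least one factor affecting the node's state influence the abnormal condition, and on the moving average difference of each factor.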
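  15. An electronic device (800), comprising:
    at least one processor (802); and
    a memory (804) coupled to the at least one processor (802), the memory being used to store instructions which, when executed by the at least one processor (802), cause the processor (802) to perform the method of any one of claims 1 to 7.
  16. A non-transitory machine-readable storage medium storing executable instructions which, when executed, cause the machine to perform the method of any one of claims 1 to 7.
  17. A computer program comprising computer-executable instructions which, when executed, cause at least one processor to perform the method of any one of claims 1 to 7.
  18. A computer program product tangibly stored on a computer-readable medium and comprising computer-executable instructions which, when executed, cause at least one processor to perform the method of any one of claims 1 to 7.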
PCT/CN2019/107571 2019-09-24 2019-09-24 Root cause analysis method, apparatus, electronic device, medium, and program product WO2021056197A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
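PCT/CN2019/107571 WO2021056197A1 (zh) 2019-09-24 2019-09-24 Root cause analysis method, apparatus, electronic device, medium, and program product
CN201980100087.0A CN114341877A (zh) 2019-09-24 2019-09-24 Root cause analysis method, apparatus, electronic device, medium, and program product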

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/107571 WO2021056197A1 (zh) 2019-09-24 2019-09-24 Root cause analysis method, apparatus, electronic device, medium, and program product

Publications (1)

Publication Number Publication Date
WO2021056197A1 true WO2021056197A1 (zh) 2021-04-01

Family

ID=75165573

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/107571 WO2021056197A1 (zh) 2019-09-24 2019-09-24 Root cause analysis method, apparatus, electronic device, medium, and program product

Country Status (2)

Country Link
CN (1) CN114341877A (zh)
WO (1) WO2021056197A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102496028A (zh) * 2011-11-14 2012-06-13 华中科技大学 一种复杂装备的事后维修故障分析方法
US20120209798A1 (en) * 2008-03-08 2012-08-16 Tokyo Electron Limited Autonomous biologically based learning tool
CN109522192A (zh) * 2018-10-17 2019-03-26 北京航空航天大学 一种基于知识图谱和复杂网络组合的预测方法
CN109992440A (zh) * 2019-04-02 2019-07-09 北京睿至大数据有限公司 一种基于知识图谱和机器学习的it根故障分析识别方法
CN110110870A (zh) * 2019-06-05 2019-08-09 厦门邑通软件科技有限公司 一种基于事件图谱技术的设备故障智能监控方法

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113360720A (zh) * 2021-06-24 2021-09-07 平安普惠企业管理有限公司 基于数据血缘关系的数据资产可视化方法、装置及设备
CN113360720B (zh) * 2021-06-24 2023-11-21 湖北华中电力科技开发有限责任公司 基于数据血缘关系的数据资产可视化方法、装置及设备
WO2023159574A1 (zh) * 2022-02-28 2023-08-31 西门子股份公司 异常检测方法、装置、计算机可读介质和电子装置
CN114598539A (zh) * 2022-03-16 2022-06-07 京东科技信息技术有限公司 根因定位方法、装置、存储介质及电子设备
CN114598539B (zh) * 2022-03-16 2024-03-01 京东科技信息技术有限公司 根因定位方法、装置、存储介质及电子设备
CN114779747A (zh) * 2022-05-11 2022-07-22 中国第一汽车股份有限公司 车辆故障原因确定系统及方法
WO2023230788A1 (zh) * 2022-05-30 2023-12-07 西门子股份公司 半自动知识图谱构建的方法、装置、计算机设备
CN114932929A (zh) * 2022-05-31 2022-08-23 交控科技股份有限公司 列车的控制方法、装置、设备、存储介质及程序产品
CN114932929B (zh) * 2022-05-31 2024-05-03 交控科技股份有限公司 列车的控制方法、装置、设备、存储介质及程序产品
CN116360387A (zh) * 2023-01-18 2023-06-30 北京控制工程研究所 融合贝叶斯网络和性能-故障关系图谱的故障定位方法
CN116360387B (zh) * 2023-01-18 2023-09-15 北京控制工程研究所 融合贝叶斯网络和性能-故障关系图谱的故障定位方法

Also Published As

Publication number Publication date
CN114341877A (zh) 2022-04-12

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19946286

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19946286

Country of ref document: EP

Kind code of ref document: A1