CN110213087B - Complex system fault positioning method based on dynamic multilayer coupling network - Google Patents
- Publication number
- CN110213087B (application CN201910405979.3A)
- Authority
- CN
- China
- Prior art keywords
- fault
- network
- dynamic
- complex system
- node
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0677—Localisation of faults
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
Description
Technical Field
The invention provides a fault location method for complex systems based on a dynamic multilayer coupled network. It belongs to the fields of complexity science and reliability technology.
Background
Since the beginning of the 21st century, advances in information science, led by the Internet, have driven rapid growth in industry and the world economy while also disrupting systems science and systems engineering. Traditional simple systems cannot meet increasingly diverse application requirements. Emerging technologies such as embedded computing, the Internet of Things, and artificial intelligence have greatly improved system performance but have also made system structure more and more complex. Take a cyber-physical system as an example: its information layer, physical layer, network layer, and decision layer are coupled through multi-level nonlinear relationships, and emergent functions appear while the system runs. When a complex system fails, its increasingly complex topology and nonlinearly coupled dynamics give faults the characteristics of dynamic propagation, cascading failure, and multiple faults. For example, the failure mode of an integrated software and hardware system is often a coupled failure involving both software and hardware, and in a smart grid a single-node fault can cascade with the load. Moreover, because coupled faults are so complex, the faults of a complex system are nondeterministic even when the same test case is executed. Traditional reliability analysis methods are usually applied to scenarios with a single fault source and static analysis. For fault types with complex mechanisms and coupled structure, they struggle to locate faults effectively, so fault location in complex systems has become an urgent problem.
Complex network theory, a modeling approach that abstractly describes the overall structure of a complex system and the interaction rules among its components, offers a new way to address this problem. A dynamic multilayer coupled network built on complex network theory is an abstract model of the dynamic operating mechanisms at different levels inside a complex system. It takes the form of a set of static multilayer coupled networks, one per time slice, and each static multilayer coupled network consists of several single-layer networks together with the sets of coupling edges between them. Compared with traditional reliability models, this model reflects more faithfully how the system evolves internally under both normal and faulty operation. Combined with the statistical indicators of the network model, it supports reliability analysis and improvement tasks such as analyzing fault propagation paths, investigating cascading failure mechanisms, and locating multiple fault sources more efficiently.
The present invention proposes a solution to the problems above in the form of a complete procedure for locating faults in complex systems. First, test cases with a high probability of triggering faults are executed, and dynamic coupled network models are built for the complex system in normal and in faulty operation. Next, coarse fault location is performed using complex-network statistical indicators to identify the time period in which the system's performance degrades or the system crashes. Finally, fine fault location is performed within that period by examining, slice by slice, the nodes whose statistical indicators change abruptly in the static multilayer coupled network of each time slice, which yields the fault location. The method addresses the difficulty that traditional reliability analysis has in locating complex faults involving dynamic propagation, cascading failure, and multiple faults.
Summary of the Invention
The invention provides a fault location method for complex systems based on a dynamic multilayer coupled network. Traditional reliability analysis methods are usually applied to scenarios with a single fault source and static analysis. They are limited when applied to complex systems whose faults exhibit dynamic propagation, cascading failure, multiple faults, and uncertainty, and they struggle to locate such faults. We therefore propose a method, based on complex network theory, that can effectively locate faults in complex systems.
In view of the technical problems above and the purpose of the invention, a fault location method for complex systems based on a dynamic multilayer coupled network is proposed. The scheme comprises the following parts:
(1) Purpose of the Invention
In view of the shortcomings of the prior art, the purpose of the invention is to provide a fault location method for complex systems based on a dynamic multilayer coupled network. Complex systems have complicated structures, and their faults exhibit dynamic propagation, cascading failure, and multilayer coupling, so traditional reliability analysis methods, which are usually applied to static, single-fault scenarios, struggle to locate faults in them. The proposed method uses statistical indicators from complex network theory to compare the fault-mode and normal-mode dynamic multilayer coupled networks of a complex system from both macroscopic and microscopic perspectives. This gives clearer and more precise fault location, helps system designers understand system performance and fault mechanisms, speeds up troubleshooting, and supports optimization of system reliability indicators.
(2) Technical Solution
To achieve the above purpose, the technical solution adopted by the invention is a fault location method for complex systems based on a dynamic multilayer coupled network.
The method comprises the following steps:
Step A: construct the dynamic multilayer coupled networks under multiple modes;
Step B: locate the fault period of the complex system;
Step C: locate the faulty nodes of the complex system.
The "dynamic multilayer coupled network" in Step A is a dynamic network model that abstractly reflects the dynamic operating state inside a complex system. The dynamic network model is a set of static networks, one per time slice. A complex system often has several layers, and in addition to interactions within a layer there is coupling across layers; for example, modules in the software layer of an embedded system map to modules in its hardware layer. To describe the system state more faithfully, the dynamic network model is therefore a multilayer coupled network containing nodes in multiple layers, intra-layer edges, and coupling edges. "Constructing the dynamic multilayer coupled networks under multiple modes" in Step A comprises the following steps:
Step A1: quantify the normal mode and the fault mode of the complex system;
Step A2: construct the dynamic multilayer coupled network of the normal mode;
Step A3: construct the dynamic multilayer coupled network of the fault mode.
The "normal mode" in Step A1 is the behavior of the complex system when it operates normally after executing a given fault-triggering test case.
The "fault mode" in Step A1 is the behavior of the complex system when it operates in a faulty state after executing a given fault-triggering test case.
"Quantifying the normal mode and the fault mode of the complex system" in Step A1 is done as follows: quantify the performance indicators, as reported by the system's state-monitoring tools, that correspond to the normal mode and to the fault mode. In other words, specify the monitoring indicators associated with each fault phenomenon as well as those associated with normal operation. Take an embedded system, a typical integrated software and hardware system, as an example: if the response delay after the operating system performs an operation exceeds 10 s, the system is deemed to have a blocking fault.
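As a minimal sketch of Step A1, the blocking-fault rule from the embedded-system example can be written as a threshold check. The 10 s threshold comes from the text; the `Observation` record, its field names, and `classify_mode` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass

# Threshold taken from the embedded-system example in the text.
BLOCKING_DELAY_S = 10.0


@dataclass
class Observation:
    """One monitored operation (hypothetical record layout)."""
    operation: str
    response_delay_s: float


def classify_mode(obs: Observation) -> str:
    """Return "fault" when the delay exceeds the threshold, else "normal"."""
    return "fault" if obs.response_delay_s > BLOCKING_DELAY_S else "normal"
```

A real monitoring setup would define one such rule per fault phenomenon; only the blocking-fault rule is given in the text.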
"Constructing the dynamic multilayer coupled network of the normal mode" in Step A2 is done as follows: according to the characteristics of the complex system, select a suitable tool for sampling dynamic operating data, execute a fault-triggering test case, and collect online the system's dynamic operating data while it runs normally under that test case. Then, based on complex network theory, extract the node interaction relationships contained in the operating data and construct offline a dynamic multilayer coupled network that maps the operating mechanism of the complex system in the normal mode.
"Constructing the dynamic multilayer coupled network of the fault mode" in Step A3 is done as follows: according to the characteristics of the complex system, select a suitable tool for sampling dynamic operating data, execute a fault-triggering test case, and collect online the system's dynamic operating data while it runs in a faulty state under that test case. Then, based on complex network theory, extract the node interaction relationships contained in the operating data and construct offline a dynamic multilayer coupled network that maps the operating mechanism of the complex system in the fault mode.
The "dynamic multilayer coupled network" in Steps A2 and A3 is a time-ordered set of static multilayer coupled networks, each corresponding to one time slice of the dynamic operating data. If the dynamic operating data is divided into m slices, the dynamic multilayer coupled network contains m static multilayer coupled networks. Because call relationships between nodes are directional, especially those between nodes in the coupling layer, the edges of the dynamic coupled network are directed and the network is a directed network.
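The structure described in Step A can be sketched as a list of m directed edge sets, one per time slice, in which each node carries its layer name. This is only an illustrative data layout under stated assumptions: the record format `(timestamp, src, dst)`, the fixed slice length, and the names `StaticSlice` and `build_dynamic_network` are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class StaticSlice:
    """One static multilayer coupled network for a single time slice.

    A node is a (layer, name) tuple; an edge is a directed (src, dst) pair.
    """
    index: int
    edges: set = field(default_factory=set)

    def add_call(self, src, dst):
        self.edges.add((src, dst))

    def nodes(self):
        return {n for edge in self.edges for n in edge}

    def intra_layer_edges(self):
        # Edges whose endpoints lie in the same layer.
        return {(u, v) for u, v in self.edges if u[0] == v[0]}

    def coupling_edges(self):
        # Edges that cross from one layer to another.
        return {(u, v) for u, v in self.edges if u[0] != v[0]}


def build_dynamic_network(records, slice_length, m):
    """Split (timestamp, src, dst) call records into m time-ordered slices."""
    slices = [StaticSlice(k) for k in range(m)]
    for t, src, dst in records:
        k = int(t // slice_length)
        if 0 <= k < m:  # records outside the m slices are ignored
            slices[k].add_call(src, dst)
    return slices
```

Keeping the layer inside the node identifier makes the distinction between intra-layer edges and coupling edges a simple comparison, at the cost of slightly longer node keys.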
"Locating the fault period of the complex system" in Step B means the following: by comparing the statistical features of the normal-mode and fault-mode dynamic multilayer coupled networks constructed in Step A, find the period in which the fault occurs, and extract the fault-mode and normal-mode dynamic multilayer coupled network slices within that period to support the faulty-node location of Step C. It comprises the following steps:
Step B1: calibrate the dynamic multilayer coupled networks of the different modes;
Step B2: compute the network statistical features in the different modes;
Step B3: quantify the fault index;
Step B4: extract the dynamic multilayer coupled network slices of the fault period.
"Calibrating the dynamic multilayer coupled networks of the different modes" in Step B1 is done as follows: take the executed fault-triggering test case as the reference and set the moment at which it starts executing as time 0. Align the time axes of the dynamic multilayer coupled networks of the different modes accordingly, and discard the network data from before the test case starts and after it ends, keeping only the dynamic multilayer coupled network associated with the execution of the fault-triggering test case.
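The calibration in Step B1 amounts to shifting timestamps and trimming. A minimal sketch, assuming the operating data is a list of `(timestamp, src, dst)` records and that the test case start and end times are known (both assumptions, as is the function name `calibrate`):

```python
def calibrate(records, test_start, test_end):
    """Shift timestamps so the test case starts at time 0.

    Records from before the test case starts or after it ends are dropped.
    """
    return [
        (t - test_start, src, dst)
        for t, src, dst in records
        if test_start <= t <= test_end
    ]
```

Applying the same function to the normal-mode and fault-mode recordings puts both on a common time axis, so slice k refers to the same phase of the test case in both modes.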
The "network statistical features" in Step B2 are a set of statistical indicators that capture the overall properties of the dynamic multilayer coupled network in each time interval. Common indicators include network size, average degree, and average distance. Network size is the total number of active nodes in the network during a time interval T_k and measures the workload of the complex system at that time. Average degree is the mean degree of all nodes in the network during T_k and measures the overall connectivity inside the complex system in that interval. Average distance is the mean distance between any two nodes in the network during T_k and measures how efficiently information or material is transferred inside the complex system in that interval. The network statistical features are not limited to these three examples; because they are well known, they are not described further.
"Computing the network statistical features in the different modes" in Step B2 is done as follows: for the normal mode and the fault mode separately, compute the network size, average degree, average distance, and other statistical features, and combinations of them, for the static multilayer coupled network in each time interval of the dynamic multilayer coupled network, and store the results as time series.
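The three indicators named in Step B2 can be computed for one slice with a breadth-first search. The sketch below makes assumptions the text leaves open: degree counts incoming plus outgoing edges, and average distance is averaged over ordered pairs of nodes that are reachable from one another. The function name is hypothetical.

```python
from collections import deque


def network_statistics(edges):
    """Size, average degree, and average distance of one directed slice."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set())
    n = len(adj)
    if n == 0:
        return {"size": 0, "avg_degree": 0.0, "avg_distance": 0.0}
    edge_count = sum(len(targets) for targets in adj.values())
    total, pairs = 0, 0
    for s in adj:
        # Breadth-first search gives hop distances from s.
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1  # reachable nodes other than s
    return {
        "size": n,
        # Assumption: each edge adds one out-degree and one in-degree.
        "avg_degree": 2 * edge_count / n,
        "avg_distance": total / pairs if pairs else 0.0,
    }
```

Calling this once per slice and per mode yields the two time series that Step B3 compares.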
"Quantifying the fault index" in Step B3 is done as follows: quantify the criterion for an abnormal statistical feature. Let P_r be the value of a given statistical feature in the normal mode during time interval T_k, and let P_f be the value of the same feature in the fault mode during the same interval. A fault index α_1 is defined from P_r and P_f (its defining formula is missing from this text). When α_1 ≥ k_1, that is, when the fault index is not less than a preset constant threshold k_1, a fault is deemed to have occurred in that interval. In the fault mode the complex system is not in a faulty state for the whole duration, and a complex system can repair itself after a fault occurs, so the total length of the fault period should not exceed the total duration of the fault-triggering test case.
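The formula for the fault index α_1 is not reproduced in this text, so the sketch below substitutes an assumed form, the relative deviation |P_f − P_r| / |P_r|. That choice, the `eps` guard against division by zero, and the function names are illustrative assumptions, not the patent's definition. Only the thresholding rule α_1 ≥ k_1 comes from the text.

```python
def fault_index(p_normal, p_fault, eps=1e-12):
    """ASSUMED fault index: relative deviation of the fault-mode value.

    The patent's own formula for alpha_1 is not available in the text.
    """
    return abs(p_fault - p_normal) / max(abs(p_normal), eps)


def abnormal_intervals(normal_series, fault_series, k1):
    """Indices k of the time intervals T_k in which alpha_1 >= k1."""
    return [
        k
        for k, (p_r, p_f) in enumerate(zip(normal_series, fault_series))
        if fault_index(p_r, p_f) >= k1
    ]
```

For example, with a normal-mode average degree of 10 in every interval and fault-mode values of 10, 11, 16, and 4, a threshold of k1 = 0.5 flags the last two intervals.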
"Extracting the dynamic multilayer coupled network slices of the fault period" in Step B4 is done as follows: compare, interval by interval, the combined statistical features of the normal-mode and fault-mode dynamic multilayer coupled network slices, compute the fault index in each interval, find the time segments in which the fault mode's statistical features are abnormal relative to the normal mode, and extract the dynamic multilayer coupled network slices in those segments. These slices provide the data for the fine fault location of Step C.
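Once the abnormal intervals are known, Step B4 reduces to grouping them into periods and pulling out the matching slices from both modes. A minimal sketch, assuming the two dynamic networks are lists indexed by slice number and the abnormal intervals are given as a collection of indices (the function names are hypothetical):

```python
def fault_periods(abnormal):
    """Group abnormal slice indices into contiguous (start, end) periods."""
    periods = []
    for k in sorted(set(abnormal)):
        if periods and k == periods[-1][1] + 1:
            periods[-1] = (periods[-1][0], k)  # extend the current period
        else:
            periods.append((k, k))  # start a new period
    return periods


def extract_slices(normal_net, fault_net, abnormal):
    """Return (index, normal slice, fault slice) for each abnormal interval."""
    return [(k, normal_net[k], fault_net[k]) for k in sorted(set(abnormal))]
```

Returning the normal-mode slice alongside the fault-mode slice keeps the pairing that the node-level comparison in Step C relies on.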
"Locating the faulty nodes of the complex system" in Step C means the following: analyze the fault-period dynamic multilayer coupled network slices saved in Step B4 to find the nodes at which the fault occurs. This achieves fine location of the complex system's faults and helps system designers improve the system structure and the reliability of the complex system. It comprises the following steps:
Step C1: compute the node statistical features in the fault period;
Step C2: quantify the fault index;
Step C3: extract the faulty nodes and output the fault information;
Step C4: verify by simulation.
The "node statistical features" in Step C1 are a set of statistical indicators that characterize each node of the dynamic multilayer coupled network during the fault period. Common indicators are degree and betweenness. The degree of a node reflects the state of its direct call interactions with other nodes in the network. It is defined as the total number of nodes directly connected to node v_i during a time interval T_k; in a directed network it is divided into out-degree and in-degree. The betweenness of a node reflects its influence and importance in the network as a whole. It is defined as the proportion of all shortest paths in the network during T_k that pass through node v_i.
"Computing the node statistical features in the fault period" in Step C1 is done as follows: for the fault-mode and normal-mode dynamic multilayer coupled network slices in the fault period, compute in turn the degree, betweenness, and other node statistical features, and combinations of them, and save the distribution of each indicator for every dynamic multilayer coupled network slice.
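The two node indicators of Step C1 can be sketched for one directed slice as follows. Betweenness follows the definition in the text (the share of all shortest paths that pass through a node), with the added assumption that a path's own endpoints do not count as passing through. The cubic-time pair enumeration is chosen for readability, not speed, and the function names are hypothetical.

```python
from collections import deque


def _bfs(adj, s):
    """Hop distances from s and the number of shortest paths to each node."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma


def node_statistics(edges):
    """Out-degree, in-degree, and betweenness for every node of a slice."""
    adj = {}
    for u, v in set(edges):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set())
    in_deg = {n: 0 for n in adj}
    for u in adj:
        for w in adj[u]:
            in_deg[w] += 1
    info = {n: _bfs(adj, n) for n in adj}
    total_paths = 0
    through = {n: 0 for n in adj}
    for s in adj:
        dist_s, sigma_s = info[s]
        for t in dist_s:
            if t == s:
                continue
            total_paths += sigma_s[t]
            for v in adj:
                if v == s or v == t or v not in dist_s:
                    continue
                dist_v, sigma_v = info[v]
                # v lies on a shortest s-t path iff the distances add up.
                if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                    through[v] += sigma_s[v] * sigma_v[t]
    return {
        n: {
            "out_degree": len(adj[n]),
            "in_degree": in_deg[n],
            "betweenness": through[n] / total_paths if total_paths else 0.0,
        }
        for n in adj
    }
```

For larger slices, Brandes' algorithm computes the same path counts far more efficiently.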
"Quantifying the fault index" in Step C2 is done as follows: quantify the criterion for an abnormal statistical feature. In the slice for time interval T_k, let P_r^v be the value of a given statistical feature of node v_i in the normal mode, and let P_f^v be the value of the same feature in the fault mode during the same interval. A fault index α_2 is defined from P_r^v and P_f^v (its defining formula is missing from this text). When α_2 ≥ k_2, that is, when the fault index is not less than a preset constant threshold k_2, the node is deemed abnormal in that interval.
"Extracting the faulty nodes and outputting the fault information" in Step C3 is done as follows: compute in turn the fault index of every node in all dynamic coupled network slices of the fault period, and extract from each slice the nodes whose index exceeds the preset threshold. The set of extracted nodes is the set of faulty nodes. Compile a fault report that includes statistics such as how often each faulty node appears.
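Steps C2 and C3 can be sketched together: compute a per-node fault index in every fault-period slice, keep the nodes at or above the threshold, and count how often each appears. Because the formula for α_2 is missing from the text, the relative deviation used here is an assumption, as are the input layout (one `{node: value}` dict per slice and mode) and the function names.

```python
from collections import Counter


def node_fault_index(p_normal, p_fault, eps=1e-12):
    """ASSUMED node fault index: relative deviation (formula not in the text)."""
    return abs(p_fault - p_normal) / max(abs(p_normal), eps)


def extract_fault_nodes(normal_slices, fault_slices, k2):
    """Count, per node, the fault-period slices in which alpha_2 >= k2.

    Each element of normal_slices / fault_slices maps node -> feature value
    for one time slice; a node absent from a slice is treated as 0.0.
    """
    frequency = Counter()
    for normal, fault in zip(normal_slices, fault_slices):
        for node in set(normal) | set(fault):
            index = node_fault_index(normal.get(node, 0.0), fault.get(node, 0.0))
            if index >= k2:
                frequency[node] += 1
    # Most frequently abnormal nodes first.
    return frequency.most_common()
```

The returned list doubles as a simple fault report: node identifiers paired with the number of slices in which each was flagged.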
"Verifying by simulation" in Step C4 is done as follows: feed the faulty nodes extracted in Step C3 into the dynamic multilayer coupled network model of the complex system and, in order of fault frequency, carry out a deliberate attack on the network model based on percolation theory. Observe whether the effect of the attack on the network model matches the actual fault state, and output a fault risk assessment based on the simulation results so that system designers can strengthen the system's weak nodes.
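A common percolation-style check for Step C4 is to remove the suspected nodes one by one, in order of fault frequency, and track the relative size of the largest connected component. The text does not say which order parameter the patent uses, so the largest weakly connected component and the function names below are assumptions made for illustration.

```python
def largest_component_fraction(nodes, edges, removed):
    """Share of the original nodes in the largest weakly connected component."""
    if not nodes:
        return 0.0
    alive = set(nodes) - set(removed)
    adj = {n: set() for n in alive}
    for u, v in edges:
        if u in alive and v in alive:
            adj[u].add(v)
            adj[v].add(u)  # edge direction is ignored for connectivity
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best / len(nodes)


def targeted_attack(nodes, edges, fault_nodes_by_frequency):
    """Remove suspected nodes in order and record the surviving fraction."""
    removed, curve = [], []
    for node in fault_nodes_by_frequency:
        removed.append(node)
        curve.append(largest_component_fraction(nodes, edges, removed))
    return curve
```

A steep drop in the curve after removing the top-ranked nodes suggests that they are structurally critical, which can then be compared with the degradation observed on the real system.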
Through the steps above, the invention provides a fault location method for complex systems based on a dynamic multilayer coupled network. It overcomes the limitations of traditional reliability analysis when locating faults in complex systems that exhibit dynamic propagation, cascading failure, multiple faults, and uncertainty, and it achieves fault location analysis for complex systems on the basis of complex network theory. The method can find faulty nodes quickly, its results are additionally verified by simulation analysis, and it has good practical value.
(3) Advantages and Innovations
The invention has the following innovations:
1. Fast location: the method first analyzes statistical indicators that capture the overall properties of the network in order to locate the fault period of the complex system, and then analyzes the statistical indicators of each node in the dynamic multilayer coupled network slices within that period to locate the faulty nodes precisely. This avoids the slow location of traditional fault location methods, which is caused by computing-power bottlenecks.
2. Easy to port: the method applies to many kinds of complex systems. To apply it to a different type of complex system, only Step A, "construct the dynamic multilayer coupled networks under multiple modes", needs to be adapted to the characteristics of the system under analysis. No large-scale modification is required, the construction method is highly portable, and the workload of designers is reduced.
3. Advanced method: by applying the statistical analysis methods of complex networks, the method addresses the difficulty that traditional reliability analysis has in locating complex faults involving dynamic propagation, cascading failure, and multiple faults, which makes it methodologically advanced.
In summary, this fault location method based on a dynamic multilayer coupled network provides a good solution for fault location of complex systems in engineering applications.
Description of Drawings
Figure 1 is a flow chart of the method of the invention.
Detailed Description
To make the technical problems to be solved and the technical solutions of the invention clearer, a detailed description is given below with reference to the drawing and a specific embodiment. It should be understood that the embodiment described here serves only to illustrate and explain the invention and does not limit it.
The purpose of the invention is to solve the problem that traditional reliability analysis methods, which are usually applied to scenarios with a single fault source and static analysis, struggle to locate complex faults involving dynamic propagation, cascading failure, and multiple faults in complex systems. We propose a fault location method that efficiently handles the complex fault mechanisms of complex systems. The method does not need to compute and compare the feature information of all nodes, so it can locate the faulty nodes of a complex system quickly while using little computing power, and it has good practical value.
The invention is further described below with reference to the drawing and a specific embodiment.
The embodiment illustrates the method by locating a fault in a certain embedded development platform. Specifically, the low-level software layer of the platform contains an unknown driver fault. When a heavy computing load is applied to the platform and the hardware layer reaches a computing-power bottleneck, the abnormal low-level software driver causes the embedded computer system to stall for a long time and fail to complete its assigned tasks. Fault location analysis based on a dynamic multilayer coupled network is therefore carried out on the platform so that system designers can improve its reliability and other performance characteristics.
为了实现上述目的,本发明的方法所采用的技术方案是:一种针对复杂系统故障的复杂故障机理实现高效定位的故障定位方法。其流程如图1所示:In order to achieve the above object, the technical solution adopted by the method of the present invention is: a fault location method for realizing efficient location of complex fault mechanisms of complex system faults. Its process is shown in Figure 1:
Step A: construct the dynamic multilayer coupling network under multiple modes;
Step B: locate the fault period of the complex system;
Step C: locate the fault nodes of the complex system.
The "dynamic multilayer coupling network" in step A is a dynamic network model that abstractly reflects the internal dynamic running state of a complex system. In this embodiment it is a network model that describes the internal running state of the embedded development platform after a fault-excitation test case is executed. The dynamic network model is a set of static networks based on time slices; to describe the system state more faithfully, it is a multilayer coupling network containing nodes in multiple layers, intra-layer edges, and coupling (inter-layer) edges. "Constructing the dynamic multilayer coupling network under multiple modes" includes the following steps:
Step A1: quantify the normal mode and the fault mode of the complex system;
Step A2: construct the dynamic multilayer coupling network of the normal mode;
Step A3: construct the dynamic multilayer coupling network of the fault mode;
The "normal mode" in step A1 is the behavior the embedded development platform exhibits when it runs normally after executing the fault-excitation test case;
The "fault mode" in step A1 is the behavior the embedded development platform exhibits when it runs in a faulty state after executing the fault-excitation test case;
Step A1, "quantifying the normal mode and the fault mode of the complex system", is carried out as follows: quantify the performance indicators that the platform's state-monitoring tool reports for the normal mode and the fault mode, that is, specify the monitoring indicators corresponding to each fault phenomenon and those for normal operation. For the test case applied to this embedded development platform, the fault criterion can be set as follows: if the system's response delay to the test case exceeds 10 s, the system is deemed to have a blocking fault;
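The 10 s criterion from step A1 can be written as a one-line classifier. This is a minimal sketch; the function name `classify_mode` and the mode labels are hypothetical, and only the 10 s threshold comes from the embodiment.

```python
from typing import Literal

# Threshold stated in the embodiment: a response delay above 10 s is a blocking fault.
BLOCKING_DELAY_S = 10.0


def classify_mode(response_delay_s: float) -> Literal["normal", "fault"]:
    """Label one test-case run by its measured response delay in seconds."""
    return "fault" if response_delay_s > BLOCKING_DELAY_S else "normal"
```

A run whose delay is exactly 10 s stays in the normal mode here, because the text says the delay must exceed 10 s.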
Step A2, "constructing the dynamic multilayer coupling network of the normal mode", is carried out as follows: select a suitable tool for sampling dynamic running data according to the characteristics of the embedded development platform; execute the fault-excitation test case and collect online the system's dynamic running data while it runs normally under that test case; then, based on complex network theory, extract the node interaction relationships contained in the running data and construct offline a dynamic multilayer coupling network that maps the normal-mode operating mechanism of the platform;
Step A3, "constructing the dynamic multilayer coupling network of the fault mode", is carried out as follows: select a suitable tool for sampling dynamic running data according to the characteristics of the embedded development platform; execute the fault-excitation test case and collect online the system's dynamic running data while it runs in a faulty state under that test case; then, based on complex network theory, extract the node interaction relationships contained in the running data and construct offline a dynamic multilayer coupling network that maps the fault-mode operating mechanism of the platform;
The "dynamic multilayer coupling network" in steps A2 and A3 is a time-ordered set of static multilayer coupling networks, each corresponding to one time slice of the dynamic running data. After segmentation the dynamic running data comprises 2210 slices, so the dynamic multilayer coupling network contains 2210 static multilayer coupling networks. Because call relationships between nodes, especially between nodes across coupled layers, are directional, the edges of the dynamic coupling network are directed and the network is a directed network.
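One possible in-memory representation of such a network is sketched below. The record format `(timestamp, src, dst)`, the `(layer, name)` node encoding, and the names `Slice` and `build_dynamic_network` are assumptions made for illustration; the patent does not prescribe a data structure or a sampling tool.

```python
from dataclasses import dataclass, field

Node = tuple[str, str]  # assumed encoding: (layer, name), e.g. ("software", "driver")


@dataclass
class Slice:
    """One static multilayer coupling network for a single time slice."""
    t: int
    edges: set[tuple[Node, Node]] = field(default_factory=set)  # directed edges

    def add_edge(self, src: Node, dst: Node) -> None:
        self.edges.add((src, dst))

    def nodes(self) -> set[Node]:
        return {n for edge in self.edges for n in edge}

    def coupling_edges(self) -> set[tuple[Node, Node]]:
        """Edges whose endpoints lie in different layers."""
        return {(u, v) for u, v in self.edges if u[0] != v[0]}


def build_dynamic_network(records, slice_len: float) -> list[Slice]:
    """Group (timestamp, src, dst) call records into time-ordered slices."""
    slices: dict[int, Slice] = {}
    for ts, src, dst in records:
        k = int(ts // slice_len)
        slices.setdefault(k, Slice(t=k)).add_edge(src, dst)
    return [slices[k] for k in sorted(slices)]
```

The same builder would be run once on the normal-mode recording and once on the fault-mode recording, giving two time-ordered lists of slices.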
Step B, "locating the fault period of the complex system", means: compare the statistical features of the normal-mode and fault-mode dynamic multilayer coupling networks built in step A to find the period in which the fault occurs, and extract the fault-mode and normal-mode dynamic multilayer coupling network slices for that period, which supports the fault node location in the subsequent step C. It includes the following steps:
Step B1: calibrate the dynamic multilayer coupling networks of the different modes;
Step B2: compute the network statistical features of the different modes;
Step B3: quantify the fault index;
Step B4: extract the dynamic multilayer coupling network slices of the fault period;
Step B1, "calibrating the dynamic multilayer coupling networks of the different modes", is carried out as follows: take the executed fault-excitation test case as the reference and set the moment at which it starts executing as time 0; align the time axes of the dynamic multilayer coupling networks of the different modes; and discard the network data from before the test case starts and after it ends, keeping only the dynamic multilayer coupling networks related to executing the test case. After this processing, 1950 of the original 2210 slices are retained;
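The calibration in step B1 amounts to shifting timestamps and trimming. A minimal sketch, assuming each slice is stored as a `(timestamp, network)` pair and that the test case's start and end times are known; `calibrate` is a hypothetical helper name.

```python
def calibrate(slices, start, end):
    """Keep slices recorded while the test case ran and shift time so start == 0.

    slices: list of (timestamp, network) pairs in time order.
    start, end: timestamps at which the fault-excitation test case began and ended.
    """
    return [(t - start, net) for t, net in slices if start <= t <= end]
```

Applying the same function to both modes puts the normal-mode and fault-mode slices on a common time axis, so that slice k in one list can be compared with slice k in the other.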
The "network statistical features" in step B2 are a set of statistical indicators that reflect the overall properties of the dynamic multilayer coupling network at each time interval. Common indicators include network size, average degree, and average distance. In this embodiment, three network statistical features are computed for the dynamic multilayer coupling network of the embedded development platform: network size, average degree, and average distance;
Step B2, "computing the network statistical features of the different modes", is carried out as follows: for the normal mode and the fault mode separately, compute the combination of statistical features (network size, average degree, average distance, and so on) of the static multilayer coupling network at each time interval of the dynamic multilayer coupling network, and save the results as a time series;
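The three features can be computed for one directed slice with a breadth-first search from every node. The definitions used here are assumptions, since the patent does not spell them out: network size is the node count, average degree is total (in plus out) degree per node, and average distance is the mean shortest-path length over ordered pairs that are reachable.

```python
from collections import deque


def network_stats(edges):
    """Size, average degree and average distance of one directed slice.

    edges: set of (src, dst) pairs.
    """
    nodes = {n for edge in edges for n in edge}
    adj = {n: [] for n in nodes}
    for u, v in edges:
        adj[u].append(v)

    n = len(nodes)
    avg_degree = 2 * len(edges) / n if n else 0.0

    total, pairs = 0, 0
    for s in nodes:
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from s
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1  # reachable nodes other than s
    avg_distance = total / pairs if pairs else 0.0

    return {"size": n, "avg_degree": avg_degree, "avg_distance": avg_distance}
```

Calling this on every slice of each mode yields the two feature time series that step B3 compares.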
Step B3, "quantifying the fault index", is carried out as follows: quantify the criterion for an abnormal statistical feature. Let the value of a given statistical feature in the normal mode at time interval T_k be P_r, and let the value of the same feature in the fault mode at the same time interval be P_f. Define a fault index α_1 from P_r and P_f. When α_1 ≥ k_1, that is, when the fault index is not less than a preset constant threshold, a fault is deemed to have occurred within this time interval. In this embodiment the constant is set to k_1 = 0.3;
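The expression for the fault index is not reproduced in this text, so the sketch below assumes a relative deviation, |P_f - P_r| / |P_r|, purely for illustration. Only the threshold k_1 = 0.3 and the comparison α_1 ≥ k_1 come from the embodiment; the function names are hypothetical.

```python
K1 = 0.3  # threshold used in the embodiment


def fault_index(p_r: float, p_f: float) -> float:
    """ASSUMED form of the fault index: relative deviation of the fault-mode
    value p_f from the normal-mode value p_r."""
    if p_r == 0:
        return 0.0 if p_f == 0 else float("inf")
    return abs(p_f - p_r) / abs(p_r)


def interval_is_faulty(p_r: float, p_f: float, k1: float = K1) -> bool:
    """A time interval is flagged when the fault index reaches the threshold."""
    return fault_index(p_r, p_f) >= k1
```

With this assumed form, an average degree that moves from 10 in the normal mode to 14 in the fault mode gives an index of 0.4 and flags the interval.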
Step B4, "extracting the dynamic multilayer coupling network slices of the fault period", is carried out as follows: compare, interval by interval, the statistical feature combinations of the normal-mode and fault-mode dynamic multilayer coupling network slices; compute the fault index for each time interval; find the time segments in which the statistical features of the fault mode are abnormal relative to the normal mode; and extract the dynamic multilayer coupling network slices for those segments. In this embodiment each of the two modes has 1950 calibrated slices. Step B4 finds that 42 slices in the fault mode meet the extraction criterion, so these 42 fault-mode slices are extracted together with the 42 normal-mode slices at the corresponding times, providing data support for the fine-grained fault location in the subsequent step C.
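Once a fault index has been computed for every feature at every interval, the extraction in step B4 is a filter over the time axis. The sketch assumes the indices are already available as one dictionary per interval, and that an interval is abnormal when any feature reaches the threshold; the patent does not say whether one feature or all features must exceed it.

```python
def extract_fault_intervals(alpha_series, k1=0.3):
    """Return indices of intervals where at least one feature's fault index >= k1.

    alpha_series[t] maps a feature name to its fault index at interval t.
    """
    return [
        t for t, alphas in enumerate(alpha_series)
        if any(a >= k1 for a in alphas.values())
    ]


def pair_slices(normal_slices, fault_slices, intervals):
    """Pick the time-matched normal-mode and fault-mode slices for step C."""
    return [(normal_slices[t], fault_slices[t]) for t in intervals]
```

In the embodiment this filter would reduce 1950 calibrated intervals to the 42 that are passed on to step C.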
Step C, "locating the fault nodes of the complex system", means: analyze the fault-period dynamic multilayer coupling network slices saved in step B4 to find the nodes where the fault occurs. This achieves fine-grained fault location for the embedded development platform and helps system designers improve the system structure and the reliability of the complex system. It includes the following steps:
Step C1: compute the node statistical features for the fault period;
Step C2: quantify the fault index;
Step C3: extract the fault nodes and output the fault information;
Step C4: simulation verification;
The "node statistical features" in step C1 are a set of statistical indicators that reflect the characteristics of each node of the dynamic multilayer coupling network during the fault period. Common indicators include degree and betweenness. In this embodiment, these two node statistical features, degree and betweenness, are computed for the dynamic multilayer coupling network slices of the embedded development platform;
Step C1, "computing the node statistical features for the fault period", is carried out as follows: for each fault-mode and normal-mode dynamic multilayer coupling network slice in the fault period, compute in turn the combination of node degree and node betweenness, and save the distribution of these indicators for each slice;
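Degree and betweenness for one directed slice can be computed without external libraries using Brandes' algorithm. The patent names the two features but not an algorithm, so the choice of Brandes' method, the use of total (in plus out) degree, and the unnormalized betweenness are assumptions of this sketch.

```python
from collections import deque


def degree_and_betweenness(edges):
    """Per-node total degree and unnormalized betweenness of a directed slice.

    edges: set of (src, dst) pairs. Betweenness follows Brandes' algorithm
    for unweighted graphs.
    """
    nodes = sorted({n for edge in edges for n in edge})
    adj = {n: [] for n in nodes}
    degree = {n: 0 for n in nodes}
    for u, v in edges:
        adj[u].append(v)
        degree[u] += 1  # out-degree contribution
        degree[v] += 1  # in-degree contribution

    betweenness = {n: 0.0 for n in nodes}
    for s in nodes:
        order = []                         # nodes in order of non-decreasing distance
        preds = {n: [] for n in nodes}     # predecessors on shortest paths from s
        sigma = {n: 0 for n in nodes}      # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {n: 0.0 for n in nodes}    # dependency of s on each node
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                betweenness[w] += delta[w]
    return degree, betweenness
```

For the chain a → b → c, node b lies on the only shortest path from a to c, so its betweenness is 1 and the end nodes score 0.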
Step C2, "quantifying the fault index", is carried out as follows: quantify the criterion for an abnormal statistical feature. In the slice at time interval T_k, let the value of a given statistical feature of node v_i in the normal mode be P_r^v, and let the value of the same statistical feature in the fault mode at the same time interval be P_f^v. Define a fault index α_2 from P_r^v and P_f^v. When α_2 ≥ k_2, that is, when the fault index is not less than a preset constant threshold, the node is deemed abnormal within this time interval. In this embodiment the constant is set to k_2 = 0.25;
Step C3, "extracting the fault nodes and outputting the fault information", is carried out as follows: compute, for each node in turn, the fault index in every dynamic coupling network slice of the fault period; extract from each slice the nodes that exceed the set threshold; the set of extracted nodes is the set of fault nodes; and compile a fault report including statistics such as how often each fault node appears;
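Steps C2 and C3 can be sketched together as a scan over the time-matched slices that counts how often each node is flagged. As with the slice-level index, the node-level fault index formula is not reproduced in this text, so a relative deviation is assumed for illustration; only k_2 = 0.25 comes from the embodiment. Treating a node that is absent from one mode as having feature value 0 is a further assumption.

```python
from collections import Counter

K2 = 0.25  # threshold used in the embodiment


def node_fault_index(p_r: float, p_f: float) -> float:
    """ASSUMED form: relative deviation of the fault-mode value from the normal-mode value."""
    if p_r == 0:
        return 0.0 if p_f == 0 else float("inf")
    return abs(p_f - p_r) / abs(p_r)


def fault_node_report(normal_slices, fault_slices, k2: float = K2):
    """Count, per node, the number of fault-period slices in which it is abnormal.

    Each slice is a dict mapping node -> feature value (e.g. degree or betweenness).
    Returns (node, frequency) pairs sorted by descending frequency.
    """
    freq = Counter()
    for normal, fault in zip(normal_slices, fault_slices):
        for node in normal.keys() | fault.keys():
            alpha = node_fault_index(normal.get(node, 0.0), fault.get(node, 0.0))
            if alpha >= k2:
                freq[node] += 1
    return freq.most_common()
```

The resulting frequency ranking is the ordering that the simulation in step C4 uses.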
Step C4, "simulation verification", is carried out as follows: feed the fault nodes extracted in step C3 into the dynamic multilayer coupling network model of the complex system; in order of fault frequency, carry out a deliberate (targeted) attack on the network model based on percolation theory; observe how well the failure behavior of the attacked network model fits the actual state of the system; and output a fault risk assessment based on the simulation results.
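A common way to run a percolation-style targeted attack is to remove nodes in the given order and track the relative size of the largest connected component. The patent does not specify the observable, so using the largest weakly connected component here is an assumption, and `targeted_attack` is a hypothetical helper name.

```python
def largest_component(nodes, edges):
    """Size of the largest weakly connected component among the surviving nodes."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        if u in adj and v in adj:  # ignore edges touching removed nodes
            adj[u].add(v)
            adj[v].add(u)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first traversal of one component
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best


def targeted_attack(edges, attack_order):
    """Remove nodes in attack_order; return the largest-component fraction after each removal."""
    nodes = {n for edge in edges for n in edge}
    n0 = len(nodes)
    if n0 == 0:
        return []
    curve = []
    for victim in attack_order:
        nodes.discard(victim)
        curve.append(largest_component(nodes, edges) / n0)
    return curve
```

Passing the node names from the step C3 frequency report as `attack_order` shows how quickly the network fragments when the most frequently flagged nodes fail first; a steep drop would suggest those nodes are structurally critical.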
The parts of the present invention that are not described in detail belong to technology well known in the art.
The above is only part of the specific embodiments of the present invention, but the scope of protection is not limited to them. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention.
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910405979.3A CN110213087B (en) | 2019-05-16 | 2019-05-16 | Complex system fault positioning method based on dynamic multilayer coupling network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110213087A | 2019-09-06 |
CN110213087B | 2020-08-25 |
Family
ID=67787380
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910405979.3A Active CN110213087B (en) | 2019-05-16 | 2019-05-16 | Complex system fault positioning method based on dynamic multilayer coupling network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110213087B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111092752B (en) * | 2019-11-27 | 2021-03-16 | 中盈优创资讯科技有限公司 | Fault positioning method and device spanning multiple network slices |
CN110991673B (en) * | 2019-12-09 | 2023-10-31 | 中国航空工业集团公司上海航空测控技术研究所 | Fault isolation and localization method for complex systems |
CN113259153B (en) * | 2021-04-20 | 2022-01-28 | 北京航空航天大学 | Multilayer coupling network robustness control method based on dynamic coupling node degree deviation |
CN118655781B (en) * | 2024-08-14 | 2024-11-15 | 山东济矿鲁能煤电股份有限公司阳城煤矿 | Logic operation verification system for intelligent lubrication system of coal mining equipment |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106371422A (en) * | 2016-08-31 | 2017-02-01 | 北京航空航天大学 | Method for predicting key infrastructure fault propagation |
CN107769962A (en) * | 2017-09-19 | 2018-03-06 | 贵州电网有限责任公司 | A kind of communication network failure cascade venture influence analysis method of attack resistance |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101446827B (en) * | 2008-11-06 | 2011-06-22 | 西安交通大学 | Process fault analysis device and method for a process industry system |
CN104298593B (en) * | 2014-09-23 | 2017-04-26 | 北京航空航天大学 | SOA system reliability evaluation method based on complex network theory |
US20180262525A1 (en) * | 2017-03-09 | 2018-09-13 | General Electric Company | Multi-modal, multi-disciplinary feature discovery to detect cyber threats in electric power grid |
CN107703920B (en) * | 2017-10-25 | 2019-12-17 | 北京交通大学 | Fault detection method of train braking system based on multivariate time series |
CN108039987B (en) * | 2017-12-19 | 2020-09-22 | 北京航空航天大学 | Key infrastructure vulnerability assessment method based on multilayer coupling relation network |
CN108768745B (en) * | 2018-06-14 | 2021-08-20 | 北京航空航天大学 | A Cluster System Fragility Evaluation Method Based on Complex Networks |
CN109669866B (en) * | 2018-12-10 | 2021-04-30 | 北京航空航天大学 | Method for acquiring fault propagation path during software operation |
Also Published As
Publication number | Publication date |
---|---|
CN110213087A (en) | 2019-09-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110213087B (en) | Complex system fault positioning method based on dynamic multilayer coupling network | |
CN111209131B (en) | Method and system for determining faults of heterogeneous system based on machine learning | |
CN113391943B (en) | A method and device for locating the root cause of microservice faults based on causal inference | |
CN116450399B (en) | Microservice system fault diagnosis and root cause location method | |
CN105550100A (en) | Method and system for automatic fault recovery of information system | |
CN105471647B (en) | A kind of power communication network fault positioning method | |
CN114896166B (en) | Scene library construction method, device, electronic device and storage medium | |
CN105678337B (en) | Information fusion method in intelligent substation fault diagnosis | |
CN109840371B (en) | A Time Series-Based Dynamic Multilayer Coupling Network Construction Method | |
CN114692875B (en) | Construction method of GIS knowledge graph for fault diagnosis | |
CN115309575A (en) | Micro-service fault diagnosis method, device and equipment based on graph convolution neural network | |
CN112379325A (en) | Fault diagnosis method and system for intelligent electric meter | |
CN111884859B (en) | Network fault diagnosis method and device and readable storage medium | |
CN115150255A (en) | Self-adaptive knowledge graph-based automatic root cause positioning method for application faults | |
CN104879295A (en) | Large complex system fault diagnosis method based on multilevel flow model and minimal cutset of fault tree | |
CN113093695A (en) | Data-driven SDN controller fault diagnosis system | |
US11983472B2 (en) | Method for identifying fragile lines in power grids based on electrical betweenness | |
CN112803587A (en) | Intelligent inspection method for state of automatic equipment based on diagnosis decision library | |
Zhang et al. | Root cause analysis of concurrent alarms based on random walk over anomaly propagation graph | |
CN117034149A (en) | Fault processing strategy determining method and device, electronic equipment and storage medium | |
CN107566193A (en) | Fuzzy fault Petri network and its network fault diagnosis method | |
Zhu et al. | CPU and network traffic anomaly detection method for cloud data center | |
CN107103134A (en) | Low-speed wireless sensor network testability analysis method based on Bayesian network | |
CN109558258B (en) | Method and device for positioning root fault of distributed system | |
CN117439899B (en) | Communication machine room inspection method and system based on big data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||