WO2024093533A1 - Method for judging hydropower signal anomalies based on frequent item set reasoning
- Publication number
- WO2024093533A1 (application PCT/CN2023/118155)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- signal
- data
- relationship
- action
- frequent item
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Definitions
- the invention relates to the technical field of hydropower monitoring, and in particular to a method for judging hydropower signal anomaly based on frequent item set reasoning, which is used for detecting anomalies of unit operation signals in a hydropower signal monitoring system.
- Signal monitoring of the hydropower signal monitoring system is the core work of operation and duty, and reliable and accurate signal monitoring is an important guarantee for the safe operation of the cascade power stations in the basin.
- most hydropower stations have fully realized monitoring and control with the "four remote" functions as the core.
- the hydropower stations transmit a large amount of telemetry, telesignaling and other information of each controlled hydropower station to the control end (central control or central control room) in real time, and are monitored manually 24 hours a day and take corresponding processing measures according to the level and content of the information.
- the research on big data technology for smart grids has received full attention from the power industry and related experts and researchers.
- the research on big data engineering application technology for hydropower stations is in its infancy.
- the intelligent monitoring of hydropower monitoring system signals faces the following challenges: (1) The amount of monitoring information is large, and the manual monitoring work is heavy. (2) The implicit knowledge of the signal is complex and difficult to analyze. (3) The alarm rate of invalid dynamic monitoring signals is high.
- the present invention proposes a system anomaly judgment method based on signal relationship, which uses the frequent pattern mining algorithm apriori and the sequence frequent pattern mining algorithm prefixSpan to extract the three relationships between signals: symbiosis, association, and causality. Then, based on the mined signal relationship, the real-time system log signal is detected online to determine the interval where the abnormal signal may exist.
- the purpose of the present invention is to provide a method for judging anomalies of hydropower signals based on frequent item set reasoning, so as to achieve the effect of detecting anomalies of unit operation signals in a hydropower signal monitoring system.
- the present invention is implemented by the following technical solution: a method for judging abnormality of hydropower signals based on frequent item set reasoning, comprising the following steps:
- Step S1 obtaining original data, extracting signal data and action data from the original data and performing preprocessing, and constructing a signal-action pair combining the signal data and the action data;
- Step S2 performing relationship mining on the signal-action pairs based on the apriori algorithm and the prefixSpan algorithm to obtain an associated relationship set, a symbiotic relationship set, and a causal relationship set;
- Step S3 detecting the signal stream composed of the newly arrived continuous signals according to the associated relationship set, the symbiotic relationship set and the causal relationship set, determining whether there is an anomaly in the signal stream, and marking the signal interval with the anomaly.
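- In step S1, the process of extracting signal data and action data from the original data and preprocessing them includes: preprocessing the original data; checking whether any data remain unprocessed and, if so, first removing the numeric data from the action data and then merging the signal logical name from the signal data with the digit-free action data to form a signal-logical-name-action pair; and adding that pair to the preprocessed data, thereby constructing the signal-action pairs that combine signal data and action data.
- The original data in step S1 includes a log file sample recorded while the monitoring system of the hydropower signal monitoring system is running.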
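- Step S2 includes: defining the signal interval and the three signal relationships (causal, associated and symbiotic); extracting M consecutive records from the preprocessed data, each record being a signal-logical-name-action pair; dividing the M records into signal intervals and mining the divided data with the apriori and prefixSpan algorithms to obtain the second-order frequent item sets Lp and La; and filtering the mined second-order frequent item sets according to the support threshold and confidence threshold of the corresponding relationship to obtain the final associated relationship set, symbiotic relationship set and causal relationship set.
- The apriori algorithm is used to mine the associated and symbiotic relationships of sequences: the second-order frequent item set La is filtered according to the support and confidence of the associated relationship to obtain the associated relationship set, and according to the support and confidence of the symbiotic relationship to obtain the symbiotic relationship set.
- The prefixSpan algorithm is used to mine the causal relationships of sequences: the second-order frequent item set Lp is filtered according to the support and confidence of the causal relationship to obtain the causal relationship set.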
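- Step S3 includes:
- Anomaly judgment is performed on the newly received continuous signals.
- N denotes the number of records that currently need to be processed. For relationships mined from data divided with signal interval size a and step size b, the same signal interval size and step size are used to process these N records.
- Anomaly judgment is performed on each signal interval, and three values are kept for each signal interval: the symbiotic relationship anomaly count, the associated relationship anomaly count and the causal relationship anomaly count. All three are initially set to 0.
- For each signal interval, if a signal relationship is violated in that interval, the anomaly count of the corresponding relationship type is increased by 1.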
- the present invention has the following advantages and beneficial effects:
- The present invention defines signal relationships from the perspective of frequent patterns and proposes a relationship mining method based on the apriori and prefixSpan algorithms to obtain the signal pairs corresponding to the symbiotic, associated and causal relationships. Then, based on the mined signal relationships, incoming log signals are automatically checked for anomalies, which improves the efficiency of anomaly detection.
- FIG1 is a flow chart of a method for determining anomalies of hydropower signals based on frequent item set reasoning provided by the present invention.
- FIG2 is a flow chart of data preprocessing in a method for determining anomalies of hydropower signals based on frequent item set reasoning provided by the present invention.
- FIG3 is a flowchart of relationship mining in a method for determining anomalies of hydropower signals based on frequent item set reasoning provided by the present invention.
- FIG4 is a schematic diagram of a sample log file recorded during operation of the hydropower signal monitoring system provided by the present invention.
- FIG5 is a flowchart of the upper part of anomaly detection in a method for determining anomaly of hydropower signals based on frequent item set reasoning provided by the present invention.
- Embodiment 1:
- The method for judging hydropower signal anomalies based on frequent item set reasoning in this embodiment is shown in FIG1.
- the overall idea of the present invention is as follows:
- First, data preprocessing is performed to extract "signal" data and "action" data from the original data D and to construct signal-action pairs, that is, "signal + action" pairs.
- relationship mining is performed. Based on the preprocessed "signal + action" pairs, three types of relationships, namely symbiosis, association, and causality, are mined based on the apriori algorithm and prefixSpan algorithm.
- anomaly judgment is performed. Based on the three signal relationships obtained by mining, the arriving signal stream is detected to determine whether there are anomalies in the signal stream, and the signal interval (window) with anomalies is marked.
- Embodiment 2:
- This embodiment is further optimized on the basis of embodiment 1.
- a log file sample recorded during system operation in the hydropower signal monitoring system is given, that is, the original data.
- the first column is the signal sequence
- the second column is the time when the signal is sent
- the third column and the fourth column are the logical name and Chinese name of the signal respectively
- the fifth column is the current action of the signal
- the sixth column is whether the signal is successfully issued
- the seventh column is which device the row of data is collected from, without actual business association.
- the signal is generated by each device in the hydropower signal monitoring system and is the device parameter that the monitoring personnel are concerned about; the action is the action of the device parameter.
- Each signal has a corresponding action, which indicates what change or operation has occurred on a certain parameter of a certain device. The present invention therefore holds that anomaly monitoring of the hydropower signal monitoring system must consider a signal together with its action. During data preprocessing, the signal and action of each line in the original log record are merged into a "signal action" pair. During signal anomaly detection, the relationships between signals are learned from these "signal action" pairs and then used for anomaly judgment.
- Embodiment 3:
- FIG2 is a schematic diagram of the process of extracting signal data and action data from the original data and preprocessing them. Constructing data pairs that combine signal data and action data and mining the relationships between those pairs in effect means mining the relationships between signal data and action data.
- The main goal of preprocessing is to construct data pairs that combine signal data and action data, that is, signal-action pairs.
- For each line of data, only the signal logical name and the action data are retained. The logical name corresponds one-to-one with the Chinese name, and the logical name is processed faster at run time.
- Actions of the same kind that differ only in their numeric value are treated as the same action, for example "reactive power regulation (MVar): -35.44" and "reactive power regulation (MVar): -27.70", so the digits in the action data are removed.
- Embodiment 4:
- the goal of the present invention is to detect abnormalities in the unit operation signals in the hydropower signal monitoring system and make abnormal judgments based on the logical relationship between the signals.
- Relationship mining is performed on the signal-action pairs to obtain the associated relationship set, the symbiotic relationship set and the causal relationship set.
- three signal relationships need to be defined: causal relationship, associated relationship and symbiotic relationship.
- the relevant definitions are as follows:
- Definition 3 (associated relationship): given s_tn and s_tm, if whenever s_tn ∈ S there exists s_tm ∈ S, then s_tn and s_tm are in an associated relationship, denoted s_tn - s_tm.
- Embodiment 5:
- This embodiment is further optimized on the basis of any one of the above embodiments 1-4. After the relationship is defined, it is necessary to perform relationship mining on the signals in the system operation log.
- the present invention innovatively proposes to realize signal relationship mining through frequent sub-items.
- two classic algorithms are adopted: apriori algorithm and prefixSpan algorithm.
- the process of realizing signal relationship mining in the present invention is shown in FIG3 .
- the present invention gives the three relationship definitions between system signals in the hydropower unit for the first time. Then, the idea of mining signal relationships by mining frequent sub-items is proposed. Finally, the apriori algorithm and prefixSpan algorithm are used to assist the present invention in realizing signal relationship mining. Among them, according to the relationship definition, it is necessary to set the two indicators of support (sup) and confidence (conf).
- the indicator settings of the three relationships are as follows:
- Associated relationship: given two signals A and B, these two signals can form two frequent sub-items, namely [AB] and [BA].
- To determine that A and B are in an associated relationship, that is, B accompanies A, expressed as A-B, the following indicators must be met: support Support(A,B) > 0.001, [AB] confidence conf(A,B) > 0.9 and [BA] confidence conf(B,A) < 0.3.
- By definition, the associated relationship does not depend on the time order of A and B, so the apriori algorithm is used to implement it.
- Confidence and support are calculated in the same way as in the previous definition.
- Symbiotic relationship: given two signals A and B, these two signals can form two frequent sub-items, namely [AB] and [BA]. To determine that A and B are in a symbiotic relationship, that is, A and B accompany each other, the following indicators must be met: Support(A,B) > 0.001, [AB] confidence conf(A,B) > 0.9 and [BA] confidence conf(B,A) > 0.9. By definition, the symbiotic relationship does not depend on the time order of A and B, so the apriori algorithm is used here as well. Confidence and support are calculated in the same way as in the previous definition.
- M is the number of signals used when mining relationships.
- the present invention recommends at least 100,000 to achieve better results.
- a is the size of the divided signal interval (window).
- the present invention has found from experiments that setting 10 is more appropriate, that is, one signal interval (window) contains 10 signals. This setting is related to the interval at which the signal is generated. Setting 10 will probably cover signals of about 1 minute.
- b is the step size of the signal interval (window) sliding. The present invention has found from experiments that setting 2 is more appropriate.
- Embodiment 6:
- the relationship mining in the second step is performed on the continuous M data extracted from the data preprocessed in the first step.
- Anomaly judgment is performed on newly arrived continuous signals, and N represents the number of records that currently need to be processed.
- Three values are kept for each signal interval (window): the number of symbiotic relationship anomalies, the number of associated relationship anomalies and the number of causal relationship anomalies. All three are initially 0. For each signal interval (window), if a signal relationship is violated in that window, the anomaly count of the corresponding relationship type is increased by 1.
- Wi is a window on which anomaly detection has not yet been performed.
- Wwa-sb is the data used for relationship mining, divided with window size a and step size b.
- Wdwa-sb is the data used for anomaly detection, divided with window size a and step size b.
- Parameters a and b are the signal interval (window) size and the sliding step size, set to the same values as during mining.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Testing Or Calibration Of Command Recording Devices (AREA)
- Debugging And Monitoring (AREA)
Abstract
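The present invention relates to the technical field of hydropower monitoring and discloses a method for judging hydropower signal anomalies based on frequent item set reasoning. The method comprises: obtaining original data, extracting signal data and action data from the original data and preprocessing them, and constructing signal-action pairs that combine the signal data and the action data; performing relationship mining on the signal-action pairs based on the apriori algorithm and the prefixSpan algorithm to obtain an associated relationship set, a symbiotic relationship set and a causal relationship set; and detecting the signal stream composed of newly arrived continuous signals according to these three sets, determining whether the signal stream contains an anomaly, and marking the signal intervals in which an anomaly exists. The invention defines signal relationships from the perspective of frequent patterns, proposes a relationship mining method based on the apriori and prefixSpan algorithms to obtain the signal pairs corresponding to the symbiotic, associated and causal relationships, and automatically judges anomalies in incoming log signals based on these relationships, which improves the efficiency of anomaly detection.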
Description
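The present invention relates to the technical field of hydropower monitoring, and in particular to a method for judging hydropower signal anomalies based on frequent item set reasoning, which is used to detect anomalies in the unit operation signals of a hydropower signal monitoring system.
Signal monitoring in the hydropower signal monitoring system is the core task of operation and duty staff, and reliable, accurate signal monitoring is an important guarantee for the safe operation of the cascade power stations in a river basin. At present, most hydropower stations have fully implemented monitoring and control centered on the "four remote" functions. The stations transmit a large amount of telemetry, telesignaling and other information from each controlled hydropower station to the control end (centralized control center or central control room) in real time, where it is monitored manually 24 hours a day and handled according to the level and content of the information.
Research on big data technology for smart grids has received considerable attention from the power industry and from experts and scholars. Research on the engineering application of big data technology to hydropower stations, however, is still in its infancy. Under the existing mode, intelligent monitoring of hydropower monitoring system signals faces the following challenges: (1) the volume of monitoring information is large and the manual monitoring workload is heavy; (2) the implicit knowledge in the signals is complex and difficult to analyze; (3) the alarm rate of invalid dynamic monitoring signals is high.
To solve these problems, the present invention proposes a system anomaly judgment method based on signal relationships. It uses the frequent pattern mining algorithm apriori and the sequential frequent pattern mining algorithm prefixSpan to extract three kinds of relationships between signals: symbiotic, associated and causal. Then, based on the mined signal relationships, real-time system log signals are checked online to determine the intervals in which abnormal signals may exist.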
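Summary of the Invention
The purpose of the present invention is to provide a method for judging hydropower signal anomalies based on frequent item set reasoning, so as to detect anomalies in the unit operation signals of a hydropower signal monitoring system.
The present invention is implemented by the following technical solution: a method for judging hydropower signal anomalies based on frequent item set reasoning, comprising the following steps:
Step S1: obtaining original data, extracting signal data and action data from the original data and preprocessing them, and constructing signal-action pairs that combine the signal data and the action data;
Step S2: performing relationship mining on the signal-action pairs based on the apriori algorithm and the prefixSpan algorithm to obtain an associated relationship set, a symbiotic relationship set and a causal relationship set;
Step S3: detecting the signal stream composed of newly arrived continuous signals according to the associated relationship set, the symbiotic relationship set and the causal relationship set, determining whether the signal stream contains an anomaly, and marking the signal intervals in which an anomaly exists.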
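To better implement the present invention, further, the process in step S1 of extracting signal data and action data from the original data and preprocessing them includes:
preprocessing the original data;
determining whether any data have not yet been preprocessed and, if so, first removing the numeric data from the action data, then merging the signal logical name data from the signal data with the digit-free action data to form a signal-logical-name-action pair;
adding the signal-logical-name-action pair to the preprocessed data, thereby constructing the signal-action pairs that combine signal data and action data.
To better implement the present invention, further, the original data in step S1 includes a log file sample recorded while the monitoring system of the hydropower signal monitoring system is running.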
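To better implement the present invention, further, step S2 includes:
defining the signal interval and the three signal relationships: causal, associated and symbiotic;
extracting M consecutive records from the preprocessed data, the M records being M signal-logical-name-action pairs;
dividing the M records into signal intervals, and mining the divided data with the apriori algorithm and the prefixSpan algorithm to obtain the second-order frequent item set Lp and the second-order frequent item set La;
filtering the mined second-order frequent item sets according to the support threshold and confidence threshold of the corresponding relationship to obtain the final associated relationship set, symbiotic relationship set and causal relationship set.
To better implement the present invention, further, the apriori algorithm is used to mine the associated and symbiotic relationships of sequences;
the second-order frequent item set La is filtered according to the support and confidence of the associated relationship to obtain the associated relationship set;
the second-order frequent item set La is filtered according to the support and confidence of the symbiotic relationship to obtain the symbiotic relationship set.
To better implement the present invention, further, the prefixSpan algorithm is used to mine the causal relationships of sequences;
the second-order frequent item set Lp is filtered according to the support and confidence of the causal relationship to obtain the causal relationship set.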
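To better implement the present invention, further, step S3 includes:
performing anomaly judgment on the newly received continuous signals;
letting N denote the number of records that currently need to be processed and, for relationships mined from data divided with signal interval size a and step size b, processing these N records with the same signal interval size and step size;
performing anomaly judgment on each signal interval, with three values kept for each signal interval: the symbiotic relationship anomaly count, the associated relationship anomaly count and the causal relationship anomaly count, all initially set to 0;
for each signal interval, if a signal relationship is violated in that interval, increasing the anomaly count of the corresponding relationship type by 1.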
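Compared with the prior art, the present invention has the following advantages and beneficial effects:
The present invention defines signal relationships from the perspective of frequent patterns and proposes a relationship mining method based on the apriori and prefixSpan algorithms to obtain the signal pairs corresponding to the symbiotic, associated and causal relationships. Then, based on the mined signal relationships, incoming log signals are automatically checked for anomalies, which improves the efficiency of anomaly detection.
The present invention is further described below with reference to the drawings and embodiments. All inventive concepts of the present invention shall be regarded as part of the disclosure and of the scope of protection.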
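FIG. 1 is a flowchart of the method for judging hydropower signal anomalies based on frequent item set reasoning provided by the present invention.
FIG. 2 is a flowchart of data preprocessing in the method.
FIG. 3 is a flowchart of relationship mining in the method.
FIG. 4 is a schematic diagram of a sample log file recorded while the hydropower signal monitoring system is running.
FIG. 5 is the upper part of the flowchart of anomaly detection in the method.
To explain the technical solutions of the embodiments more clearly, they are described below clearly and completely with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention and should not be regarded as limiting the scope of protection. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.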
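Embodiment 1:
The method for judging hydropower signal anomalies based on frequent item set reasoning in this embodiment is shown in FIG. 1. The overall idea of the present invention is as follows:
First, data preprocessing is performed: "signal" data and "action" data are extracted from the original data D, and signal-action pairs, that is, "signal + action" pairs, are constructed.
Next, relationship mining is performed: on the preprocessed "signal + action" pairs, the symbiotic, associated and causal relationships are mined with the apriori and prefixSpan algorithms.
Finally, anomaly judgment is performed: based on the three mined signal relationships, the arriving signal stream is checked for anomalies, and the signal intervals (windows) that contain anomalies are marked.
The meanings of the symbols used in the present invention are given in Table 1.
Table 1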
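Embodiment 2:
This embodiment is a further optimization of Embodiment 1. As shown in FIG. 4, a sample log file recorded while the hydropower signal monitoring system is running is given; this is the original data. The first column is the signal sequence number, the second column is the time at which the signal was sent, the third and fourth columns are the logical name and the Chinese name of the signal, the fifth column is the current action of the signal, the sixth column indicates whether the signal was issued successfully, and the seventh column indicates which device the row was collected from and has no business meaning.
In the hydropower signal monitoring system, signals are generated by the individual devices of the system and are the device parameters that monitoring staff care about; an action is the action of such a device parameter. Each signal has a corresponding action, which indicates what change or operation has occurred on a certain parameter of a certain device. The present invention therefore holds that anomaly monitoring of the hydropower signal monitoring system must consider a signal together with its action. During data preprocessing, the signal and action of each line in the original log record are merged into a "signal action" pair. During signal anomaly detection, the relationships between signals are learned from these "signal action" pairs and then used for anomaly judgment.
The other parts of this embodiment are the same as in Embodiment 1 and are not repeated.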
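Embodiment 3:
This embodiment is a further optimization of Embodiment 1 or 2. FIG. 2 is a schematic diagram of the process of extracting signal data and action data from the original data and preprocessing them. Constructing data pairs that combine signal data and action data and mining the relationships between those pairs in effect means mining the relationships between signal data and action data.
Take, as shown in FIG. 4, the relationship between "3F: guide vane opening > 35% (for main shaft air admission) reset" (original: "3F:导叶开度>35%(大轴补气用)复归") and "3F: engage/disengage center-hole air admission solenoid air valve reset" (original: "3F:投入/退出中心孔补气电磁空气阀复归"). In the first signal-action pair, "3F: guide vane opening > 35% (for main shaft air admission)" is the signal data and "reset" is the action data. Likewise, in the second pair, "3F: engage/disengage center-hole air admission solenoid air valve" is the signal data and "reset" is the action data.
The main goal of preprocessing is therefore to construct data pairs that combine signal data and action data, that is, signal-action pairs. For each line of data, only the signal logical name (the logical name corresponds one-to-one with the Chinese name and is processed faster at run time) and the action data are retained. Because actions of the same kind that differ only in their numeric value should be treated as the same action, for example "reactive power regulation (MVar): -35.44" and "reactive power regulation (MVar): -27.70", the digits in the action data are removed. The "signal logical name" and "action" of each line are then merged into a signal-logical-name-action pair, that is, a "signal logical name + action" pair, which yields the preprocessed data Dp.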
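The preprocessing step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `make_pair` and `preprocess`, the `+` separator, the regular expression, and the sample logical name `U3.MVAR` are all assumptions introduced here.

```python
import re

# Assumed pattern: signed integers or decimals inside an action string.
_NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")


def make_pair(logical_name: str, action: str) -> str:
    """Build a 'signal logical name + action' pair with numeric values removed.

    Only the action text is cleaned; the logical name is kept untouched.
    """
    cleaned = _NUMBER.sub("", action).strip()
    return f"{logical_name}+{cleaned}"


def preprocess(rows):
    """Turn (logical_name, action) rows from the raw log into pairs (the data Dp)."""
    return [make_pair(name, action) for name, action in rows]


# Two actions that differ only in their numeric value map to the same pair.
pair_a = make_pair("U3.MVAR", "Reactive power regulation (MVar): -35.44")
pair_b = make_pair("U3.MVAR", "Reactive power regulation (MVar): -27.70")
```

Here `pair_a` and `pair_b` are identical strings, which is the property the text asks for: the same kind of action is counted as one item during mining regardless of the value it carried.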
The other parts of this embodiment are the same as in Embodiment 1 or 2 and are not repeated.
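Embodiment 4:
This embodiment is a further optimization of any one of Embodiments 1-3. The goal of the present invention is to detect anomalies in the unit operation signals of the hydropower signal monitoring system, judging anomalies according to the logical relationships between signals. To mine the signal-action pairs with the apriori and prefixSpan algorithms and obtain the associated, symbiotic and causal relationship sets, three signal relationships must first be defined: causal, associated and symbiotic. The definitions are as follows:
Definition 1 (window): a window is represented by the set S = {s_t1, s_t2, …, s_tn}, where s_i is a signal and t_i is its time, satisfying t1 < t2 < … < tn.
Definition 2 (causal relationship): given s_tn and s_tm, if whenever s_tn ∈ S there exists s_tm ∈ S with tn < tm, then s_tn is the cause of s_tm, denoted s_tn → s_tm.
Definition 3 (associated relationship): given s_tn and s_tm, if whenever s_tn ∈ S there exists s_tm ∈ S, then s_tn and s_tm are in an associated relationship, denoted s_tn - s_tm.
Definition 4 (symbiotic relationship): given s_tn and s_tm, if whenever s_tn ∈ S there exists s_tm ∈ S, and whenever s_tm ∈ S there exists s_tn ∈ S, then s_tn and s_tm are in a symbiotic relationship, denoted as [symbol missing in source].
The other parts of this embodiment are the same as in any one of Embodiments 1-3 and are not repeated.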
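Embodiment 5:
This embodiment is a further optimization of any one of Embodiments 1-4. Once the relationships are defined, relationship mining must be performed on the signals in the system operation log.
Based on the defined signal relationships, the present invention proposes to mine signal relationships through frequent sub-items. Two classic algorithms are used for this purpose: the apriori algorithm and the prefixSpan algorithm. The relationship mining process is shown in FIG. 3.
This part is the core, or the main innovation, of the present invention. First, the invention gives, for the first time, three relationship definitions between system signals in a hydropower unit. It then proposes mining signal relationships by mining frequent sub-items. Finally, the apriori and prefixSpan algorithms are used to carry out the mining. According to the relationship definitions, two indicators must be set: support (sup) and confidence (conf). The indicator settings for the three relationships are as follows:
Causal relationship: given two signals A and B, these two signals can form two frequent sub-items, namely [AB] and [BA]. To determine that A and B are in a causal relationship, that is, A is the cause of B, expressed as A→B, the following indicators must be met: support Support(A,B) > 0.001, [AB] confidence conf(A,B) > 0.9 and [BA] confidence conf(B,A) < 0.3. Because confidence and support must here be computed with respect to the order of A and B, the prefixSpan algorithm is used, where [formulas for support and confidence missing in source].
Associated relationship: given two signals A and B, these two signals can form two frequent sub-items, namely [AB] and [BA]. To determine that A and B are in an associated relationship, that is, B accompanies A, expressed as A-B, the following indicators must be met: support Support(A,B) > 0.001, [AB] confidence conf(A,B) > 0.9 and [BA] confidence conf(B,A) < 0.3. By definition, the associated relationship does not depend on the time order of A and B, so the apriori algorithm is used. Confidence and support are calculated in the same way as in the previous definition.
Symbiotic relationship: given two signals A and B, these two signals can form two frequent sub-items, namely [AB] and [BA]. To determine that A and B are in a symbiotic relationship, that is, A and B accompany each other, the following indicators must be met: support Support(A,B) > 0.001, [AB] confidence conf(A,B) > 0.9 and [BA] confidence conf(B,A) > 0.9. By definition, the symbiotic relationship does not depend on the time order of A and B, so the apriori algorithm is used here as well. Confidence and support are calculated in the same way as in the previous definition.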
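The threshold filtering for the three relationships can be illustrated with a small sketch. It counts pairs directly over windows instead of running full apriori or prefixSpan implementations, which is a simplification for second-order item sets only. Because the source's support and confidence formulas are not reproduced in this text, the sketch assumes the usual association-rule definitions: support is the fraction of windows containing both signals, and conf(A,B) is that count divided by the number of windows containing A. For the causal case it assumes "A occurs somewhere before B in the window". The function name `mine_relations` and its parameters are hypothetical.

```python
from collections import Counter
from itertools import permutations


def mine_relations(windows, min_sup=0.001, hi=0.9, lo=0.3):
    """Return (associated, symbiotic, causal) sets of signal pairs.

    windows: list of windows, each an ordered list of signal-action pairs.
    """
    n = len(windows)
    single = Counter()  # number of windows containing a signal
    both = Counter()    # number of windows containing both signals of a pair
    before = Counter()  # number of windows where a occurs before b
    for w in windows:
        items = set(w)
        single.update(items)
        first, last = {}, {}
        for i, s in enumerate(w):
            first.setdefault(s, i)
            last[s] = i
        for a, b in permutations(items, 2):
            both[(a, b)] += 1
            if first[a] < last[b]:
                before[(a, b)] += 1

    associated, symbiotic, causal = set(), set(), set()
    for (a, b), c in both.items():
        if c / n <= min_sup:
            continue
        conf_ab, conf_ba = c / single[a], c / single[b]
        if conf_ab > hi and conf_ba < lo:
            associated.add((a, b))   # b accompanies a
        if conf_ab > hi and conf_ba > hi and a < b:
            symbiotic.add((a, b))    # unordered pair, stored once
    for (a, b), c in before.items():
        if c / n <= min_sup:
            continue
        # Ordered confidences: a-before-b versus b-before-a.
        if c / single[a] > hi and before[(b, a)] / single[b] < lo:
            causal.add((a, b))       # a is the cause of b
    return associated, symbiotic, causal


demo = [["A", "B"]] * 10 + [["B"]] * 40 + [["C", "D"]] * 5 + [["D", "C"]] * 5
assoc, symb, caus = mine_relations(demo)
```

In the demo data, A always appears with and before B while B mostly appears alone, so (A, B) lands in both the associated and causal sets; C and D always appear together but in either order, so they form a symbiotic pair and not a causal one.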
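In addition, the M records must be divided into signal intervals (windows), which involves three parameters: M, a and b. M is the number of signals used for relationship mining; the present invention recommends at least 100,000 records to achieve good results. a is the size of each signal interval (window); experiments showed that 10 is a suitable value, meaning that one window contains 10 signals. This setting depends on the interval at which signals are generated; with a value of 10, a window covers roughly one minute of signals. b is the sliding step of the signal interval (window); experiments showed that 2 is a suitable value.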
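The window division with size a and step b can be sketched as below. The function name `sliding_windows` is hypothetical, and the choice to drop a trailing partial window (while keeping one short window when the whole stream is shorter than a) is an assumption, since the source does not say how the tail is handled.

```python
def sliding_windows(signals, size=10, step=2):
    """Split an ordered list of signals into windows of `size`, sliding by `step`."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    if not signals:
        return []
    last_start = max(len(signals) - size, 0)
    return [signals[i:i + size] for i in range(0, last_start + 1, step)]


# 14 signals with a = 10 and b = 2 give windows starting at positions 0, 2 and 4.
demo_windows = sliding_windows(list(range(14)))
```

The same function would be applied both to the M records used for mining and to the N records being checked, since the text requires identical a and b in both phases.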
The other parts of this embodiment are the same as in any one of Embodiments 1-4 and are not repeated.
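Embodiment 6:
This embodiment is a further optimization of any one of Embodiments 1-5. The relationship mining of the second step is performed on M consecutive records extracted from the data preprocessed in the first step. Anomaly judgment is performed on newly arrived continuous signals, and N denotes the number of records that currently need to be processed. For relationships mined from data divided with signal interval (window) size a and step size b, the same window size and step size are used to process these N records.
Anomaly judgment is performed on each signal interval (window). Three values are kept for each window: the symbiotic relationship anomaly count, the associated relationship anomaly count and the causal relationship anomaly count, all initially 0. For each window, if a signal relationship is violated in that window, the anomaly count of the corresponding relationship type is increased by 1.
Whether a signal relationship is violated in a window is decided as follows:
For a symbiotic pair of s_tn and s_tm: if s_tn is in the window while s_tm is not, or s_tm is in the window while s_tn is not, the relationship is violated in that window.
For an associated pair s_tn - s_tm: if s_tn is in the window and s_tm is not, the relationship is violated in that window.
For a causal pair s_tn → s_tm: if s_tn is in the window and either s_tm is not in the window or s_tm is in the window but tm < tn, the relationship is violated in that window.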
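The three violation rules above can be sketched as a per-window check. The function name `check_window` and the dictionary keys are hypothetical. One detail is an assumption: when a signal occurs more than once in a window, the causal rule is treated as violated only if no occurrence of the effect follows the first occurrence of the cause.

```python
def check_window(window, symbiotic, associated, causal):
    """Count relationship violations of each type in one window.

    window: ordered list of signal-action pairs.
    symbiotic, associated, causal: sets of (a, b) pairs from the mining step.
    """
    present = set(window)
    first, last = {}, {}
    for i, s in enumerate(window):
        first.setdefault(s, i)
        last[s] = i

    counts = {"symbiotic": 0, "associated": 0, "causal": 0}
    for a, b in symbiotic:
        # Violated when exactly one of the two signals is present.
        if (a in present) != (b in present):
            counts["symbiotic"] += 1
    for a, b in associated:
        # Violated when a is present without its companion b.
        if a in present and b not in present:
            counts["associated"] += 1
    for a, b in causal:
        # Violated when a is present and b is missing or only occurs before a.
        if a in present and (b not in present or last[b] < first[a]):
            counts["causal"] += 1
    return counts


rules = ({("C", "D")}, {("A", "B")}, {("A", "B")})
ok = check_window(["A", "B"], *rules)
reversed_order = check_window(["B", "A"], *rules)
missing = check_window(["A", "C"], *rules)
```

A window whose counts are all zero is considered normal; any non-zero count marks the window as a candidate anomaly interval, with the count type indicating which kind of relationship was broken.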
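The anomaly judgment process is shown in FIG. 5, where Wi is a window on which anomaly detection has not yet been performed, Wwa-sb is the data used for relationship mining, divided with window size a and step size b, and Wdwa-sb is the data used for anomaly detection, divided with window size a and step size b.
This part is also a core of the present invention; no existing method is known to detect signal anomalies through the signal relationships proposed here. Parameters a and b are the signal interval (window) size and the sliding step size, set to the same values as before. N is the number of records that currently need to be processed.
The other parts of this embodiment are the same as in any one of Embodiments 1-5 and are not repeated.
The above are only preferred embodiments of the present invention and do not limit it in any form. Any simple modification or equivalent change made to the above embodiments according to the technical essence of the present invention falls within the scope of protection of the present invention.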
Claims (4)
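- A method for judging hydropower signal anomalies based on frequent item set reasoning, characterized by comprising the following steps: step S1, obtaining original data, extracting signal data and action data from the original data and preprocessing them, and constructing signal-action pairs that combine the signal data and the action data; step S2, performing relationship mining on the signal-action pairs based on the apriori algorithm and the prefixSpan algorithm to obtain an associated relationship set, a symbiotic relationship set and a causal relationship set; step S3, detecting the signal stream composed of newly arrived continuous signals according to the associated relationship set, the symbiotic relationship set and the causal relationship set, determining whether the signal stream contains an anomaly, and marking the signal intervals in which an anomaly exists; wherein step S2 includes: defining the signal interval and the three signal relationships, namely causal, associated and symbiotic; extracting M consecutive records from the preprocessed data, the M records being M signal-logical-name-action pairs; dividing the M records into signal intervals, and mining the divided data with the apriori algorithm and the prefixSpan algorithm to obtain the second-order frequent item set Lp and the second-order frequent item set La; filtering the mined second-order frequent item sets according to the support threshold and confidence threshold of the corresponding relationship to obtain the final associated relationship set, symbiotic relationship set and causal relationship set; the apriori algorithm is used to: mine the associated and symbiotic relationships of sequences; filter the second-order frequent item set La according to the support and confidence of the associated relationship to obtain the associated relationship set; filter the second-order frequent item set La according to the support and confidence of the symbiotic relationship to obtain the symbiotic relationship set; the prefixSpan algorithm is used to: mine the causal relationships of sequences; filter the second-order frequent item set Lp according to the support and confidence of the causal relationship to obtain the causal relationship set.
- The method for judging hydropower signal anomalies based on frequent item set reasoning according to claim 1, characterized in that the process in step S1 of extracting signal data and action data from the original data and preprocessing them includes: preprocessing the original data; determining whether any data have not yet been preprocessed and, if so, first removing the numeric data from the action data, then merging the signal logical name data from the signal data with the digit-free action data to form a signal-logical-name-action pair; adding the signal-logical-name-action pair to the preprocessed data, thereby constructing the signal-action pairs that combine signal data and action data.
- The method for judging hydropower signal anomalies based on frequent item set reasoning according to claim 2, characterized in that the original data in step S1 includes a log file sample recorded while the monitoring system of the hydropower signal monitoring system is running.
- The method for judging hydropower signal anomalies based on frequent item set reasoning according to any one of claims 1-3, characterized in that step S3 includes: performing anomaly judgment on the newly received continuous signals; letting N denote the number of records that currently need to be processed and, for relationships mined from data divided with signal interval size a and step size b, processing these N records with the same signal interval size and step size; performing anomaly judgment on each signal interval, with three values kept for each signal interval, namely the symbiotic relationship anomaly count, the associated relationship anomaly count and the causal relationship anomaly count, all initially set to 0; for each signal interval, if a signal relationship is violated in that interval, increasing the anomaly count of the corresponding relationship type by 1.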
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211365483.6A CN115470831B (zh) | 2022-11-03 | 2022-11-03 | Method for judging hydropower signal anomalies based on frequent item set reasoning |
CN202211365483.6 | 2022-11-03 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024093533A1 (zh) | 2024-05-10 |
Family
ID=84338239
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/118155 WO2024093533A1 (zh) | Method for judging hydropower signal anomalies based on frequent item set reasoning | 2022-11-03 | 2023-09-12 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115470831B (zh) |
WO (1) | WO2024093533A1 (zh) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
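CN115470831B (zh) * | 2022-11-03 | 2023-04-18 | 四川中电启明星信息技术有限公司 | Method for judging hydropower signal anomalies based on frequent item set reasoning |
CN117648446A (zh) * | 2023-11-24 | 2024-03-05 | 国能大渡河流域水电开发有限公司 | Knowledge-graph-based method for identifying and diagnosing abnormal monitoring events in hydropower stations |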
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170293670A1 (en) * | 2016-04-07 | 2017-10-12 | University Of Virginia Patent Foundation | Sequential pattern mining with the micron automata processor |
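CN110245168A (zh) * | 2019-06-20 | 2019-09-17 | 国网江苏省电力有限公司南京供电分公司 | Method and system for extracting characteristic signals of abnormal events from historical power grid alarms |
CN112183656A (zh) * | 2020-10-12 | 2021-01-05 | 国网新疆电力有限公司 | Method for mining frequent item sets of SCADA data in power grid faults |
CN112888008A (zh) * | 2021-01-08 | 2021-06-01 | 南京中兴力维软件有限公司 | Base station anomaly detection method, apparatus, device and storage medium |
CN115470831A (zh) * | 2022-11-03 | 2022-12-13 | 四川中电启明星信息技术有限公司 | Method for judging hydropower signal anomalies based on frequent item set reasoning |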
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1549171A (zh) * | 2003-05-15 | 2004-11-24 | 季永萍 | Apparatus for implementing high-tech market definition standards based on grid computing |
TW200926033A (en) * | 2007-07-18 | 2009-06-16 | Steven Kays | Adaptive electronic design |
CN101937447B (zh) * | 2010-06-07 | 2012-05-23 | 华为技术有限公司 | Alarm association rule mining method, rule mining engine and system |
CN102142992A (zh) * | 2011-01-11 | 2011-08-03 | 浪潮通信信息系统有限公司 | Communication alarm frequent item set mining engine and redundancy processing method |
US10873371B1 (en) * | 2019-08-19 | 2020-12-22 | Cisco Technology, Inc. | Antenna for massive multiple input and multiple output (mMIMO) |
CN112667827A (zh) * | 2020-12-23 | 2021-04-16 | 北京奇艺世纪科技有限公司 | Data anomaly analysis method and apparatus, electronic device and storage medium |
CN112949874B (zh) * | 2021-03-04 | 2022-10-04 | 国网江苏省电力有限公司南京供电分公司 | Self-diagnosis method and system for defect characteristics of power distribution terminals |
CN114064723A (zh) * | 2021-11-15 | 2022-02-18 | 中国南方电网有限责任公司超高压输电公司昆明局 | Association rule mining method and apparatus, computer device and storage medium |
-
2022
- 2022-11-03 CN CN202211365483.6A patent/CN115470831B/zh active Active
-
2023
- 2023-09-12 WO PCT/CN2023/118155 patent/WO2024093533A1/zh unknown
Also Published As
Publication number | Publication date |
---|---|
CN115470831B (zh) | 2023-04-18 |
CN115470831A (zh) | 2022-12-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23884465 Country of ref document: EP Kind code of ref document: A1 |