WO2024007580A1 - Parallel fault diagnosis method and device for power equipment based on hybrid clustering - Google Patents

Parallel fault diagnosis method and device for power equipment based on hybrid clustering

Info

Publication number
WO2024007580A1
Authority
WO
WIPO (PCT)
Prior art keywords
clustering
data
component
fault diagnosis
tuple
Prior art date
Application number
PCT/CN2023/074751
Other languages
English (en)
French (fr)
Inventor
刘少伟
戴必翔
秦昌嵩
董贝
经周
Original Assignee
南京国电南自电网自动化有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 南京国电南自电网自动化有限公司
Publication of WO2024007580A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24568 Data stream processing; Continuous queries

Definitions

  • the invention belongs to the field of multivariate data monitoring and diagnosis in the power grid and electric power industry, and relates to a parallel fault diagnosis method and device for electric equipment based on hybrid clustering.
  • Subtractive clustering algorithm and K-means algorithm belong to machine learning algorithms.
  • Machine learning algorithms can be divided into two types: supervised learning and unsupervised learning. In the real world, most samples are unlabeled, so unsupervised learning is more widely used than supervised learning.
  • the K-means algorithm is a typical unsupervised clustering algorithm. Its initial cluster centers are chosen at random, so the accuracy of its clustering results is unstable.
  • the purpose of the present invention is to overcome the deficiencies of the prior art and provide a parallel fault diagnosis method for power equipment based on hybrid clustering, which can diagnose the corresponding streaming data in parallel and in real time, meet the real-time fault diagnosis needs of monitoring data, and detect power equipment faults promptly.
  • the present invention provides a parallel fault diagnosis method for power equipment based on hybrid clustering, which includes the following steps:
  • the PreBolt component receives the Tuple tuples and preprocesses the data set in each Tuple tuple through the standard score method to obtain standardized samples;
  • the method for adaptively configuring the parallelism and number of related processes of each component in the Storm platform is:
  • the data set in the Tuple tuple is preprocessed through the standard score method to obtain standardized samples, including normalizing each value as $x' = (x - x_{\min}) / (x_{\max} - x_{\min})$;
  • $x'$ ($x' \in [0,1]$) is the normalized data value; $x_{\min}$ is the minimum value of a given dimension of the tuple data; $x_{\max}$ is the maximum value of that dimension.
  • the method for constructing the fault diagnosis model includes:
  • the fault diagnosis model is used to process the standardized samples to obtain fault diagnosis results for the power equipment, including:
  • the better initial clustering center obtained by the subtractive clustering process is used as the initial clustering center of the K-means algorithm, and then clustering is performed to achieve the fault diagnosis result of the sample data.
  • the standardized samples are used to determine the optimal initial clustering center through the subtractive clustering algorithm, including:
  • the SCMBolt component receives the tuples passed by the PreBolt component, performs subtractive clustering on the data in the tuples, and determines the clustering center through the density value.
  • the obtained clustering center is the point in the original data;
  • the initial clustering center is obtained, and the corresponding Id number and the standardized sample to be clustered corresponding to this number are encapsulated into a tuple and passed to the downstream component K-meansBolt.
  • Subtractive clustering methods include:
  • the sample dimension is M and there are n sample points $(x_1, x_2, \ldots, x_n)$.
  • each sample point can be a candidate for the cluster center.
  • the density index of sample point $x_i$ is defined as $D_i = \sum_{j=1}^{n} \exp\left(-\lVert x_i - x_j \rVert^2 / (r_a/2)^2\right)$.
  • $r_a$ is a positive number.
  • the value of $r_a$ is a neighborhood radius of the point; sample points outside the radius contribute very little to the density index of the point.
  • after the density index of each sample point is calculated, the sample point with the highest density index is selected as the first cluster center; $x_{c1}$ is the selected point and $D_{c1}$ is its density index. When selecting the next cluster center, the density index of each sample point $x_i$ is corrected by $D_i = D_i - D_{c1} \exp\left(-\lVert x_i - x_{c1} \rVert^2 / (r_b/2)^2\right)$.
  • $r_b$ is a positive number.
  • the better initial clustering center obtained by the subtractive clustering process is used as the initial clustering center of the K-means algorithm, and then clustering is performed, including:
  • the K-meansBolt component performs K-means clustering on the standardized samples to be clustered from the upstream SCMBolt component.
  • the clustering center sent from the upstream SCMBolt component is used as the initial clustering center of K-means clustering.
  • the cluster center is updated through iteration, and finally the relevant clustering results are obtained.
  • the clustering center transmitted from the upstream SCMBolt component is used as the initial clustering center of K-means clustering, and the clustering center is updated through iteration, including:
  • the model diagnosis results are stored in the database through DatabaseBolt, which facilitates the query and retrieval of diagnosis results in power and related industries, or the diagnosis results are stored in a data file through the FileBolt component, which can be flexibly copied and migrated.
  • the present invention provides a parallel fault diagnosis device for power equipment based on hybrid clustering, including:
  • the platform deployment module is used to build the Storm platform and deploy the machine learning network structure on the Storm platform to obtain a fault diagnosis model
  • the adaptive configuration module is used to adaptively configure the parallelism and number of related processes of each component in the storm platform based on historical power grid data;
  • the data access module is used to access real-time power grid data into the Spout source component of the storm platform through the IRichSpout interface to form a data stream to be processed;
  • the data encapsulation module is used to encapsulate the data stream to be processed into multiple Tuple tuples in chronological order and generate a unique ID for each Tuple tuple;
  • the preprocessing module is used to receive Tuple tuples using the PreBolt component, and preprocess the data sets in the Tuple tuples through the standard score method to obtain standardized samples;
  • the fault diagnosis module is used to process standardized samples using the fault diagnosis model to obtain fault diagnosis results of power equipment.
  • the present invention proposes a hybrid clustering structure based on the Storm platform: a subtractive clustering component is deployed upstream on the Storm platform to determine the initial cluster centers.
  • the clustering speed of this algorithm is very fast.
  • the obtained cluster centers are points of the original data and lie as far apart as possible, which largely prevents the subsequent K-means algorithm from falling into a local optimum and reduces its number of iterations, improving the accuracy and efficiency of classification.
  • the classification accuracy of this algorithm is higher than that of the conventional K-means algorithm, and it can classify the streaming data of power grid equipment more accurately.
  • the classification model is deployed on the Storm platform, and the efficiency of fault diagnosis is improved by adaptively configuring the number of processes and the parallelism of the source components and processing components.
  • This method of monitoring power equipment failure types can ensure the safe operation of power equipment, reduce losses to residents' production and life, and can detect various equipment failures early to avoid catastrophic accidents.
  • Figure 1 is a schematic diagram of the data processing process of the present invention.
  • Figure 2 is a data access flow chart.
  • This embodiment provides a parallel fault diagnosis method for power equipment based on hybrid clustering, which can complete parallel diagnosis of corresponding streaming data in real time.
  • the fault type of the sample data can be given more accurately.
  • the classification algorithm deployed on the Storm platform achieves high throughput and low-latency processing of streaming data by setting the number of tasks, the number of cluster nodes, the number of source components and processing components. It can satisfy the fault diagnosis of monitoring data in real time and detect the faults of power equipment in time.
  • This invention proposes an online parallel diagnosis method for power grid equipment based on the Storm platform. It mainly solves the following problems:
  • the method of the present invention includes the following steps:
  • the Spout component implements data access through the IRichSpout interface.
  • the accessed power grid feature vector data is a data stream without intervals. These feature vector data are continuously sent to the Spout source component to form a to-be-processed data flow.
  • Tuple is a tuple of data flow between components. Each Tuple tuple should encapsulate an appropriate amount of data.
  • each tuple encapsulates 1000 data items, called a data set; that is, each tuple encapsulates one data set. The tuple is then sent to the pending queue.
  • each tuple sent is marked with a unique ID, that is, the corresponding data set in each tuple is marked.
  • Id indicates the position of the tuple or data set in the tuple in the data stream.
  • the downstream preprocessing component receives the tuple tuple sent by the upstream component Spout.
  • the sample to be diagnosed is encapsulated in the tuple.
  • Each tuple contains 1000 feature vector data.
  • the preprocessing component preprocesses the received feature vector data within the tuple. Taking transformer oil chromatography data for fault diagnosis as an example, the contents of five gases, H2, CH4, C2H6, C2H2, and C2H4, are selected as input and preprocessed. These values span a wide range, and data of the same type also differ considerably. To reduce the impact of these differences in magnitude, the input features must be normalized before clustering. Normalization also reduces the number of iterations of the clustering body in the diagnosis model and improves the accuracy of clustering.
  • the normalized values of each dimension of the input feature vector set are used as input samples of the diagnostic model.
  • the DGA data input vector has the form $[x_1, x_2, x_3, x_4, x_5]^T$.
  • the downstream component receives the tuple passed from the preprocessing component PreBolt.
  • the data in this tuple is the standardized form of the data set with the corresponding Id that was passed to the PreBolt component; these data are called standardized samples to be classified. The samples take part in the subsequent fault diagnosis and are finally aggregated by Id.
  • after receiving the tuple from the preprocessing component PreBolt, the downstream component performs subtractive clustering on the standardized samples to be classified contained in the tuple, thereby obtaining better initial cluster centers.
  • the subtractive clustering process is encapsulated into a component, namely the SCMBolt component.
  • This component receives the tuples passed by the upstream component, clusters the data in the tuples, and determines the clustering center through the density value.
  • the clustering speed of this algorithm is very fast.
  • the obtained cluster centers are points of the original data and lie as far apart as possible, which largely prevents the subsequent clustering algorithm from falling into a local optimum.
  • when the subtractive clustering algorithm completes, the initial cluster centers are obtained and encapsulated, together with the corresponding Id and the standardized samples to be clustered associated with that Id, into one tuple, which is passed to the downstream component K-meansBolt.
  • Subtractive Clustering Method is a density clustering algorithm.
  • the sample dimension is M and there are n sample points $(x_1, x_2, \ldots, x_n)$.
  • each sample point can be a candidate for the cluster center.
  • the density index of sample point $x_i$ is defined as $D_i = \sum_{j=1}^{n} \exp\left(-\lVert x_i - x_j \rVert^2 / (r_a/2)^2\right)$.
  • $r_a$ is a positive number.
  • the value of $r_a$ is a neighborhood radius of the point; sample points outside the radius contribute very little to the density index of the point.
  • after the density index of each sample point is calculated, the sample point with the highest density index is selected as the first cluster center; $x_{c1}$ is the selected point and $D_{c1}$ is its density index. When selecting the next cluster center, the density index of each sample point $x_i$ is corrected by $D_i = D_i - D_{c1} \exp\left(-\lVert x_i - x_{c1} \rVert^2 / (r_b/2)^2\right)$, where $r_b$ is a positive number.
  • the upstream component SCMBolt passes the tuples to the downstream component K-meansBolt.
  • the K-meansBolt component is the main part of the fault diagnosis model.
  • the component implements the hard clustering K-means algorithm internally, which is to cluster the standardized data sent from the upstream component SCMBolt.
  • the samples are clustered by K-means.
  • the clustering center transmitted from the upstream component is used as the initial clustering center of K-means clustering.
  • the algorithm updates the cluster centers through iteration and finally obtains the clustering result.
  • compared with a single K-meansBolt component, combining the K-meansBolt component with the SCMBolt component not only reduces the number of iterations of the K-means algorithm body but also improves its robustness when diagnosing the feature vector data of power grid equipment.
  • $d_{ij} = \sqrt{\sum_{k=1}^{n} (x_{ik} - y_{jk})^2}$ is the Euclidean distance between point $x_i$ and point $y_j$.
  • the coordinates of point $x_i$ are $(x_{i1}, x_{i2}, x_{i3}, \ldots, x_{in})$, and the coordinates of point $y_j$ are $(y_{j1}, y_{j2}, y_{j3}, \ldots, y_{jn})$.
  • the Storm framework itself is not responsible for saving the calculation results.
  • the storage and summary of the calculation results can be completed by implementing Bolt, that is, they can be directly written to the data file or persistently stored in the database.
  • the result processing methods of the fault diagnosis model include DatabaseBolt and FileBolt.
  • DatabaseBolt implements the storage operation of model diagnosis results into the database, thereby facilitating the query and retrieval of diagnosis results in the power and related industries; while the FileBolt component stores the diagnosis results in a data file, which can be flexibly copied and migrated.
  • the classification model is deployed on the Storm platform, and the efficiency of fault diagnosis is improved by adaptively configuring the number of processes and the parallelism of the source components and processing components.
  • This invention can diagnose faults in the power data of power grid equipment online and in real time. By deploying a clustering algorithm on the Storm platform, it classifies the streaming data of power grid equipment efficiently.
  • the algorithm introduces subtractive clustering to obtain the initial cluster centers, which keeps the subsequent clustering algorithm from falling into a local optimum and thereby achieves accurate classification.
  • this method can monitor the types of power equipment faults, which can ensure the safe operation of power equipment, reduce losses to residents' production and life, and can detect various equipment faults early to avoid catastrophic accidents.
  • transformer fault diagnosis is taken as an example.
  • transformer oil chromatographic data are collected, and the contents (uL/L) of five gases dissolved in the oil, H2, CH4, C2H6, C2H4, and C2H2, are selected to form a feature vector. These feature vectors are continuously sent to the fault diagnosis model on the Storm platform, realizing online diagnosis of the transformer.
  • before data processing, the Storm cloud platform is built first; it consists of one master node and several slave nodes.
  • five servers are used to form a physical cluster, and the servers are connected with Gigabit switches.
  • the online power grid monitoring data stream is simulated with historical data whose flow rate is higher than that of the formal data, so that the optimal number of processes and the optimal concurrency of the source components and logical processing components can be configured adaptively through throughput calculation, maximizing the capacity to process the subsequent formal power grid streaming data.
  • the formal power grid streaming data begins to be processed.
  • the first is data access.
  • the Spout source component connects to external data sources.
  • to prevent data-set skew, oil chromatography detection data recorded before and after failures of transformers of the same model at multiple engineering sites are usually selected when collecting data. These data include normal data and fault data, and are unlabeled samples.
  • these metadata are read into a buffer; when the number of items reaches the tuple requirement of 1000, they are encapsulated into one tuple, which is sent to the pending queue for subsequent processing.
  • appropriate parallelism of source components can improve processing efficiency.
  • PreBolt processes the received tuples through normalization, which also reduces the number of iterations of the clustering body in the diagnosis model and improves the accuracy of clustering. After preprocessing, PreBolt assembles the data into new tuples and sends them to the downstream component SCMBolt.
  • the entire classification module consists of two components, namely SCMBolt and K-meansBolt.
  • the SCMBolt component implements the subtractive clustering algorithm, receives tuples from the upstream component, clusters the data in the tuples, and determines the cluster center through the density value.
  • the clustering speed of this algorithm is very fast.
  • the obtained clustering centers are points in the original data, and the clustering centers are as far apart as possible, thus avoiding the subsequent clustering algorithm from falling into local optimality to a large extent.
  • the initial clustering center is obtained, and the corresponding Id number and the standardized sample to be clustered corresponding to this number are encapsulated into a tuple and passed to the downstream component K-meansBolt.
  • the K-meansBolt component implements the hard-clustering K-means algorithm; that is, it performs K-means clustering on the standardized samples to be clustered received from the upstream component SCMBolt.
  • the initial cluster centers are obtained from the upstream component SCMBolt, and the clustering results are then obtained through iterative calculation.
  • compared with a single K-meansBolt component, combining the K-meansBolt component with the SCMBolt component not only reduces the number of iterations of the K-means algorithm body but also improves its robustness when diagnosing the feature vector data of power grid equipment.
  • the results are saved and summarized, and the diagnosis results are directly written into the data file or persisted in the database. That is, the result processing methods of the fault diagnosis model include DatabaseBolt and FileBolt as needed.
  • DatabaseBolt implements the storage operation of model diagnosis results into the database, thereby facilitating the query and retrieval of diagnosis results in the power and related industries; while the FileBolt component stores the diagnosis results in a data file, which can be flexibly copied and migrated.
  • This embodiment provides a parallel fault diagnosis system for power equipment based on hybrid clustering, including:
  • the platform deployment module is used to build the Storm platform and deploy the machine learning network structure on the Storm platform to obtain a fault diagnosis model
  • the adaptive configuration module is used to adaptively configure the parallelism and number of related processes of each component in the storm platform based on historical power grid data;
  • the data access module is used to access real-time power grid data into the Spout source component of the storm platform through the IRichSpout interface to form a data stream to be processed;
  • the data encapsulation module is used to encapsulate the data stream to be processed into multiple Tuple tuples in chronological order and generate a unique ID for each Tuple tuple;
  • the preprocessing module is used to receive Tuple tuples using the PreBolt component, and preprocess the data sets in the Tuple tuples through the standard score method to obtain standardized samples;
  • the fault diagnosis module is used to process standardized samples using the fault diagnosis model to obtain fault diagnosis results of power equipment.
  • the system of this embodiment can be used to implement the method described in Embodiment 1.
  • embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment that combines software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory that causes a computer or other programmable data processing apparatus to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
  • These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operating steps to be performed on the computer or other programmable device to produce computer-implemented processing, so that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

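The present invention provides a parallel fault diagnosis method and device for power equipment based on hybrid clustering, which can diagnose the corresponding streaming data in parallel and in real time, meet the real-time fault diagnosis needs of monitoring data, and detect power equipment faults promptly. The method includes the following steps: adaptively configuring the parallelism and the number of related processes of each component in the Storm platform according to historical power grid data; feeding real-time power grid data into the Spout source component of the Storm platform through the IRichSpout interface to form a data stream to be processed; encapsulating the data stream to be processed into multiple Tuple tuples in chronological order and generating a unique ID for each Tuple tuple; receiving the Tuple tuples with the PreBolt component and preprocessing the data set in each Tuple tuple by the standard score method to obtain standardized samples; and processing the standardized samples with the fault diagnosis model to obtain the fault diagnosis results of the power equipment.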

Description

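Parallel fault diagnosis method and device for power equipment based on hybrid clustering
Technical Field
The present invention belongs to the field of multivariate data monitoring and diagnosis in the power grid and electric power industry, and relates to a parallel fault diagnosis method and device for power equipment based on hybrid clustering.
Background
As power systems develop, power equipment faults have a major impact on daily life, so the state of the equipment urgently needs continuous monitoring. Continuous progress in sensor and communication technology has caused power grid data to grow exponentially; these data are real-time, volatile, and unbounded, and form streaming data that require continuous monitoring. The earlier platform Hadoop can process batch data but has poor real-time performance, whereas Storm is an open-source distributed real-time computing framework that can process massive data streams quickly and makes up for Hadoop's shortcomings in real-time processing.
With the rise of Storm, some applications have appeared in the power industry. A time-based sliding-window processing method has been implemented on Storm, with anomaly detection of power grid data streams achieved through threshold judgment. Alarm data from power grid equipment have been processed rapidly, with the related data streams handled by clustering algorithms.
The subtractive clustering algorithm and the K-means algorithm are machine learning algorithms. Machine learning algorithms can be divided into supervised learning and unsupervised learning. In the real world most samples are unlabeled, so unsupervised learning is more widely used than supervised learning. The K-means algorithm is a typical unsupervised clustering algorithm; its initial cluster centers are chosen at random, so the accuracy of its clustering results is unstable.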
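Summary of the Invention
The purpose of the present invention is to overcome the deficiencies of the prior art and provide a parallel fault diagnosis method for power equipment based on hybrid clustering, which can diagnose the corresponding streaming data in parallel and in real time, meet the real-time fault diagnosis needs of monitoring data, and detect power equipment faults promptly.
To achieve the above purpose, the present invention adopts the following technical solutions:
In a first aspect, the present invention provides a parallel fault diagnosis method for power equipment based on hybrid clustering, including the following steps:
adaptively configuring the parallelism and the number of related processes of each component in the Storm platform according to historical power grid data;
feeding real-time power grid data into the Spout source component of the Storm platform through the IRichSpout interface to form a data stream to be processed;
encapsulating the data stream to be processed into multiple Tuple tuples in chronological order and generating a unique ID for each Tuple tuple;
receiving the Tuple tuples with the PreBolt component and preprocessing the data set in each Tuple tuple by the standard score method to obtain standardized samples;
processing the standardized samples with the fault diagnosis model to obtain the fault diagnosis results of the power equipment.
Further, the method for adaptively configuring the parallelism and the number of related processes of each component in the Storm platform is:
simulating the real-time power grid data stream with historical power grid data, where the flow rate of the historical power grid data is greater than the expected flow rate of the real-time power grid data;
calculating, from the historical power grid data, the data throughput of each component in the Storm platform under different parallelism settings and different numbers of processes;
adaptively configuring, among the settings whose data throughput meets the expected throughput, the component parallelism and number of processes with the lowest overhead.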
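The selection rule above (lowest overhead among the configurations that meet the expected throughput) can be sketched in a few lines of Python. This is only an illustrative sketch: the `Candidate` record, the `overhead` cost model (workers plus executors), and the throughput figures are assumptions introduced here, not part of the patent, which does not specify how overhead is measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One configuration tried while replaying historical data (hypothetical record)."""
    workers: int        # number of worker processes
    parallelism: int    # parallelism of the component under test
    throughput: float   # tuples per second measured during the replay


def overhead(candidate):
    # Hypothetical cost model: more processes and more parallel instances cost more.
    return candidate.workers + candidate.parallelism


def pick_configuration(candidates, expected_throughput):
    """Return the cheapest candidate whose measured throughput meets the target."""
    feasible = [c for c in candidates if c.throughput >= expected_throughput]
    if not feasible:
        return None  # no tested configuration is fast enough
    return min(feasible, key=overhead)


if __name__ == "__main__":
    measured = [
        Candidate(workers=1, parallelism=2, throughput=800.0),
        Candidate(workers=2, parallelism=4, throughput=1500.0),
        Candidate(workers=4, parallelism=8, throughput=3000.0),
    ]
    print(pick_configuration(measured, expected_throughput=1200.0))
```

Because the historical replay runs at a higher rate than the expected live stream, a configuration chosen this way leaves some headroom for the real data.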
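Further, preprocessing the data set in the Tuple tuple by the standard score method to obtain standardized samples includes:
normalizing the data according to the following formula:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x'$ ($x' \in [0,1]$) is the normalized data value, $x_{\min}$ is the minimum value of a given dimension of the tuple data, and $x_{\max}$ is the maximum value of that dimension.
Further, the method for constructing the fault diagnosis model includes:
deploying the subtractive clustering algorithm and the K-means clustering algorithm in the SCMBolt component and the K-meansBolt component respectively, connecting the SCMBolt component to the K-meansBolt component, and setting the parallelism of the components to obtain the fault diagnosis model.
Further, processing the standardized samples with the fault diagnosis model to obtain the fault diagnosis results of the power equipment includes:
determining better initial cluster centers from the standardized samples through the subtractive clustering algorithm;
using the better initial cluster centers obtained by subtractive clustering as the initial cluster centers of the K-means algorithm and then clustering, thereby obtaining the fault diagnosis result for the sample data.
Further, determining better initial cluster centers from the standardized samples through the subtractive clustering algorithm includes:
the SCMBolt component receives the tuples passed by the PreBolt component, performs subtractive clustering on the data in each tuple, and determines the cluster centers from the density values; the resulting cluster centers are points of the original data;
when the subtractive clustering algorithm completes, the initial cluster centers are obtained and encapsulated, together with the corresponding Id and the standardized samples to be clustered associated with that Id, into one tuple, which is passed to the downstream component K-meansBolt.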
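The subtractive clustering method includes:
The sample dimension is M and there are n sample points $(x_1, x_2, \ldots, x_n)$. When the dimension is high, all sample points are normalized into a hypercube. Every sample point is a candidate cluster center. The density index of sample point $x_i$ is defined as
$$D_i = \sum_{j=1}^{n} \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{(r_a/2)^2}\right)$$
where $r_a$ is a positive number that sets a neighborhood radius of the point; sample points outside this radius contribute very little to the point's density index.
After the density index of every sample point has been calculated, the sample point with the highest density index is selected as the first cluster center; $x_{c1}$ is the selected point and $D_{c1}$ is its density index. When selecting the next cluster center, the density index of each sample point $x_i$ is corrected by
$$D_i = D_i - D_{c1} \exp\left(-\frac{\lVert x_i - x_{c1} \rVert^2}{(r_b/2)^2}\right)$$
where $r_b$ is a positive number.
After the density indices of all sample points have been corrected, a new cluster center $x_{c2}$ is selected and the density indices of all sample points are corrected again. This process is repeated until enough cluster centers have appeared, yielding better initial cluster centers.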
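Further, using the better initial cluster centers obtained by subtractive clustering as the initial cluster centers of the K-means algorithm and then clustering includes:
the K-meansBolt component performs K-means clustering on the standardized samples to be clustered received from the upstream SCMBolt component; during clustering, the cluster centers received from the upstream SCMBolt component serve as the initial cluster centers of K-means, the cluster centers are updated iteratively, and the clustering result is finally obtained.
Further, using the cluster centers received from the upstream SCMBolt component as the initial cluster centers of K-means and updating the cluster centers iteratively includes:
a) Use the cluster centers received from the upstream SCMBolt component as the initial cluster centers of K-means clustering.
b) Calculate the vector distance from every sample in the sample set to each cluster center, and assign each sample to the class of the center with the smallest distance.
c) Update the cluster centers: calculate the mean of all sample data in each class and use these means as the new cluster centers of the k classes.
d) Repeat steps b) and c) until the newly obtained cluster centers no longer change, or their offset from the previous cluster centers is smaller than a specified threshold, or the number of iterations reaches a specified limit; clustering stops when any one of these three conditions is met.
Finally, the calculation results are saved and aggregated.
Further, saving and aggregating the calculation results includes:
storing the model diagnosis results in a database through DatabaseBolt, which makes it easy for the power industry and related industries to query and retrieve the diagnosis results, or storing the diagnosis results in a data file through the FileBolt component; this file can be copied and migrated flexibly.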
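In a second aspect, the present invention provides a parallel fault diagnosis device for power equipment based on hybrid clustering, including:
a platform deployment module, used to build the Storm platform and deploy the machine learning network structure on the Storm platform to obtain a fault diagnosis model;
an adaptive configuration module, used to adaptively configure the parallelism and the number of related processes of each component in the Storm platform according to historical power grid data;
a data access module, used to feed real-time power grid data into the Spout source component of the Storm platform through the IRichSpout interface to form a data stream to be processed;
a data encapsulation module, used to encapsulate the data stream to be processed into multiple Tuple tuples in chronological order and generate a unique ID for each Tuple tuple;
a preprocessing module, used to receive the Tuple tuples with the PreBolt component and preprocess the data set in each Tuple tuple by the standard score method to obtain standardized samples;
a fault diagnosis module, used to process the standardized samples with the fault diagnosis model to obtain the fault diagnosis results of the power equipment.
Compared with the prior art, the present invention achieves the following beneficial effects:
1. The present invention proposes a hybrid clustering structure based on the Storm platform. A subtractive clustering component is deployed upstream on the Storm platform to determine the initial cluster centers. This algorithm clusters very quickly, the resulting cluster centers are points of the original data, and the cluster centers lie as far apart as possible, which largely prevents the subsequent K-means algorithm from falling into a local optimum and reduces its number of iterations, improving the accuracy and efficiency of classification. The classification accuracy of this algorithm is higher than that of the conventional K-means algorithm, so it can classify the streaming data of power grid equipment more accurately.
2. The method suits the streaming data of power equipment, because these data are essentially unlabeled and the algorithm is a clustering algorithm that handles such samples well.
3. The streaming data of power grid equipment are processed efficiently: the classification model is deployed on the Storm platform, and the efficiency of fault diagnosis is improved by adaptively configuring the number of processes and the parallelism of the source components and processing components.
4. By monitoring the fault types of power equipment, the method helps ensure safe operation of the equipment, reduces losses to residents' production and daily life, and detects equipment faults early to avoid catastrophic accidents.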
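Brief Description of the Drawings
Figure 1 is a schematic diagram of the data processing process of the present invention.
Figure 2 is a data access flow chart.
Figure 3 is the single-machine implementation flow of the hybrid clustering algorithm.
Detailed Description
The present invention is further described below with reference to the drawings. The following embodiments are only intended to illustrate the technical solutions of the present invention more clearly and do not limit its scope of protection.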
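Embodiment 1:
This embodiment provides a parallel fault diagnosis method for power equipment based on hybrid clustering, which can diagnose the corresponding streaming data in parallel and in real time and can identify the fault type of the sample data with good accuracy. The classification algorithm deployed on the Storm platform achieves high-throughput, low-latency processing of streaming data by setting the number of tasks, the number of cluster nodes, and the numbers of source components and processing components. It can meet the real-time fault diagnosis needs of monitoring data and detect power equipment faults promptly.
The present invention proposes an online parallel diagnosis method for power grid equipment based on the Storm platform. It mainly solves the following problems:
(1) In the field of power grid equipment monitoring, monitoring the fault types of power equipment helps ensure safe operation and reduces losses to residents' production and daily life. Condition monitoring and fault diagnosis of power equipment must be carried out promptly so that equipment faults are detected early and catastrophic accidents are avoided.
(2) In power system big data, the monitoring data of various power equipment carry great commercial and social value; classifying and mining these high-value data with this method can yield more valuable information.
(3) When power equipment operates in extremely harsh conditions such as heavy fog, freezing rain, storms, and thunderstorms, it frequently sends alarm data to the monitoring center because monitored values exceed their limits, causing a burst of monitoring data at the monitoring center. Existing platforms cannot receive and process the data fast enough to meet practical requirements, so real-time performance is not achieved and data are lost or overwritten. The online parallel fault diagnosis method based on the Storm platform can process such data bursts promptly.
The method of the present invention includes the following steps:
1) Data access from the data source
The Spout component is the source of the entire topology and implements data access through the IRichSpout interface. The power grid feature vector data being accessed form a data stream without intervals; these feature vectors are continuously sent to the Spout source component to form the data stream to be processed. A Tuple is the unit of the data stream between components, and each Tuple should encapsulate an appropriate amount of data. Here each tuple encapsulates 1000 data items, called a data set; that is, each tuple encapsulates one data set, and the tuple is then sent to the pending queue. To make the diagnosis results easier to handle in later processing and to preserve the order of the tuples, each tuple sent, and thus the data set in it, is marked with a unique Id. The Id indicates the position of the tuple, or of the data set in the tuple, in the data stream.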
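The batching described in step 1) can be illustrated with a small Python generator. This is a sketch outside of Storm, under the assumption that the Id is simply a running counter starting at 0; the function name `batch_stream` and the choice to hold back an incomplete final batch are illustrative and not taken from the patent.

```python
from itertools import islice


def batch_stream(records, batch_size=1000):
    """Yield (tuple_id, data_set) pairs, each data set holding batch_size records.

    The id is a running counter, so it encodes the position of the data set
    in the stream. An incomplete trailing batch is held back (not emitted),
    mirroring the rule that a tuple is only formed once enough records exist.
    """
    iterator = iter(records)
    tuple_id = 0
    while True:
        data_set = list(islice(iterator, batch_size))
        if len(data_set) < batch_size:
            break  # not enough records yet for a full tuple
        yield tuple_id, data_set
        tuple_id += 1


if __name__ == "__main__":
    # 2500 fake records produce two full tuples; the last 500 records wait.
    for tuple_id, data_set in batch_stream(range(2500)):
        print(tuple_id, len(data_set), data_set[0])
```

Carrying the Id alongside the data set is what lets later stages aggregate results in stream order even when tuples are processed in parallel.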
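2) Standardization preprocessing of sample data
The downstream preprocessing component receives the tuples sent by the upstream Spout component. Each tuple encapsulates the samples to be diagnosed and contains 1000 feature vectors. The preprocessing component preprocesses the feature vector data in each received tuple. Taking transformer oil chromatography data for fault diagnosis as an example, the contents of five gases, H2, CH4, C2H6, C2H2, and C2H4, are selected as input and preprocessed. These values span a wide range, and data of the same type also differ considerably. To reduce the impact of these differences in magnitude, the input features are normalized according to the following formula before clustering. Normalization also reduces the number of iterations of the clustering body in the diagnosis model and improves the accuracy of clustering.
The formula is as follows:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x'$ ($x' \in [0,1]$) is the normalized data value, $x_{\min}$ is the minimum value of a given dimension of the tuple data, and $x_{\max}$ is the maximum value of that dimension. The normalized values of each dimension of the input feature vector set are used as the input samples of the diagnosis model; the DGA data input vector has the form $[x_1, x_2, x_3, x_4, x_5]^T$.
After the data of each tuple have been preprocessed, the downstream component receives the tuple passed from the preprocessing component PreBolt. The data in this tuple are the standardized form of the data set with the corresponding Id that was passed to PreBolt, and are called standardized samples to be classified. These samples take part in the subsequent fault diagnosis and are finally aggregated by Id.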
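As a concrete illustration of the per-dimension normalization in step 2), the following Python sketch rescales every column of one data set to [0, 1]. The function name and the handling of a constant column (mapped to 0.0 to avoid division by zero) are assumptions made for this example; the source does not say how that case is treated, and the gas values shown are made up.

```python
def min_max_normalize(data_set):
    """Rescale each dimension of a list of feature vectors to the range [0, 1]."""
    dims = len(data_set[0])
    lows = [min(row[d] for row in data_set) for d in range(dims)]
    highs = [max(row[d] for row in data_set) for d in range(dims)]
    normalized = []
    for row in data_set:
        normalized.append([
            # Assumption: a dimension with no spread is mapped to 0.0.
            (row[d] - lows[d]) / (highs[d] - lows[d]) if highs[d] > lows[d] else 0.0
            for d in range(dims)
        ])
    return normalized


if __name__ == "__main__":
    # Made-up readings for H2, CH4, C2H6, C2H2, C2H4 (uL/L), illustration only.
    samples = [
        [10.0, 5.0, 2.0, 0.0, 1.0],
        [50.0, 9.0, 4.0, 0.5, 3.0],
        [90.0, 7.0, 6.0, 1.0, 2.0],
    ]
    for row in min_max_normalize(samples):
        print(row)
```

The minimum and maximum are taken over the data set inside one tuple, which matches the definition of $x_{\min}$ and $x_{\max}$ above; a deployment that needs comparable scales across tuples would have to choose its bounds differently.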
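3) Selection of better initial cluster centers
After receiving a tuple from the preprocessing component PreBolt, the downstream component performs subtractive clustering on the standardized samples to be classified contained in the tuple, thereby obtaining better initial cluster centers.
Here the subtractive clustering process is encapsulated into one component, the SCMBolt component. This component receives the tuples passed by the upstream component, clusters the data in each tuple, and determines the cluster centers from the density values. The algorithm clusters very quickly, the resulting cluster centers are points of the original data, and the cluster centers lie as far apart as possible, which largely prevents the subsequent clustering algorithm from falling into a local optimum. When the subtractive clustering algorithm completes, the initial cluster centers are obtained and encapsulated, together with the corresponding Id and the standardized samples to be clustered associated with that Id, into one tuple, which is passed to the downstream component K-meansBolt.
The theoretical basis of subtractive clustering is as follows:
The Subtractive Clustering Method (SCM) is a density-based clustering algorithm.
The sample dimension is M and there are n sample points $(x_1, x_2, \ldots, x_n)$. When the dimension is high, all sample points are normalized into a hypercube. Every sample point is a candidate cluster center. The density index of sample point $x_i$ is defined as
$$D_i = \sum_{j=1}^{n} \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{(r_a/2)^2}\right)$$
where $r_a$ is a positive number that sets a neighborhood radius of the point; sample points outside this radius contribute very little to the point's density index.
After the density index of every sample point has been calculated, the sample point with the highest density index is selected as the first cluster center; $x_{c1}$ is the selected point and $D_{c1}$ is its density index. When selecting the next cluster center, the density index of each sample point $x_i$ is corrected by
$$D_i = D_i - D_{c1} \exp\left(-\frac{\lVert x_i - x_{c1} \rVert^2}{(r_b/2)^2}\right)$$
where $r_b$ is a positive number. The correction sharply reduces the density index of sample points close to the first cluster center $x_{c1}$, so these neighboring points are unlikely to become new cluster centers. The constant $r_b$ defines the neighborhood in which the density index is significantly reduced. Usually $r_b$ is larger than $r_a$, which prevents cluster centers from appearing very close together; in general, $r_b = 1.5 r_a$.
After the density indices of all sample points have been corrected, a new cluster center $x_{c2}$ is selected and the density indices of all sample points are corrected again. This process is repeated until enough cluster centers have appeared; the number of cluster centers can also be determined automatically according to a condition.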
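The procedure in step 3) can be sketched in plain Python as follows. The density and correction expressions used here are the standard subtractive clustering forms with the $(r/2)^2$ scaling, which is an assumption as far as the patent's exact formulas are concerned; the fixed number of centers `k`, the default `r_a=0.5`, and the function names are likewise choices made for this example.

```python
import math


def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def subtractive_clustering(samples, k, r_a=0.5, r_b=None):
    """Pick k cluster centers from the samples by repeated density correction."""
    if r_b is None:
        r_b = 1.5 * r_a  # the ratio suggested in the text
    alpha = 1.0 / (r_a / 2.0) ** 2
    beta = 1.0 / (r_b / 2.0) ** 2

    # Density index of every sample: sum of Gaussian-like contributions.
    density = [
        sum(math.exp(-alpha * squared_distance(x, y)) for y in samples)
        for x in samples
    ]

    centers = []
    for _ in range(min(k, len(samples))):
        best = max(range(len(samples)), key=density.__getitem__)
        best_density = density[best]
        centers.append(samples[best])
        # Reduce the density of every sample in proportion to its closeness
        # to the center just chosen, so nearby points are unlikely to win next.
        density = [
            d - best_density * math.exp(-beta * squared_distance(x, samples[best]))
            for d, x in zip(density, samples)
        ]
    return centers


if __name__ == "__main__":
    points = [
        [0.0, 0.0], [0.05, 0.0], [0.0, 0.05],
        [1.0, 1.0], [0.95, 1.0], [1.0, 0.95],
    ]
    print(subtractive_clustering(points, k=2))
```

The density computation is quadratic in the number of samples, which is manageable for data sets of 1000 samples per tuple. Every returned center is one of the input samples, as the text states.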
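4) Classification of the standardized samples to be classified
The upstream component SCMBolt passes the tuple to the downstream component K-meansBolt. The K-meansBolt component is the main body of the fault diagnosis model and implements the hard-clustering K-means algorithm internally; that is, it performs K-means clustering on the standardized samples to be clustered received from SCMBolt. During clustering, the cluster centers received from the upstream component serve as the initial cluster centers of K-means; the algorithm updates the cluster centers iteratively and finally obtains the clustering result.
Compared with a single K-meansBolt component, combining the K-meansBolt component with the SCMBolt component not only reduces the number of iterations of the K-means algorithm body but also improves its robustness when diagnosing the feature vector data of power grid equipment.
The steps of the original K-means algorithm are as follows:
a) Randomly select k different samples from the N sample data as the initial cluster centers.
b) Calculate the vector distance from every sample in the sample set to each cluster center, and assign each sample to the class of the center with the smallest distance. K-means usually uses the Euclidean distance to assign samples to classes:
$$d_{ij} = \sqrt{\sum_{k=1}^{n} (x_{ik} - y_{jk})^2}$$
where $d_{ij}$ is the Euclidean distance between point $x_i$ and point $y_j$, the coordinates of point $x_i$ are $(x_{i1}, x_{i2}, x_{i3}, \ldots, x_{in})$, and the coordinates of point $y_j$ are $(y_{j1}, y_{j2}, y_{j3}, \ldots, y_{jn})$.
c) Update the cluster centers: calculate the mean of all sample data in each class and use these means as the new cluster centers of the k classes.
d) Repeat steps b) and c) until the newly obtained cluster centers no longer change, or their offset from the previous cluster centers is smaller than a specified threshold, or the number of iterations reaches a specified limit; clustering stops when any one of these three conditions is met.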
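Steps b) to d), started from centers supplied by an upstream stage instead of random samples, can be sketched as below. This is a minimal illustration in plain Python rather than the K-meansBolt implementation; the parameter names `max_iter` and `tol` and the rule for empty clusters (keep the old center) are assumptions.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def kmeans(samples, initial_centers, max_iter=100, tol=1e-6):
    """K-means seeded with given centers; returns (centers, labels)."""
    centers = [list(c) for c in initial_centers]
    labels = []
    for _ in range(max_iter):  # stop condition: iteration limit
        # Step b): assign each sample to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: euclidean(x, centers[j]))
            for x in samples
        ]
        # Step c): recompute each center as the mean of its members.
        new_centers = []
        for j, old_center in enumerate(centers):
            members = [x for x, label in zip(samples, labels) if label == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(old_center)  # assumption: keep an empty cluster's center
        # Step d): stop when no center moved by more than the threshold.
        shift = max(euclidean(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, labels


if __name__ == "__main__":
    data = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
    seeds = [[0.0, 0.0], [10.0, 0.0]]  # e.g. centers chosen by subtractive clustering
    print(kmeans(data, seeds))
```

Seeding with well-separated centers typically means the first assignment is already close to the final one, which is the reduction in iterations that the text attributes to the combined components.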
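5) Saving and aggregating the calculation results
The Storm framework itself is not responsible for saving calculation results. Storage and aggregation of the results can be completed by implementing a Bolt, which writes the results directly to a data file or persists them in a database. As needed, the result handling of the fault diagnosis model is provided by DatabaseBolt and FileBolt. DatabaseBolt stores the model diagnosis results in a database, which makes it easy for the power industry and related industries to query and retrieve them, while the FileBolt component stores the diagnosis results in a data file that can be copied and migrated flexibly.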
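The two storage paths can be illustrated outside of Storm with Python's standard `csv` and `sqlite3` modules. This is only a sketch of the idea: the table name `diagnosis`, the column layout, and the in-memory result dictionary keyed by tuple Id are assumptions made here, and the patent does not name a database or file format.

```python
import csv
import io
import sqlite3


def write_results_csv(results, stream):
    """Write {tuple_id: [cluster labels]} to a CSV stream, ordered by tuple id."""
    writer = csv.writer(stream)
    writer.writerow(["tuple_id", "sample_index", "cluster"])
    for tuple_id, labels in sorted(results.items()):
        for index, label in enumerate(labels):
            writer.writerow([tuple_id, index, label])


def write_results_db(results, connection):
    """Persist the same results into a SQLite table (hypothetical schema)."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS diagnosis "
        "(tuple_id INTEGER, sample_index INTEGER, cluster INTEGER)"
    )
    rows = [
        (tuple_id, index, label)
        for tuple_id, labels in sorted(results.items())
        for index, label in enumerate(labels)
    ]
    connection.executemany("INSERT INTO diagnosis VALUES (?, ?, ?)", rows)
    connection.commit()


if __name__ == "__main__":
    # Results may arrive out of order when tuples are processed in parallel.
    results = {1: [0, 1], 0: [1, 1]}
    buffer = io.StringIO()
    write_results_csv(results, buffer)
    print(buffer.getvalue())
    conn = sqlite3.connect(":memory:")
    write_results_db(results, conn)
    print(conn.execute("SELECT * FROM diagnosis ORDER BY tuple_id").fetchall())
```

Sorting by tuple Id before writing restores the stream order, which is the purpose the Id serves in the earlier steps.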
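The method has the following features and functions:
(1) The classification accuracy of the algorithm is higher than that of the conventional K-means algorithm, so it can classify the streaming data of power grid equipment more accurately.
(2) The method suits the streaming data of power equipment, because these data are essentially unlabeled and the algorithm is a clustering algorithm that handles such samples well.
(3) The streaming data of power grid equipment are processed efficiently: the classification model is deployed on the Storm platform, and the efficiency of fault diagnosis is improved by adaptively configuring the number of processes and the parallelism of the source components and processing components.
The present invention can diagnose faults in the power data of power grid equipment online and in real time. Deploying a clustering algorithm on the Storm platform enables efficient classification of the streaming data of power grid equipment. In addition, the algorithm introduces subtractive clustering to obtain the initial cluster centers, which keeps the subsequent clustering algorithm from falling into a local optimum and thereby achieves accurate classification. By monitoring the fault types of power equipment, the method also helps ensure safe operation of the equipment, reduces losses to residents' production and daily life, and detects equipment faults early to avoid catastrophic accidents.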
下面是本发明的一个优选实施案例,包含了采用本发明方法的电网设备故障在线诊断,它的特征、目的和优点可以从实施例的说明中看出。
这里以变压器故障诊断为例,通过采集变压器油色谱数据,选取油中溶解气体H2,CH4,C2H6,C2H4,C2H2这5种气体的含量(uL/L)构成特征向量。并将这些特征向量数据不断的发送到storm平台上的故障诊断模型中,从而实现对变压器的在线诊断。
在数据处理之前,首先搭建storm云平台,分别为一个主节点和若干个从节点。这里用五台服务器组成物理集群,服务器之间用千兆交换机连接。
在正式电网流式数据处理之前,通过历史数据模拟在线电网监测数据流,并且该历史数据流量大于正式数据,从而通过吞吐量计算自适应配置出最优的进程数、源组件和逻辑处理组件的并发度,能够最大限度的处理后续正式电网流式数据。
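After the adaptive configuration is complete, processing of the live grid streaming data begins. The first step is data access: the Spout source component connects to the external data source. To prevent a skewed data set, the data collected are usually oil chromatography measurements taken before and after faults on transformers of the same model at multiple project sites. These data include both normal and fault data and are unlabeled samples. The raw records are then read into a buffer; when the number of records reaches the tuple size, i.e. 1000, they are encapsulated into one tuple, which is sent to the pending queue for subsequent processing. A suitable parallelism for the source component improves processing efficiency.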
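The buffering rule described here (collect records in arrival order, emit one tuple per 1000 records, tag each tuple with a unique ID) can be sketched as a Python generator. The name `batch_into_tuples` and the use of a random UUID as the ID are assumptions for illustration; the source does not say how IDs are generated.

```python
import uuid

def batch_into_tuples(records, batch_size=1000):
    """Yield (tuple_id, batch) pairs, each batch holding batch_size records in arrival order."""
    buffer = []
    for record in records:
        buffer.append(record)
        if len(buffer) == batch_size:
            yield uuid.uuid4().hex, buffer  # unique ID assumed to be a random UUID
            buffer = []
    # Records left in the buffer are not emitted: a batch is sent only once it is full.
```

Keeping the ID alongside the batch lets downstream components associate the computed cluster centers and results with the samples they came from.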
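Next comes data preprocessing. The Spout component sends tuples to the downstream preprocessing component PreBolt, which normalizes the received tuples. Normalization also reduces the number of iterations of the clustering main loop in the diagnosis model and improves clustering accuracy. After preprocessing, PreBolt assembles the data into a new tuple and sends it to the downstream component SCMBolt.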
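A minimal sketch of the normalization step, assuming per-dimension min-max scaling into [0, 1], x' = (x - x_min) / (x_max - x_min), applied column by column to the samples of one tuple. The function name `min_max_normalize` and the handling of constant columns are illustrative choices not stated in the source.

```python
import numpy as np

def min_max_normalize(x):
    """Scale each column of x (samples x features) into [0, 1]."""
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=0)
    span = x.max(axis=0) - x_min
    span[span == 0] = 1.0  # a constant column maps to 0 instead of dividing by zero
    return (x - x_min) / span
```

Scaling each gas concentration to the same range keeps a single large-valued gas from dominating the Euclidean distances used by the later clustering steps.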
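The classification module consists of two components, SCMBolt and K-meansBolt. SCMBolt implements the subtractive clustering algorithm: it receives tuples from the upstream component, clusters the data in each tuple, and determines cluster centers from density values. The algorithm is fast, the resulting cluster centers are points of the original data, and the centers lie as far apart as possible, which largely keeps the subsequent clustering algorithm from falling into a local optimum. When subtractive clustering finishes, the initial cluster centers are encapsulated in one tuple together with the corresponding Id and the standardized samples to be clustered under that Id, and passed to the downstream component K-meansBolt. K-meansBolt implements the hard-clustering K-means algorithm: it performs K-means clustering on the standardized samples received from SCMBolt, taking its initial cluster centers from SCMBolt and then obtaining the clustering result by iteration. Compared with a K-meansBolt component used alone, the combination of K-meansBolt and SCMBolt, when diagnosing feature-vector data from power grid equipment, not only reduces the number of iterations of the K-means main loop but also improves its robustness.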
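Finally, the results are saved and summarized: the diagnosis results are either written directly to a data file or persisted to a database. Depending on requirements, the fault diagnosis model handles results through DatabaseBolt or FileBolt. DatabaseBolt stores the diagnosis results in a database, which makes it convenient for the power industry and related sectors to query and retrieve them; FileBolt stores the diagnosis results in a data file that can be flexibly copied and migrated.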
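Embodiment 2:
This embodiment provides a parallel fault diagnosis system for power equipment based on hybrid clustering, comprising:
a platform deployment module, configured to set up the storm platform and deploy the machine learning network structure on it to obtain the fault diagnosis model;
an adaptive configuration module, configured to adaptively configure the parallelism of each component and the number of related processes on the storm platform according to historical grid data;
a data access module, configured to feed real-time grid data into the Spout source component of the storm platform through the IRichSpout interface, forming a data stream to be processed;
a data encapsulation module, configured to encapsulate the data stream to be processed into multiple Tuples in chronological order and to generate a unique ID for each Tuple;
a preprocessing module, configured to receive the Tuples with the PreBolt component and to preprocess the data set in each Tuple by the standard score method to obtain standardized samples;
a fault diagnosis module, configured to process the standardized samples with the fault diagnosis model to obtain the fault diagnosis result for the power equipment.
The system of this embodiment can be used to implement the method described in Embodiment 1.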
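Those skilled in the art will understand that embodiments of the present application may be provided as a method, a system, or a computer program product. The present application may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make improvements and variations without departing from the technical principles of the invention, and such improvements and variations shall also be regarded as falling within the scope of protection of the invention.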

Claims (10)

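  1. A parallel fault diagnosis method for power equipment based on hybrid clustering, characterized by comprising the following steps:
    adaptively configuring the parallelism of each component and the number of related processes on the storm platform according to historical grid data;
    feeding real-time grid data into the Spout source component of the storm platform through the IRichSpout interface to form a data stream to be processed;
    encapsulating the data stream to be processed into multiple Tuples in chronological order and generating a unique ID for each Tuple;
    receiving the Tuples with the PreBolt component and preprocessing the data set in each Tuple by the standard score method to obtain standardized samples;
    processing the standardized samples with the fault diagnosis model to obtain the fault diagnosis result for the power equipment.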
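  2. The parallel fault diagnosis method for power equipment based on hybrid clustering according to claim 1, characterized in that the parallelism of each component and the number of related processes on the storm platform are adaptively configured as follows:
    simulating the real-time grid data stream with historical grid data, wherein the flow rate of the historical grid data is greater than the expected flow rate of the real-time grid data;
    calculating, from the historical grid data, the data throughput of each component of the storm platform under different degrees of parallelism and different numbers of processes;
    adaptively configuring the component parallelism and number of processes with the lowest overhead among those whose data throughput meets the expected throughput.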
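  3. The parallel fault diagnosis method for power equipment based on hybrid clustering according to claim 1, characterized in that preprocessing the data set in each Tuple by the standard score method to obtain standardized samples comprises:
    normalizing the data according to the following formula:
    $$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
    In the above formula, x' (x' ∈ [0, 1]) is the normalized data value; x_min is the minimum value of a given dimension of the tuple data; x_max is the maximum value of that dimension.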
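  4. The parallel fault diagnosis method for power equipment based on hybrid clustering according to claim 1, characterized in that the fault diagnosis model is constructed by:
    deploying the subtractive clustering algorithm and the K-means clustering algorithm in the SCMBolt component and the K-meansBolt component respectively, connecting the SCMBolt component to the K-meansBolt component, and setting the parallelism of the components to obtain the fault diagnosis model.
  5. The parallel fault diagnosis method for power equipment based on hybrid clustering according to claim 4, characterized in that processing the standardized samples with the fault diagnosis model to obtain the fault diagnosis result for the power equipment comprises:
    determining better initial cluster centers for the standardized samples by the subtractive clustering algorithm;
    using the better initial cluster centers obtained by subtractive clustering as the initial cluster centers of the K-means algorithm and then clustering, thereby obtaining the fault diagnosis result for the sample data.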
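  6. The parallel fault diagnosis method for power equipment based on hybrid clustering according to claim 5, characterized in that determining better initial cluster centers for the standardized samples by the subtractive clustering algorithm comprises:
    the SCMBolt component receiving the tuples passed by the PreBolt component, performing subtractive clustering on the data in each tuple, and determining cluster centers from density values, the resulting cluster centers being points of the original data;
    when the subtractive clustering algorithm finishes, obtaining the initial cluster centers, encapsulating them in one tuple together with the corresponding Id and the standardized samples to be clustered under that Id, and passing the tuple to the downstream component K-meansBolt;
    the subtractive clustering method comprising:
    letting the sample dimension be M and the number of sample points be n, denoted (x_1, x_2, ..., x_n); when the dimension is high, normalizing all sample points into a hypercube; every sample point being a candidate cluster center; the density index of sample point x_i being defined as
    $$D_i = \sum_{j=1}^{n} \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{(r_a/2)^2}\right)$$
    in the above formula, r_a is a positive number that sets the neighborhood radius of the point;
    after the density index of every sample point has been computed, selecting the sample point with the highest density index as the first cluster center, x_c1 being the selected point and D_c1 its density index; before the next cluster center is selected, revising the density index of each sample point x_i by
    $$D_i \leftarrow D_i - D_{c1} \exp\left(-\frac{\lVert x_i - x_{c1} \rVert^2}{(r_b/2)^2}\right)$$
    in the above formula, r_b is a positive number;
    once the density indices of all sample points have been revised, selecting a new cluster center x_c2 and revising the density indices of all sample points again, and repeating this process until enough cluster centers have emerged, thereby obtaining the better initial cluster centers.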
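  7. The parallel fault diagnosis method for power equipment based on hybrid clustering according to claim 5, characterized in that using the better initial cluster centers obtained by subtractive clustering as the initial cluster centers of the K-means algorithm and then clustering comprises:
    the K-meansBolt component performing K-means clustering on the standardized samples to be clustered received from the upstream SCMBolt component, using the cluster centers passed from the upstream SCMBolt component as the initial cluster centers of the K-means clustering, updating the cluster centers iteratively, and finally obtaining the clustering result.
  8. The parallel fault diagnosis method for power equipment based on hybrid clustering according to claim 7, characterized in that using the cluster centers passed from the upstream SCMBolt component as the initial cluster centers of the K-means clustering and updating the cluster centers iteratively comprises:
    a) taking the cluster centers passed from the upstream SCMBolt component as the initial cluster centers of the K-means clustering;
    b) computing the vector distance from every sample in the standardized sample set to each initial cluster center, and assigning each sample to the class whose center is nearest;
    c) updating the cluster centers, i.e. computing the mean of all samples in each class and taking these means as the new cluster centers of the k classes;
    d) repeating steps b) and c) until the newly obtained cluster centers no longer change, or their offset from the previous centers is below a specified threshold, or the number of iterations reaches the specified limit, clustering being stopped as soon as any one of these three conditions is met;
    e) saving and summarizing the computation results.
  9. The parallel fault diagnosis method for power equipment based on hybrid clustering according to claim 8, characterized in that saving and summarizing the computation results comprises:
    storing the model's diagnosis results in a database through the DatabaseBolt component, which makes it convenient for the power industry and related sectors to query and retrieve them, or storing the diagnosis results in a data file through the FileBolt component, the file being flexibly copyable and migratable.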
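  10. A parallel fault diagnosis device for power equipment based on hybrid clustering, characterized by comprising:
    a platform deployment module, configured to set up the storm platform and deploy the machine learning network structure on it to obtain the fault diagnosis model;
    an adaptive configuration module, configured to adaptively configure the parallelism of each component and the number of related processes on the storm platform according to historical grid data;
    a data access module, configured to feed real-time grid data into the Spout source component of the storm platform through the IRichSpout interface, forming a data stream to be processed;
    a data encapsulation module, configured to encapsulate the data stream to be processed into multiple Tuples in chronological order and to generate a unique ID for each Tuple;
    a preprocessing module, configured to receive the Tuples with the PreBolt component and to preprocess the data set in each Tuple by the standard score method to obtain standardized samples;
    a fault diagnosis module, configured to process the standardized samples with the fault diagnosis model to obtain the fault diagnosis result for the power equipment.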
PCT/CN2023/074751 2022-07-07 2023-02-07 一种基于混合聚类的电力设备并行故障诊断方法及装置 WO2024007580A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210791423.4 2022-07-07
CN202210791423.4A CN115293236A (zh) 2022-07-07 2022-07-07 一种基于混合聚类的电力设备并行故障诊断方法及装置

Publications (1)

Publication Number Publication Date
WO2024007580A1 true WO2024007580A1 (zh) 2024-01-11

Family

ID=83822637

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/074751 WO2024007580A1 (zh) 2022-07-07 2023-02-07 一种基于混合聚类的电力设备并行故障诊断方法及装置

Country Status (2)

Country Link
CN (1) CN115293236A (zh)
WO (1) WO2024007580A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115293236A (zh) * 2022-07-07 2022-11-04 南京国电南自电网自动化有限公司 一种基于混合聚类的电力设备并行故障诊断方法及装置

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108416380A (zh) * 2018-02-28 2018-08-17 北京理工大学 一种减小客户流失风险的大数据聚类算法
CN110133410A (zh) * 2019-05-31 2019-08-16 国网河北省电力有限公司石家庄供电分公司 基于模糊c均值聚类算法的变压器故障诊断方法及系统
CN114330500A (zh) * 2021-11-30 2022-04-12 南京国电南自电网自动化有限公司 基于storm平台的电网电力设备在线并行诊断方法及系统
CN115293236A (zh) * 2022-07-07 2022-11-04 南京国电南自电网自动化有限公司 一种基于混合聚类的电力设备并行故障诊断方法及装置

Also Published As

Publication number Publication date
CN115293236A (zh) 2022-11-04

Similar Documents

Publication Publication Date Title
WO2020140560A1 (zh) 一种生产物流输送装备故障预警方法
CN106708016B (zh) 故障监控方法和装置
Alzghoul et al. Increasing availability of industrial systems through data stream mining
WO2020124779A1 (zh) 一种工况状态建模与修正模型方法
WO2020134361A1 (zh) 变电站二次设备状态评估方法、系统及设备
WO2020108159A1 (zh) 一种网络故障根因检测方法、系统及存储介质
CN112415331B (zh) 基于多源故障信息的电网二次系统故障诊断方法
WO2024007580A1 (zh) 一种基于混合聚类的电力设备并行故障诊断方法及装置
CN111177276A (zh) 一种基于Spark计算框架的动能数据处理系统及方法
CN110708318A (zh) 基于改进的径向基神经网络算法的网络异常流量预测方法
CN112557817A (zh) 一种基于量子免疫优化算法的有源配电网故障定位方法、系统、存储介质及计算机设备
CN115115090A (zh) 一种基于改进lstm-cnn的风功率短期预测方法
Qian et al. Grid-based Data Stream Clustering for Intrusion Detection.
CN114417971A (zh) 一种基于k近邻密度峰值聚类的电力数据异常值检测算法
CN112434923B (zh) 一种基于子空间聚类的机械产品质量分析方法
CN114330500B (zh) 基于storm平台的电网电力设备在线并行诊断方法及系统
Liang et al. Tabular data anomaly detection based on density peak clustering algorithm
Suresha Machine Learning for mining weather patterns and weather forecasting
Zhang et al. Towards unbiased training in federated open-world semi-supervised learning
CN113485878B (zh) 一种多数据中心故障检测方法
Qiao et al. Study on K-means method based on Data-Mining
Wang et al. A Novel Multi‐Input AlexNet Prediction Model for Oil and Gas Production
Wang et al. Software defect prediction incremental model using ensemble learning
Lin Retracted: Research on Optimization of Distributed Big Data Real-Time Management Method
Wang et al. Fault diagnosis of ship ballast water system based on support vector machine optimized by improved sparrow search algorithm

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23834365

Country of ref document: EP

Kind code of ref document: A1