CN114997256A - Method, device and storage medium for detecting abnormal power of wind farm - Google Patents

Method, device and storage medium for detecting abnormal power of wind farm

Info

Publication number
CN114997256A
Authority
CN
China
Prior art keywords
abnormal
data
power
data set
wind
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210185544.4A
Other languages
Chinese (zh)
Inventor
戴云泽
李建国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Dianji University
Original Assignee
Shanghai Dianji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Dianji University filed Critical Shanghai Dianji University
Priority to CN202210185544.4A priority Critical patent/CN114997256A/en
Publication of CN114997256A publication Critical patent/CN114997256A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Educational Administration (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • Health & Medical Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Testing Or Calibration Of Command Recording Devices (AREA)

Abstract

The invention relates to a method, device and storage medium for detecting abnormal power of a wind farm. The method comprises the following steps: step S1, acquiring a power and wind speed scatter data set D of the wind farm; step S2, identifying scattered abnormal data in the power and wind speed scatter data set with the local outlier factor (LOF) algorithm to obtain an intermediate data set D′; and step S3, identifying stacked abnormal data in the intermediate data set D′ with the isolation forest (IF) algorithm. Compared with the prior art, the method offers high accuracy and high efficiency in detecting abnormal data.

Description

Method, device and storage medium for detecting abnormal power of wind farm

Technical Field

The invention relates to the field of wind farm control, and in particular to a method, device and storage medium for detecting abnormal power of a wind farm.

Background

In actual operation, a wind farm produces abnormal data under the influence of factors such as a harsh operating environment, abnormal load shedding of units, and failures of measurement and communication equipment. At the same time, accurately obtaining data such as the wind speed and power of the wind farm in actual operation is the basis for predicting wind farm output, and the equivalent output curve of a wind farm based on historical data is also commonly used to evaluate its operating state. It is therefore important to identify abnormal wind power data effectively.

At present, abnormal data is mainly detected by the following methods: 1) anomaly detection based on a probabilistic statistical model, which judges whether data is abnormal by calculating the deviation between standard data and actual data, although the selection of the standard data set is to some extent blind and uncertain; 2) anomaly detection based on clustering, which divides samples without label information into several subsets according to certain rules and requirements, and treats sample points whose degree of membership in a subset is below a threshold as abnormal data; 3) anomaly detection based on density and distance, which counts the number of sample points within a unit region of a multi-dimensional space and does not require a large amount of training data.

The density-based anomaly detection algorithm above mainly calculates the anomaly level of each sample point in the data set, and shows outlier samples in data sets of different densities intuitively and concisely. However, for a given data set, it must obtain the Euclidean distance between each data object and all other points, as well as the local reachability distance of each data point. The amount of computation is large, and when the data set is large the algorithm runs inefficiently.

In view of the above, there is an urgent need for a method for detecting abnormal wind farm data that is both accurate and efficient.

Summary of the Invention

The purpose of the present invention is to overcome the above defects of the prior art by providing a method, device and storage medium for detecting abnormal power of a wind farm with high accuracy and high detection efficiency.

The purpose of the present invention can be achieved through the following technical solutions:

The invention provides a method for detecting abnormal power of a wind farm, the method comprising the following steps:

Step S1: acquire the power and wind speed scatter data set D of the wind farm;

Step S2: use the local outlier factor (LOF) algorithm to identify the scattered abnormal data in the power and wind speed scatter data set, obtaining an intermediate data set D′;

Step S3: use the isolation forest (IF) algorithm to identify the stacked abnormal data in the intermediate data set D′.

Preferably, the data of the power and wind speed scatter data set of the wind farm in step S1 includes scattered data and stacked data.

Preferably, step S2 is specifically: calculate the local outlier factor (LOF) of each data point in the data set, treat data points whose LOF value is higher than the LOF threshold as abnormal data, and remove that abnormal data.

Preferably, the local outlier factor LOF of a data point is the average of the ratios of the local reachability density of each data point in its k-distance neighborhood to the local reachability density of the data point itself.

Preferably, step S3 is specifically:

1) randomly sample the intermediate data set D′ to construct isolation binary trees (iTrees) and build the isolation forest model;

2) for any sample point, traverse each isolation binary tree from the root node until a leaf node is reached, calculate the mean path length E(h_x) of the sample point and the normalizing average path length c(p), and score the sample for anomaly based on the anomaly index function;

3) treat measured points whose anomaly score is higher than the score threshold as abnormal points and remove them.

Preferably, the expression of the anomaly index function is:

$$s(x, p) = 2^{-E(h_x)/c(p)}$$

where h_x is the path length of sample point x from the leaf node to the root node, E(h_x) is the mean path length of sample point x over all isolation binary trees (iTrees), and c(p) is the average path length of an isolation binary tree built from p samples.

Preferably, the expression for the path length h_x of a sample point from the leaf node to the root node is:

[Equation image BDA0003523211290000022: expression for the path length h_x]

Preferably, the expression for the average path length c(p) is:

$$c(p) = 2H_{p-1} - \frac{2(p-1)}{p}$$

where $H_{p-1} = \ln(p-1) + \xi$, ξ is Euler's constant, and p is the number of samples.
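The two quantities above can be evaluated directly. The following is a minimal sketch, assuming the standard isolation forest forms s = 2^(-E(h_x)/c(p)) and c(p) = 2H_{p-1} - 2(p-1)/p, with H_{p-1} approximated as ln(p-1) + ξ as stated in the text. The function names `average_path_length` and `anomaly_score` are illustrative and do not come from the patent.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, written as xi in the text


def average_path_length(p: int) -> float:
    """c(p) = 2*H_{p-1} - 2*(p-1)/p, with H_{p-1} ~ ln(p-1) + xi."""
    if p < 2:
        raise ValueError("c(p) needs at least two samples")
    harmonic = math.log(p - 1) + EULER_GAMMA
    return 2.0 * harmonic - 2.0 * (p - 1) / p


def anomaly_score(mean_path_length: float, p: int) -> float:
    """s = 2 ** (-E(h_x) / c(p)); shorter mean paths give scores nearer 1."""
    return 2.0 ** (-mean_path_length / average_path_length(p))
```

With p = 256, c(p) evaluates to roughly 10.24, so a point whose mean path length is also about 10.24 scores 0.5, and a point isolated after one or two cuts scores well above that.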

According to a second aspect of the present invention, an electronic device is provided, comprising a memory and a processor, the memory storing a computer program, wherein the processor implements any one of the methods described above when executing the program.

According to a third aspect of the present invention, a computer-readable storage medium is provided, on which a computer program is stored, wherein the program implements any one of the methods described above when executed by a processor.

Compared with the prior art, the present invention has the following advantages:

1) The invention overcomes two defects of existing algorithms: the isolation forest (IF) algorithm has difficulty identifying scattered abnormal data points, and the local outlier factor (LOF) algorithm performs poorly on large, stacked clusters of abnormal points. A detection algorithm combining LOF and IF classifies and detects abnormal data with different characteristics in the wind farm power and wind speed scatter, improving both the accuracy and the efficiency of abnormal data detection;

2) The isolation forest algorithm adopted by the invention partitions the data set repeatedly through isolation binary trees, and thereby accurately identifies stacked abnormal data with high distribution density.

Description of the Drawings

Fig. 1 is a flow chart of the method of the present invention;

Fig. 2 is a schematic diagram of the reachability distance in the LOF algorithm;

Fig. 3 shows the distribution of abnormal wind farm data;

Fig. 4 shows the abnormal data detection results of the LOF algorithm;

Fig. 5 shows the abnormal data detection results of the IF algorithm.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.

Embodiment

As shown in Fig. 1, this embodiment provides a method for detecting abnormal power of a wind farm, the method comprising the following steps:

Step S1: acquire the power and wind speed scatter data set D of the wind farm, which includes scattered data and stacked data;

Step S2: use the local outlier factor (LOF) algorithm to identify the scattered abnormal data in the power and wind speed scatter data set, obtaining an intermediate data set D′, specifically:

Calculate the local outlier factor (LOF) of each data point in the data set, treat data points whose LOF value is higher than the LOF threshold as abnormal data, and remove that abnormal data. Here the local outlier factor of a data point is the average of the ratios of the local reachability density of each data point in its k-distance neighborhood to the local reachability density of the data point itself.

Step S3: use the isolation forest (IF) algorithm to identify the stacked abnormal data in the intermediate data set D′, including the following sub-steps:

1) randomly sample the intermediate data set D′ to construct isolation binary trees (iTrees) and build the isolation forest model;

2) for any sample point, traverse each isolation binary tree from the root node until a leaf node is reached, calculate the mean path length E(h_x) of the sample point and the normalizing average path length c(p), and score the sample for anomaly based on the anomaly index function;

The expression of the anomaly index function is:

$$s(x, p) = 2^{-E(h_x)/c(p)}$$

where h_x is the path length of sample point x from the leaf node to the root node, E(h_x) is the mean path length of sample point x over all isolation binary trees (iTrees), and c(p) is the average path length of an isolation binary tree built from p samples.

The expression for the path length h_x of a sample point from the leaf node to the root node is:

[Equation image BDA0003523211290000042: expression for the path length h_x]

The expression for the average path length c(p) is:

$$c(p) = 2H_{p-1} - \frac{2(p-1)}{p}$$

where $H_{p-1} = \ln(p-1) + \xi$, ξ is Euler's constant, and p is the number of samples.

3) treat measured points whose anomaly score is higher than the score threshold as abnormal points and remove them.
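Steps S1 to S3 can be sketched as a two-stage filter in which the LOF stage produces the intermediate set D′ and the isolation forest stage scores only D′. This is a minimal sketch under stated assumptions: the scoring functions are passed in as callables, the name `two_stage_filter` is illustrative, and the isolation forest score threshold of 0.6 is hypothetical because the text does not give a value. The LOF threshold default of 1.5 is the one used in the embodiment.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]  # (wind speed, power); the ordering is an assumption
Scorer = Callable[[Sequence[Point]], Sequence[float]]


def two_stage_filter(
    data: Sequence[Point],
    lof_scorer: Scorer,
    if_scorer: Scorer,
    lof_threshold: float = 1.5,  # LOF threshold used in the embodiment
    if_threshold: float = 0.6,   # hypothetical; the text gives no value
) -> Tuple[List[Point], List[Point], List[Point]]:
    """Return (clean, scattered_anomalies, stacked_anomalies)."""
    # Step S2: remove points whose LOF exceeds the threshold, giving D'.
    lof = lof_scorer(data)
    scattered = [p for p, s in zip(data, lof) if s > lof_threshold]
    intermediate = [p for p, s in zip(data, lof) if s <= lof_threshold]

    # Step S3: score only the intermediate set D' with the isolation forest.
    iso = if_scorer(intermediate)
    stacked = [p for p, s in zip(intermediate, iso) if s > if_threshold]
    clean = [p for p, s in zip(intermediate, iso) if s <= if_threshold]
    return clean, scattered, stacked
```

Passing the scorers as arguments keeps the ordering of the two stages explicit: the isolation forest never sees the points that the LOF stage has already removed.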

Next, the local outlier factor (LOF) algorithm and the isolation forest (IF) algorithm used by the invention are described in detail.

1. Anomaly detection based on the local outlier factor (LOF) is a typical density-based outlier detection method, in which the local outlier factor characterizes how abnormal a data point is.

Definition 1: k-distance. Among the points nearest to a data point, the distance between the k-th nearest point and the data point;

Definition 2: k-distance neighborhood. The set of all data points whose distance from the data point is less than or equal to its k-distance, that is, the set of points within the region centered on the data point with the k-distance as radius;

Definition 3: reachability distance. If data points p and o are close together, the reachability distance from o to p is the k-distance of o; when the two are far apart, the reachability distance is their actual distance d(p, o), as shown in Fig. 2. Equivalently, it is the larger of the k-distance of o and d(p, o);

Definition 4: local reachability density. The local reachability density of a data point p is the reciprocal of the average reachability distance between p and the objects in its k-distance neighborhood;

Definition 5: local outlier factor. For a given k and data point p, the local outlier factor is the average of the ratios of the local reachability density of each data point in the k-distance neighborhood of p to the local reachability density of p. The local outlier factor directly characterizes how likely a data point is to be an anomaly. An LOF value equal to or slightly below 1 indicates that the local reachability density of the data point is close to that of the other points in its neighborhood, so the point is unlikely to be an outlier; the larger the LOF value, the more isolated the point and the more likely it is to be an anomaly.
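Definitions 1 to 5 translate almost line by line into code. The sketch below is a brute-force illustration in pure Python, not the patent's implementation: the function name `lof_scores` is an assumption, all pairwise Euclidean distances are computed (the quadratic cost noted in the background section), and the data is assumed to contain no group of more than k identical points, which would make a reachability distance zero.

```python
import math


def lof_scores(points, k):
    """Local outlier factor of every point, following Definitions 1 to 5."""
    n = len(points)
    dist = [[math.dist(a, b) for b in points] for a in points]

    k_dist, neigh = [], []
    for i in range(n):
        others = sorted((dist[i][j], j) for j in range(n) if j != i)
        kd = others[k - 1][0]                      # Definition 1: k-distance
        k_dist.append(kd)
        # Definition 2: every point within the k-distance (ties included)
        neigh.append([j for d, j in others if d <= kd])

    def reach(i, j):
        # Definition 3: larger of the k-distance of j and the actual distance
        return max(k_dist[j], dist[i][j])

    # Definition 4: reciprocal of the average reachability distance
    lrd = [len(neigh[i]) / sum(reach(i, j) for j in neigh[i]) for i in range(n)]

    # Definition 5: average ratio of neighbor density to own density
    return [
        sum(lrd[j] for j in neigh[i]) / (len(neigh[i]) * lrd[i])
        for i in range(n)
    ]
```

On a regular grid of points with one far-away point added, the interior grid points come out at an LOF of 1 and the far-away point at a value far above the 1.5 threshold used in the embodiment.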

2. The isolation forest (Isolation Forest, IF) algorithm is an anomaly detection algorithm based on an ensemble-learning tree model (TreeEmbedding). It defines points that are sparsely distributed and far from densely populated regions as outliers. Its basic principle is as follows: the data space is split by a random hyperplane, and each resulting subspace is split again recursively at random. The cutting is repeated until each subspace contains only one sample point, that is, until every data point has been "isolated". Data clusters with high distribution density therefore need many cuts before they are isolated, whereas sparsely distributed sample points are separated after only a few cuts.
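The recursive cutting described above can be illustrated with a small pure-Python sketch. It assumes axis-aligned random cuts (a random feature and a random cut position between that feature's minimum and maximum), which is the usual way an isolation tree realizes the random hyperplane. The names `build_itree`, `path_length` and `mean_path_length` and the depth limit are illustrative assumptions, and the path length here is a plain edge count without any correction for leaves that still hold several points.

```python
import random


def build_itree(points, rng, depth=0, max_depth=50):
    """Recursively split points with random axis-aligned cuts."""
    if len(points) <= 1 or depth >= max_depth:
        return {"size": len(points)}            # external (leaf) node
    axis = rng.randrange(len(points[0]))        # random feature
    lo = min(p[axis] for p in points)
    hi = max(p[axis] for p in points)
    if lo == hi:                                # no cut possible on this axis
        return {"size": len(points)}
    split = rng.uniform(lo, hi)                 # random cut position
    left = [p for p in points if p[axis] < split]
    right = [p for p in points if p[axis] >= split]
    return {
        "axis": axis,
        "split": split,
        "left": build_itree(left, rng, depth + 1, max_depth),
        "right": build_itree(right, rng, depth + 1, max_depth),
    }


def path_length(tree, x):
    """Number of edges from the root to the external node x falls into."""
    edges = 0
    node = tree
    while "axis" in node:
        node = node["left"] if x[node["axis"]] < node["split"] else node["right"]
        edges += 1
    return edges


def mean_path_length(data, x, n_trees=100, sample_size=256, seed=0):
    """Average path length of x over n_trees trees built on random samples."""
    rng = random.Random(seed)
    size = min(sample_size, len(data))
    total = 0
    for _ in range(n_trees):
        sample = rng.sample(data, size)
        total += path_length(build_itree(sample, rng), x)
    return total / n_trees
```

A point far from a dense cluster is usually cut off within the first few splits, so its mean path length is much shorter than that of a point in the middle of the cluster.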

The abnormal data detection process of the algorithm is as follows:

The path length h(x) of a measured point x is defined as the number of edges traversed from the root node of an isolation binary tree (iTree) until an external node is reached. The measured point is passed through every iTree, its average path length is calculated, and an anomaly score is assigned. The anomaly score s takes values in (0, 1) and is interpreted as follows:

1) when s is close to 1, the branch (path) of the data point in the iTrees is short, and the data is more likely to be judged abnormal;

2) when s is close to 0, the branch (path) of the data point in the iTrees is long, and the data is less likely to be judged abnormal;

3) when s is close to 0.5, the data point shows no distinct anomalous features, and it cannot be determined whether it is abnormal.
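The three cases can be read off the score function directly. The short derivation below assumes the standard isolation forest score, in which the mean path length E(h_x) is normalized by c(p); it is a worked illustration rather than text from the patent.

```latex
\begin{align*}
s(x, p) &= 2^{-E(h_x)/c(p)} \\
E(h_x) \to 0 &\;\Rightarrow\; s \to 2^{0} = 1 \\
E(h_x) = c(p) &\;\Rightarrow\; s = 2^{-1} = 0.5 \\
E(h_x) \gg c(p) &\;\Rightarrow\; s \to 0
\end{align*}
```

A score of 0.5 therefore corresponds to a point whose mean path length equals the average for a tree of that size, which is why such a point cannot be classified either way.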

To verify the effect of the proposed abnormal data detection method, this embodiment performs anomaly detection on 2,880 groups of data collected at a 15-minute sampling interval from February 1, 2012 to February 30, 2012 at a wind farm with a rated capacity of 16 MW.

The algorithm parameters are set as follows. In the LOF algorithm, k is 10 and the LOF threshold is 1.5. In the IF algorithm, the number of trees (the ensemble size) is 100, and the number of branches, that is, the sampling size of each tree, is 256.

The experimental results in Figs. 3 to 5 show that the LOF algorithm identifies the scattered T3-type abnormal data around the curve well. The other two types of abnormal data accumulate locally and are densely distributed within certain regions; the LOF algorithm identifies these poorly and misses many of them. For the few points with the highest wind speed, the power is in fact not abnormal, but because these points are few and sparse, the LOF algorithm falsely reports them as anomalies. The IF algorithm separates data points by partitioning the data set repeatedly in the tree model, so that abnormal data stands out by its degree of isolation in the trees. Fig. 5 shows that the IF algorithm identifies densely distributed abnormal data well, although it also produces false positives by judging some normal data at the edges of data clusters to be abnormal.

The wind power abnormal data detection method adopted by the invention, which combines the LOF and IF algorithms, identifies the different types of abnormal wind farm data well.

The electronic device of the present invention includes a central processing unit (CPU), which can perform various appropriate actions and processing according to computer program instructions stored in a read-only memory (ROM) or loaded from a storage unit into a random access memory (RAM). The RAM can also store the various programs and data required for the operation of the device. The CPU, ROM and RAM are connected to one another through a bus. An input/output (I/O) interface is also connected to the bus.

Multiple components in the device are connected to the I/O interface, including: an input unit, such as a keyboard or mouse; an output unit, such as various types of displays and speakers; a storage unit, such as a magnetic disk or optical disc; and a communication unit, such as a network card, modem or wireless communication transceiver. The communication unit allows the device to exchange information and data with other devices through a computer network such as the Internet and/or various telecommunication networks.

The processing unit executes the methods and processing described above, for example methods S1 to S3. In some embodiments, methods S1 to S3 can be implemented as a computer software program tangibly embodied in a machine-readable medium, such as a storage unit. In some embodiments, part or all of the computer program can be loaded and/or installed onto the device via the ROM and/or the communication unit. When the computer program is loaded into the RAM and executed by the CPU, one or more steps of methods S1 to S3 described above can be performed. Alternatively, in other embodiments, the CPU can be configured to perform methods S1 to S3 in any other suitable manner (for example, by means of firmware).

The functions described above can be performed, at least in part, by one or more hardware logic components. For example, and without limitation, types of hardware logic components that can be used include: field-programmable gate arrays (FPGA), application-specific integrated circuits (ASIC), application-specific standard products (ASSP), systems on chip (SOC), complex programmable logic devices (CPLD), and so on.

Program code for implementing the method of the present invention can be written in any combination of one or more programming languages. The program code can be provided to a processor or controller of a general-purpose computer, special-purpose computer or other programmable data processing apparatus, so that the program code, when executed by the processor or controller, causes the functions and operations specified in the flowcharts and/or block diagrams to be carried out. The program code can execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on a remote machine or server.

In the context of the present invention, a machine-readable medium can be a tangible medium that contains or stores a program for use by, or in connection with, an instruction execution system, apparatus or device. The machine-readable medium can be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium can include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any suitable combination of these. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of these.

The above are only specific embodiments of the present invention, and the protection scope of the present invention is not limited to them. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed by the present invention, and these modifications or replacements shall fall within the protection scope of the present invention. The protection scope of the present invention shall therefore be determined by the protection scope of the claims.

Claims (10)

1. A method for detecting abnormal power of a wind farm, characterized by comprising the following steps:
step S1, acquiring a power and wind speed scatter data set D of the wind farm;
step S2, identifying scattered abnormal data in the power and wind speed scatter data set by using the local outlier factor (LOF) algorithm to obtain an intermediate data set D′;
step S3, identifying stacked abnormal data in the intermediate data set D′ by using the isolation forest (IF) algorithm.
2. The method for detecting abnormal power of a wind farm according to claim 1, wherein the data of the power and wind speed scatter data set of the wind farm in step S1 comprises scattered data and stacked data.
3. The method for detecting abnormal power of a wind farm according to claim 1, wherein step S2 specifically comprises: calculating the local outlier factor (LOF) of each data point in the data set, regarding data points whose LOF value is higher than the LOF threshold as abnormal data, and removing the abnormal data.
4. The method for detecting abnormal power of a wind farm according to claim 3, wherein the local outlier factor LOF of a data point is the average of the ratios of the local reachability density of each data point in its k-distance neighborhood to the local reachability density of the data point itself.
5. The method for detecting abnormal power of a wind farm according to claim 1, wherein step S3 specifically comprises:
1) randomly sampling the intermediate data set D′ to construct isolation binary trees (iTrees) and build an isolation forest model;
2) for any sample point, traversing each isolation binary tree from the root node until a leaf node is reached, calculating the mean path length E(h_x) of the sample point and the normalizing average path length c(p), and scoring the sample for anomaly based on an anomaly index function;
3) regarding measured points whose anomaly score is higher than the score threshold as abnormal points and removing them.
6. The method for detecting abnormal power of a wind farm according to claim 5, characterized in that the expression of the anomaly index function is:
$$s(x, p) = 2^{-E(h_x)/c(p)}$$
where h_x is the path length of sample point x from the leaf node to the root node, E(h_x) is the mean path length of sample point x over all isolation binary trees (iTrees), and c(p) is the average path length of an isolation binary tree built from p samples.
7. The method for detecting abnormal power of a wind farm according to claim 6, wherein the expression for the path length h_x of the sample point from the leaf node to the root node is:
[Equation image FDA0003523211280000021: expression for the path length h_x]
8. The method for detecting abnormal power of a wind farm according to claim 6, wherein the average path length c(p) is expressed as:
$$c(p) = 2H_{p-1} - \frac{2(p-1)}{p}$$
where $H_{p-1} = \ln(p-1) + \xi$, $\xi$ is Euler's constant, and p is the number of samples.
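A minimal sketch of the scoring arithmetic in claims 6 and 8 is given below. It assumes the standard isolation forest expressions, namely a score of 2 raised to the power -E(h_x)/c(p) and c(p) = 2H_{p-1} - 2(p-1)/p, which are consistent with the symbol definitions given in these claims. The function names and the threshold value 0.6 are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler's constant, written as xi in claim 8

def average_path_length(p):
    """c(p) = 2*H_{p-1} - 2*(p-1)/p, with H_{p-1} approximated by ln(p-1) + xi."""
    if p < 2:
        raise ValueError("c(p) needs at least two samples")
    harmonic = math.log(p - 1) + EULER_GAMMA
    return 2.0 * harmonic - 2.0 * (p - 1) / p

def anomaly_score(mean_path_length, p):
    """s = 2 ** (-E(h_x) / c(p)); values near 1 suggest an anomaly."""
    return 2.0 ** (-mean_path_length / average_path_length(p))

def is_abnormal(mean_path_length, p, threshold=0.6):
    """Flag a point whose score exceeds an assumed scoring threshold."""
    return anomaly_score(mean_path_length, p) > threshold
```

Under these assumed forms, a point whose mean path length equals c(p) scores exactly 0.5, shorter paths push the score toward 1, and longer paths push it toward 0.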
9. An electronic device comprising a memory and a processor, the memory having stored thereon a computer program, wherein the processor, when executing the program, implements the method of any of claims 1-8.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1 to 8.
CN202210185544.4A 2022-02-28 2022-02-28 Method, device and storage medium for detecting abnormal power of wind farm Pending CN114997256A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210185544.4A CN114997256A (en) 2022-02-28 2022-02-28 Method, device and storage medium for detecting abnormal power of wind farm

Publications (1)

Publication Number Publication Date
CN114997256A (en) 2022-09-02

Family

ID=83024098

Country Status (1)

Country Link
CN (1) CN114997256A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020010701A1 (en) * 2018-07-11 2020-01-16 平安科技(深圳)有限公司 Pollutant anomaly monitoring method and system, computer device, and storage medium
CN111340063A (en) * 2020-02-10 2020-06-26 北京华电天仁电力控制技术有限公司 Coal mill data anomaly detection method
CN113298297A (en) * 2021-05-10 2021-08-24 内蒙古工业大学 Wind power output power prediction method based on isolated forest and WGAN network
CN113886375A (en) * 2021-09-29 2022-01-04 东北电力大学 A wind power data cleaning method based on isolated forest and local outlier factors

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111340063A (en) * 2020-02-10 2020-06-26 北京华电天仁电力控制技术有限公司 Coal mill data anomaly detection method
CN111340063B (en) * 2020-02-10 2023-08-29 国能信控互联技术有限公司 Data anomaly detection method for coal mill
CN116365519A (en) * 2023-06-01 2023-06-30 国网山东省电力公司微山县供电公司 Method, system, storage medium and equipment for electric load forecasting
CN116365519B (en) * 2023-06-01 2023-09-26 国网山东省电力公司微山县供电公司 Power load prediction method, system, storage medium and equipment
CN116859902A (en) * 2023-09-04 2023-10-10 西安热工研究院有限公司 Database abnormal point detection method and system for hydropower control system
CN118051796A (en) * 2024-04-16 2024-05-17 自贡市第一人民医院 Intelligent analysis method for monitoring data of disinfection supply center
CN118051796B (en) * 2024-04-16 2024-06-18 自贡市第一人民医院 Intelligent analysis method for monitoring data of disinfection supply center

Similar Documents

Publication Publication Date Title
CN114997256A (en) Method, device and storage medium for detecting abnormal power of wind farm
CN117591836B (en) Pipeline detection data analysis method and related device
CN109257383B (en) BGP anomaly detection method and system
CN111157850B (en) Mean value clustering-based power grid line fault identification method
CN108304567B (en) Method and system for identifying working condition mode and classifying data of high-voltage transformer
CN116167010A (en) Rapid identification method for abnormal events of power system with intelligent transfer learning capability
CN109902731B (en) A method and device for detecting performance faults based on support vector machines
CN116610938B (en) Method and equipment for detecting unsupervised abnormality of semiconductor manufacture in curve mode segmentation
CN112418355A (en) Method and system for carrying out feature analysis on abnormal points based on isolated forest algorithm
CN112949735A (en) Liquid hazardous chemical substance volatile concentration abnormity discovery method based on outlier data mining
CN115705279A (en) An intelligent fault early warning method and device based on index data
CN111522705A (en) A solution for intelligent operation and maintenance of industrial big data
CN112884167B (en) Multi-index anomaly detection method based on machine learning and application system thereof
CN118228001A (en) Platform architecture based on big data of computer
CN116365519B (en) Power load prediction method, system, storage medium and equipment
CN116910592A (en) Log detection method, device, electronic equipment and storage medium
CN115238779A (en) A kind of abnormal detection method, device, equipment and medium of cloud disk
CN115952413A (en) Abnormal battery box detection method and device based on isolated forest and electronic equipment
CN108629441A (en) Prediction technique and device based on clustering and the improved fan noise of small echo
CN114943247A (en) Waveform identification method and device for performance index time sequence data
CN115484048A (en) Intrusion behavior detection method and device based on cloud environment
CN112132173A (en) Transformer unsupervised running state identification method based on clustering feature tree
CN117194963B (en) Industrial FDC quality root cause analysis method, device and storage medium
CN119151302B (en) Method and system for managing and controlling settlement inspection risk of transformer substation building engineering
CN118917554B (en) Power grid asset monitoring and analyzing method, system and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination