CN117851877A - Wind power blade fracture early warning method and device based on SCADA data association analysis and readable storage medium - Google Patents


Publication number
CN117851877A
Authority
CN
China
Legal status
Pending
Application number
CN202311540515.6A
Other languages
Chinese (zh)
Inventor
王岁岁
周勃
常丽
孙宁
Current Assignee
Shenyang University of Technology
Original Assignee
Shenyang University of Technology

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques


Abstract

The invention provides a wind turbine blade fracture early-warning method based on association analysis of SCADA data. SCADA data recorded before and after a blade fracture are reduced in dimension to make a first selection of the parameters related to the blade operating state. For SCADA data recorded during normal blade operation, the raw data of the selected parameters are cleaned and extracted, the extracted data are divided into a training set and a validation set, and the training set is used to build an NSET model. Validation against historical data and a further round of parameter screening give the final set of parameters related to the blade operating state. Real-time observations are then input to the model, and the Euclidean distance curve obtained from the model's output residuals is used to judge the current operating condition of the blade. The invention derives blade fracture warning thresholds from existing SCADA data, which makes online real-time monitoring possible.

Description

Wind turbine blade fracture early-warning method, device, and readable storage medium based on SCADA data association analysis

Technical Field

The present invention relates to the technical field of wind turbine blade monitoring, and in particular to a wind turbine blade fracture early-warning method based on SCADA data association analysis, a computer device, and a computer-readable storage medium.

Background

Wind turbine blades are the key components that convert wind energy into mechanical energy and ultimately drive the generator. Among the components of wind power equipment, blades account for the largest share of maintenance costs, 29.55%, and for more than 20% of the value of the whole turbine. Blades operate under harsh conditions. Although they are designed and tested for a service life of 20 to 25 years, experience shows that reaching this life requires preventive maintenance. As wind power generation develops and electricity demand grows, turbines are becoming larger, and longer blades carry a higher risk of fracture. A blade fracture may cause the turbine itself to burn out, and flying blade debris may damage other turbines, which causes large economic losses and adds safety hazards. Monitoring blades for fracture is therefore particularly important.

Wind turbines in China are becoming larger and are deployed at greater scale. Because the blades of different turbine models differ greatly in airfoil, structure, material, and shape, it is difficult to define a single quantitative standard for their degree of damage, and the failure fracture threshold also changes dynamically under different wind farm conditions. For real-time fracture monitoring of large blades, setting the fracture threshold dynamically is therefore a difficult problem. In addition, the technology for molding blades and smart sensors as one piece is not yet mature. Sensors added to the internal cavity or the outer skin of a very large blade tend to fall off or fail, and the cost of such a blade monitoring system cannot be controlled. A blade fracture early-warning method should therefore take into account both the dynamic change of the fracture alarm threshold and the cost, so that it applies more generally and more widely.

At present, most wind turbine monitoring systems in China and abroad use model-free artificial intelligence algorithms that depend on large amounts of sample data and on model accuracy. The selection of modeling parameters often has no stated basis, and the computation lacks reliability analysis and validation. In particular, blade monitoring systems that collect existing SCADA sensor signals face data-cleaning problems such as large data volume, high dimensionality, and many redundant variables. These problems lower the efficiency of online monitoring and the accuracy of the model, and make real-time online monitoring difficult.

Summary of the Invention

The present invention aims to solve at least one of the technical problems existing in the prior art or related art.

To this end, the first objective of the present invention is to provide a wind turbine blade fracture early-warning method based on association analysis of SCADA data.

The second objective of the present invention is to provide a computer device.

The third objective of the present invention is to provide a computer-readable storage medium.

To achieve the above objectives, the technical solution of the first aspect of the present invention provides a wind turbine blade fracture early-warning method based on association analysis of SCADA data. The SCADA data are collected by the sensor group of the wind turbine's SCADA system and include the following parameters: wind turbine power, generator speed, rotor speed, blade angle, grid-side current, and wind speed. The early-warning method comprises:

obtaining a first group of SCADA data covering a period before and after a wind turbine blade fracture;

screening the first group of SCADA data with the chi-square test to select the parameters related to the blade operating state;

adding wind speed to the parameters selected by the chi-square test, which gives the first-screening set of parameters related to the blade operating state;

obtaining a second group of SCADA data covering a period of normal blade operation;

extracting from the second group of SCADA data the raw data of the first-screening parameters;

constructing a wind speed-power scatter plot from the wind speed data and the wind turbine power data in the extracted second group of SCADA data;

cleaning the data of the wind speed-power scatter plot with the DBSCAN clustering algorithm;

dividing the cleaned second group of SCADA data into a first training set and a first validation set;

building, from the first training set and the first validation set, a first NSET model of the blade in the normal operating state;

inputting the raw first group of SCADA data into the first NSET model, computing the residual E_n between the predicted values output by the first NSET model and the corresponding observed values, and constructing from the predicted and observed values the first Euclidean distance curve of the first group of SCADA data; this curve begins to rise only at a position some time after the fracture point;

computing from the residual E_n the cumulative fault contribution rate e_i of each parameter selected by the chi-square test;

removing, from the parameters selected by the chi-square test, those whose cumulative fault contribution rate e_i is below a set value, and adding wind speed to the remaining parameters, which gives the second-screening set of parameters related to the blade operating state;

extracting from the cleaned second group of SCADA data the data of the second-screening parameters, and dividing the extracted data into a second training set and a second validation set;

building, from the second training set and the second validation set, a second NSET model of the blade in the normal operating state;

obtaining the Euclidean distance curve of the second validation set and taking the maximum Euclidean distance on that curve as the first threshold;

inputting the raw first group of SCADA data into the second NSET model, computing the residual E_n' between the predicted values output by the second NSET model and the corresponding observed values, and constructing from the predicted and observed values the second Euclidean distance curve of the first group of SCADA data; this curve begins to rise at the fracture point;

taking the maximum Euclidean distance on the second Euclidean distance curve before the fracture point as the second threshold;

taking the maximum Euclidean distance on the second Euclidean distance curve after the fracture point as the third threshold;

acquiring in real time the SCADA data collected by the sensor group of the wind turbine's SCADA system;

inputting the real-time SCADA data into the second NSET model, computing the residual E_n'' between the predicted values output by the second NSET model and the corresponding observed values, and constructing from the predicted and observed values the third Euclidean distance curve of the real-time SCADA data;

issuing a fracture warning for the wind turbine blade according to the Euclidean distance values on the third Euclidean distance curve and the first, second, and third thresholds.
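The distance curve and the three thresholds described above can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: the function names, the representation of each time step as a list of parameter values, and the choice to count the fracture index itself as "after the fracture point" are all assumptions made here.

```python
import math

def distance_curve(observed, predicted):
    """Euclidean distance between observed and predicted vectors, one value per time step.

    Assumption: the 'Euclidean distance curve' is the norm of the residual
    vector at each time step.
    """
    return [math.dist(o, p) for o, p in zip(observed, predicted)]

def derive_thresholds(validation_dist, fracture_dist, break_index):
    """Return (first, second, third) thresholds.

    validation_dist: distance curve of the second validation set (normal operation)
    fracture_dist:   second distance curve, from data around a known fracture
    break_index:     hypothetical index of the fracture point in fracture_dist
    """
    if not 0 < break_index < len(fracture_dist):
        raise ValueError("break_index must split the curve into two non-empty parts")
    first = max(validation_dist)                # largest distance seen in normal operation
    second = max(fracture_dist[:break_index])   # largest distance before the fracture
    third = max(fracture_dist[break_index:])    # largest distance from the fracture onward
    return first, second, third
```

For example, `derive_thresholds([0.1, 0.3, 0.2], [0.2, 0.5, 0.4, 2.0, 3.5], 3)` gives the three maxima of the respective segments.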

Preferably, issuing a fracture warning for the wind turbine blade according to the Euclidean distance values on the third Euclidean distance curve and the first, second, and third thresholds specifically includes: determining the range in which each Euclidean distance value on the third Euclidean distance curve falls. When a value is greater than or equal to 0 and less than or equal to the first threshold, the blade is judged to be in the normal operating state. When a value is greater than the first threshold and less than or equal to the second threshold, the blade is judged to be in an abnormal state. When a value is greater than or equal to the third threshold, the blade is judged to be in the fractured state.
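The decision rule above maps directly onto a small function. The sketch below is illustrative only: the function name and the state labels are made up here, and it assumes the thresholds satisfy first ≤ second ≤ third. The text does not say how to treat a value between the second and third thresholds, so the sketch returns a separate label for that range instead of guessing.

```python
def classify_blade_state(distance, first, second, third):
    """Classify one Euclidean distance value against the three thresholds.

    Assumes first <= second <= third (not stated explicitly in the text).
    """
    if distance < 0:
        raise ValueError("a Euclidean distance cannot be negative")
    if distance <= first:
        return "normal"
    if distance <= second:
        return "abnormal"
    if distance >= third:
        return "fractured"
    # The range (second, third) is not covered by the rule in the text.
    return "unspecified"

def classify_curve(distances, first, second, third):
    """Apply the rule to every point of a distance curve."""
    return [classify_blade_state(d, first, second, third) for d in distances]
```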

Preferably, the parameters selected by the chi-square test are: wind turbine power, generator speed, rotor speed, blade angle, first grid-side current, second grid-side current, and third grid-side current. The first-screening parameters are: wind speed, wind turbine power, generator speed, rotor speed, blade angle, first grid-side current, second grid-side current, and third grid-side current.

The residual E_n is expressed as:

$E_n = x_n - x_n^{*}$

where x_n is the observed value of the n-th parameter in the first group of SCADA data, x_n^* is the predicted value corresponding to the n-th parameter, and E_n is the difference between the two; x_1 to x_7 are the parameters selected by the chi-square test. Computing the cumulative fault contribution rate e_i of each of these parameters from the residual E_n specifically includes:

computing from the residual E_n the fault count e_c of each parameter selected by the chi-square test, where the fault count e_c is expressed as:

[equation image missing from the source text]

letting a = (number of observation vector groups) / (number of observation groups input each time), and computing from the fault count e_c the cumulative fault contribution rate e_i of each parameter selected by the chi-square test, where e_i is expressed as:

[equation image missing from the source text]

Removing the parameters whose cumulative fault contribution rate e_i is below a set value and adding wind speed to the remaining parameters specifically includes: removing from x_1 to x_7 the parameters whose e_i is below the set value, and adding wind speed to the remaining parameters, which gives the second-screening set of parameters related to the blade operating state. The second-screening parameters are: wind speed, wind turbine power, generator speed, rotor speed, and blade angle.

Preferably, establishing the first NSET model of the wind turbine blade in the normal operating state from the first training set and the first validation set specifically includes: obtaining an original NSET model and inputting the first training set into it to obtain a trained NSET model.

The overall observation matrix of the wind turbine is written as a matrix M_{n×b} of size n×b, expressed as:

[equation image missing from the source text]

where n is the time state and b is the number of observed variables at each time. A row vector of M_{n×b}, X_i = [x_i(t_1) x_i(t_2) ... x_i(t_b)], holds all observed values of a given observation parameter X_i within an observation period. A column vector of M_{n×b}, X(t_j) = [x_1(t_j) x_2(t_j) ... x_b(t_j)]^T, holds the observed values of all observation parameters at time t_j. The data of a chosen period are selected from M_{n×b} and recorded as the historical observation matrix K, which represents the healthy state of each observation parameter. The historical observation matrix K is expressed as:

[equation image missing from the source text]

A portion of the state data is selected from the historical observation matrix K to form the process memory matrix D_n, which can be expressed as:

[equation image missing from the source text]

The observation matrix X_obs and its corresponding D_n are input into the expression

[equation image missing from the source text]

to obtain the prediction output matrix X_est. The first validation set is set as the observation matrix X_obs and input into the trained NSET model, which outputs the corresponding prediction output matrix X_est. The residual between X_obs and X_est is computed, and the Euclidean distance curve of the first validation set is constructed from X_obs and X_est, which establishes the first NSET model of the blade in the normal operating state.

Establishing the second NSET model of the wind turbine blade in the normal operating state from the second training set and the second validation set specifically includes: obtaining an original NSET model and inputting the second training set into it to obtain a trained NSET model; setting the second validation set as the observation matrix X_obs and inputting it into the trained NSET model, which outputs the corresponding prediction output matrix X_est; computing the residual between X_obs and X_est; and constructing the Euclidean distance curve of the second validation set from X_obs and X_est, which establishes the second NSET model of the blade in the normal operating state.
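Since the estimation expression itself is not reproduced in the text, the sketch below uses the formulation commonly given for NSET in the literature, X_est = D · (Dᵀ ⊗ D)⁻¹ · (Dᵀ ⊗ X_obs), with the Euclidean distance between state vectors as the nonlinear operator ⊗. Whether the patent uses exactly this operator is an assumption, as are the function names and the use of NumPy. Columns are state vectors, and the columns of D are assumed to be distinct so that the linear system is solvable.

```python
import numpy as np

def nonlinear_op(A, B):
    """Pairwise Euclidean distance between the columns of A (n x m) and B (n x k).

    Stands in for the nonlinear operator of NSET; returns an (m x k) matrix.
    """
    diff = A[:, :, None] - B[:, None, :]
    return np.sqrt((diff ** 2).sum(axis=0))

def nset_estimate(D, X_obs):
    """Estimate X_est for observations X_obs from the memory matrix D.

    D:     (n x m) memory matrix, one healthy state per column
    X_obs: (n x k) observation matrix, one observation per column
    """
    G = nonlinear_op(D, D)                                # similarity among memorized states
    weights = np.linalg.solve(G, nonlinear_op(D, X_obs))  # one weight vector per observation
    return D @ weights                                    # weighted combination of memorized states

def distance_curve(X_obs, X_est):
    """Euclidean norm of the residual at each time step (column)."""
    return np.linalg.norm(X_obs - X_est, axis=0)
```

A state that is already stored in D is reproduced exactly, so its distance is zero, while an observation far from every memorized state leaves a large residual. That gap is what the thresholds on the distance curve exploit.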

Preferably, selecting a portion of the state data from the historical observation matrix K to form the process memory matrix D_n specifically includes: setting each observation vector of K to consist of n variables; and, for each of the n variables, dividing the interval [0, 1] into h equal parts and, with a step of 1/h, finding a number of observation vectors X(1), X(2), ..., X(k) in K and adding them to D_n.

For each variable, this search specifically includes:

setting i = 1, where i is a positive integer;

computing A = (1/h) × i, where h is a positive integer;

setting k = 1, where k is a positive integer;

judging whether |X(k) − A| is less than δ, where δ is a positive number;

when |X(k) − A| is less than δ, adding X(k) to the process memory matrix D_n;

when |X(k) − A| is greater than or equal to δ, judging whether k is greater than M, where M is the number of columns of the historical observation matrix K;

when k is less than or equal to M, setting k = k + 1 and returning to the step of judging whether |X(k) − A| is less than δ;

when k is greater than M, judging whether i is greater than h;

when i is less than or equal to h, setting i = i + 1 and returning to the step of computing A = (1/h) × i;

when i is greater than h, ending.
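The selection loop can be written compactly as follows. This is a sketch under several assumptions that the text leaves open: the data are normalized to [0, 1], |X(k) − A| is read as comparing the current variable's value in X(k) with the grid point A, the search for a grid point stops at the first matching vector, and a vector that matches several grid points is stored only once. The function name and the list-of-lists layout are illustrative.

```python
def build_memory_matrix(K, h, delta):
    """Select observation vectors from K to form the process memory matrix.

    K:     list of observation vectors, each a list of n values in [0, 1]
    h:     number of equal parts the interval [0, 1] is divided into
    delta: tolerance (a small positive number)
    """
    n_vars = len(K[0])
    selected = []      # indices of chosen vectors, in order of selection
    seen = set()       # avoids storing the same vector twice (assumption)
    for var in range(n_vars):
        for i in range(1, h + 1):
            A = i / h                           # current grid point, step 1/h
            for k, x in enumerate(K):
                if abs(x[var] - A) < delta:
                    if k not in seen:
                        seen.add(k)
                        selected.append(k)
                    break                       # first match only (assumption)
    return [K[k] for k in selected]
```

Sampling each variable on an even grid keeps the memory matrix small while still covering the range of normal operating states, which is the purpose of selecting only part of K.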

Preferably, screening the first group of SCADA data with the chi-square test to select the parameters related to the blade operating state specifically includes:

establishing the null hypothesis H0 that the blade operating state is independent of each parameter in the first group of SCADA data;

taking the blade operating state data as the first variable, and taking one parameter of the first group of SCADA data at a time as the second variable;

recording, for the first variable, the actual number of data points with the blade normal as a, the actual number with the blade faulty as b, and the actual total as a + b;

recording, for the y-th parameter in the SCADA data, the actual number of data points with the blade normal as c_y, the actual number with the blade faulty as d_y, and the actual total as c_y + d_y, where y is a positive integer;

recording, for the two variables together, the actual total with the blade normal as a + c_y, the actual total with the blade faulty as b + d_y, and the actual overall total as a + b + c_y + d_y;

computing, for the first variable, the theoretical number of data points with the blade normal as (a + b) × (a + c_y) / (a + b + c_y + d_y), the theoretical number with the blade faulty as (a + b) × (b + d_y) / (a + b + c_y + d_y), and the theoretical total as a + b;

computing, for the second variable, the theoretical number of data points with the blade normal as (c_y + d_y) × (a + c_y) / (a + b + c_y + d_y), the theoretical number with the blade faulty as (c_y + d_y) × (b + d_y) / (a + b + c_y + d_y), and the theoretical total as c_y + d_y;

setting the number of degrees of freedom to 1;

computing the chi-square value, which measures how far the actual values deviate from the theoretical values, by the formula:

$\chi^2 = \sum \frac{(A - T)^2}{T}$

where χ² is the chi-square value, A is each actual value, and T is each theoretical value;

finding the corresponding P value from the degrees of freedom and a chi-square table, the P value being the probability of a Type I error (rejecting a true null hypothesis);

selecting each parameter of the first group of SCADA data whose P value is less than 0.05 as a parameter related to the blade operating state, and repeating this for each parameter in turn.
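The test above is a chi-square test of independence on a 2×2 table with rows (a, b) and (c_y, d_y). The sketch below follows the same expected-count formulas and replaces the table lookup with a closed form that holds for one degree of freedom, P = erfc(√(χ²/2)). The function names and the dictionary input are illustrative, and all row and column totals are assumed to be nonzero.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Chi-square statistic and P value for the 2x2 table [[a, b], [c, d]].

    Expected count of a cell = row total * column total / grand total,
    which matches the theoretical values listed in the text.
    """
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    total = a + b + c + d
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

def select_parameters(a, b, counts, alpha=0.05):
    """Keep the parameters whose P value is below alpha.

    counts: dict mapping a parameter name to its (c_y, d_y) pair.
    """
    return [name for name, (c, d) in counts.items()
            if chi_square_2x2(a, b, c, d)[1] < alpha]
```

With counts (90, 10) against (60, 40), every expected count is 75 or 25, the statistic is 24, and the P value is far below 0.05, so the parameter would be kept.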

Preferably, cleaning the wind speed-power scatter plot with the DBSCAN clustering algorithm specifically includes: taking all points in the wind speed-power scatter plot as sample data S and marking each data point in S as unprocessed; assigning initial values to Eps and Minpts, wherein Eps is the neighborhood distance threshold of a data point p in S, and Minpts is the minimum number of data points required in the neighborhood of radius Eps around p; denoting the neighborhood of radius Eps around p as $N_{Eps}(p)$; and clustering the high-density regions determined by Eps and Minpts, which specifically includes: determining whether the data point p has already been added to a cluster or listed as noise; if p has already been added to a cluster or listed as noise, the classification of p ends; if p has neither been added to a cluster nor listed as noise, determining whether $N_{Eps}(p)$ contains at least Minpts objects; if $N_{Eps}(p)$ contains at least Minpts objects, constructing a new cluster U and adding p to U; and if $N_{Eps}(p)$ contains fewer than Minpts objects, listing p as a boundary point or noise.

Preferably, Eps is set to 4.5 and Minpts is set to 18.5.
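A minimal, pure-Python sketch of DBSCAN-based cleaning is given below. It is a textbook version of the algorithm rather than the applicant's implementation: the names `region_query`, `dbscan` and `clean` are hypothetical, each point is assumed to be a `(wind_speed, power)` pair on whatever scale the Eps value was chosen for, and the neighborhood count is assumed to include the point itself.

```python
import math

NOISE = -1
UNPROCESSED = None

def region_query(points, idx, eps):
    """Indices of all points within distance eps of points[idx], itself included."""
    px, py = points[idx]
    return [j for j, (qx, qy) in enumerate(points)
            if math.hypot(px - qx, py - qy) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    labels = [UNPROCESSED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNPROCESSED:
            continue  # already added to a cluster or already listed as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a boundary point
            continue
        cluster += 1  # start a new cluster U and add p to it
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # boundary point reachable from a core point
            if labels[j] is not UNPROCESSED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
    return labels

def clean(points, eps, min_pts):
    """Drop every point that DBSCAN lists as noise."""
    labels = dbscan(points, eps, min_pts)
    return [p for p, label in zip(points, labels) if label != NOISE]
```

With the preferred values, the call would be `clean(points, eps=4.5, min_pts=18.5)`; because neighbor counts are integers, a Minpts of 18.5 behaves like a requirement of at least 19 points in the neighborhood.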

The technical solution of the second aspect of the present invention further provides a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the wind turbine blade fracture early warning method based on correlation analysis of SCADA data according to any of the above technical solutions.

The technical solution of the third aspect of the present invention further provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the wind turbine blade fracture early warning method based on correlation analysis of SCADA data according to any of the above technical solutions.

Beneficial effects of the present invention:

1. The wind turbine blade fracture early warning method based on correlation analysis of SCADA data provided by the present invention fills a gap in the existing technology and overcomes the technical difficulty that existing SCADA data could not be used to monitor blade fracture in real time. Specifically, the data collected by the sensor group in the SCADA system of a wind turbine contain many redundant variables, which lengthen modeling time, lower efficiency, and reduce model accuracy. To address the large volume and high dimensionality of SCADA data, the present invention uses chi-square verification to perform a preliminary screening of the SCADA data related to the blade operating state, thereby reducing the dimensionality of the SCADA data and improving the computational efficiency of the monitoring system.

2. To handle null values, singular points, power-limited data and noise in the SCADA data, the DBSCAN algorithm is used to cluster the wind speed-power scatter plot and remove interfering data such as shutdown data, power-limited data and noise.

3. Most previous monitoring systems depend on the accuracy of sample data and models; the screening of modeling parameters often has no stated basis, and the calculation process lacks reliability analysis and verification. The present invention first performs a preliminary quantitative screening of the modeling parameters using chi-square verification and establishes an NSET model of the normal service state as a semi-supervised model. A known wind turbine blade fracture data set is then input into the model, the residuals between observed and predicted values and the overall Euclidean distance curve are calculated to verify the modeling result, and the fault contribution rate of each parameter is analyzed to further screen the blade-related parameters of the SCADA system, making the screening and verification of the modeling parameters more reliable and accurate.

4. To address the difficulty of quantifying different degrees of blade damage, the present invention builds a semi-supervised learning model from SCADA data of wind turbine blades in the normal operating state and fuses the wind speed, power, generator speed, rotor speed and blade angle data. The overall fusion result for the blade in the normal state serves as the reference for judging blade fracture, and the Euclidean distance between the real-time monitoring data of the blade and the normal state serves as the quantitative indicator of blade fracture.

5. To address the dynamic setting of the blade fracture warning threshold, the present invention relies on the historical data of the wind turbine's existing SCADA system, without installing additional sensors. SCADA data covering a period before and after a wind turbine blade fracture are input into the model, the Euclidean distance curve before the fracture is compared and analyzed, and the maximum Euclidean distance output by the model before the fracture is taken as the wind turbine blade fracture warning threshold, so that the fracture threshold for the blade's service state can be updated in real time.

In summary, the present invention first associates the blade with the sensor signals in the SCADA system and performs a preliminary quantitative screening through chi-square verification, then uses fault data to further verify and screen the input parameters of the model, and finally uses the screened SCADA parameters to establish an NSET model of the wind turbine blade in the normal state as a semi-supervised model for monitoring the blade operating state. A known wind turbine blade fracture data set is used to verify the modeling result and set the warning thresholds, so that blade fracture warning thresholds can be derived from existing SCADA data and online, real-time, efficient monitoring is achieved. The wind turbine blade fracture early warning method based on correlation analysis of SCADA data provided by the present invention can provide online real-time fracture warning for various types of wind turbine blades, is broadly applicable, can effectively avoid huge economic losses and safety hazards, and has broad prospects for application.

Additional aspects and advantages of the present invention will become apparent from the following description, or may be learned by practice of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a schematic flow chart of a wind turbine blade fracture early warning method based on correlation analysis of SCADA data according to an embodiment of the present invention;

FIG. 2 shows a schematic flow chart of parameter screening and modeling of SCADA data according to an embodiment of the present invention;

FIG. 3 shows the Euclidean distance curve output by the training model built from the unscreened parameters according to an embodiment of the present invention;

FIG. 4 shows the Euclidean distance curve output by the training model built from the parameters screened using chi-square verification and the DBSCAN clustering algorithm according to an embodiment of the present invention;

FIG. 5 shows the Euclidean distance curve output by the final model according to an embodiment of the present invention;

FIG. 6 shows a schematic flow chart of the process memory matrix construction procedure according to an embodiment of the present invention;

FIG. 7 shows a schematic flow chart of real-time monitoring of wind turbine blade fracture according to an embodiment of the present invention;

FIG. 8 shows the Euclidean distance curve output when the validation set of the data corresponding to the finally screened parameters is input into the finally established NSET model of the blade under normal working conditions according to an embodiment of the present invention;

FIG. 9 shows the distribution of the wind turbine blade warning thresholds according to an embodiment of the present invention;

FIG. 10 shows the Euclidean distance curve output when the original January SCADA data recorded before and after the wind turbine blade fracture are input into the finally established NSET model of the blade under normal working conditions according to an embodiment of the present invention;

FIG. 11 shows the Euclidean distance curve output when the original February SCADA data recorded before and after the wind turbine blade fracture are input into the finally established NSET model of the blade under normal working conditions according to an embodiment of the present invention;

FIG. 12 shows the Euclidean distance curve output when the original SCADA data recorded before and after the wind turbine blade fracture from March 1 to March 10 are input into the finally established NSET model of the blade under normal working conditions according to an embodiment of the present invention;

FIG. 13 shows the Euclidean distance curve output when the original SCADA data recorded before and after the wind turbine blade fracture from March 10 to March 14 are input into the finally established NSET model of the blade under normal working conditions according to an embodiment of the present invention;

FIG. 14 shows a schematic block diagram of a computer device according to an embodiment of the present invention.

DETAILED DESCRIPTION

In order that the above objects, features and advantages of the present invention may be understood more clearly, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments. It should be noted that the embodiments of the present application and the features in the embodiments may be combined with each other provided they do not conflict.

Many specific details are set forth in the following description to facilitate a full understanding of the present invention. However, the present invention may also be implemented in ways other than those described herein, and the protection scope of the present invention is therefore not limited to the specific embodiments disclosed below.

FIG. 1 shows a schematic flow chart of a wind turbine blade fracture early warning method based on correlation analysis of SCADA data according to an embodiment of the present invention. As shown in FIG. 1, the historical SCADA data recorded before and after a wind turbine blade fracture are reduced in dimension to screen out, for the first time, the sensor parameters related to the operating state of the wind turbine blade. The raw data corresponding to the screened parameters in the SCADA data for a period of normal blade operation are cleaned; the cleaned data for the screened sensor parameters are extracted and divided into a training set and a validation set; and the training set is used for NSET modeling to establish an NSET model of the wind turbine blade in the normal operating state (the normal-state model). Historical data are then used for verification and further parameter screening, to obtain the final set of parameters related to the operating state of the wind turbine blade. Further, real-time observation data are input into the model, and the residuals output by the model give the overall Euclidean distance curve and the fitted curves of the predicted and observed values of each parameter. These are used to judge the current operating condition of the wind turbine blade and whether the blade is at risk of fracture; if the warning threshold is exceeded, the wind farm staff are notified.

Specifically, SCADA data are the data collected by the sensor group in the SCADA system of a wind turbine, and the parameters included in the SCADA data are: wind turbine power, generator speed, rotor speed, wind turbine blade angle, grid-side current and wind speed. The wind turbine blade fracture early warning method based on correlation analysis of SCADA data includes: acquiring a first group of SCADA data covering a period before and after a wind turbine blade fracture; screening the first group of SCADA data using chi-square verification, to screen out the parameters related to the operating state of the wind turbine blade; adding the wind speed parameter to the parameters screened out using chi-square verification, to obtain the first-screened parameters related to the operating state of the wind turbine blade; acquiring a second group of SCADA data covering a period of normal operation of the wind turbine blade; extracting, from the second group of SCADA data, the raw data corresponding to the first-screened parameters; fitting a wind speed-power scatter plot from the wind speed data and the wind turbine power data in the extracted second group of SCADA data; cleaning the wind speed-power scatter plot using the DBSCAN clustering algorithm; dividing the cleaned second group of SCADA data into a first training set and a first validation set; building, from the first training set and the first validation set, a first NSET model of the wind turbine blade in the normal operating state; inputting the original first group of SCADA data into the first NSET model, calculating the residual $E_n$ between the predicted values output by the first NSET model and the corresponding observed values, and fitting, from the predicted and observed values, a first Euclidean distance curve for the first group of SCADA data, on which the upward trend is observed to begin at a position delayed relative to the break point; calculating, from the residual $E_n$, the cumulative fault contribution rate $e_i$ of each parameter screened out using chi-square verification; removing the parameters screened out using chi-square verification whose cumulative fault contribution rate $e_i$ is less than a set value, and adding the wind speed parameter to the remaining parameters, to obtain the second-screened parameters related to the operating state of the wind turbine blade; extracting, from the cleaned second group of SCADA data, the data corresponding to the second-screened parameters, and dividing the extracted data into a second training set and a second validation set; establishing, from the second training set and the second validation set, a second NSET model of the wind turbine blade in the normal operating state; obtaining the Euclidean distance curve for the second validation set, and taking the maximum Euclidean distance on that curve as the first threshold; inputting the original first group of SCADA data into the second NSET model, calculating the residual $E_n'$ between the predicted values output by the second NSET model and the corresponding observed values, and fitting, from the predicted and observed values, a second Euclidean distance curve for the first group of SCADA data, on which the upward trend is observed to begin at the break point; taking the maximum Euclidean distance on the second Euclidean distance curve before the break point as the second threshold; taking the maximum Euclidean distance on the second Euclidean distance curve after the break point as the third threshold; acquiring in real time the SCADA data collected by the sensor group in the SCADA system of the wind turbine; inputting the SCADA data acquired in real time into the second NSET model, calculating the residual $E_n''$ between the predicted values output by the second NSET model and the corresponding observed values, and fitting, from the predicted and observed values, a third Euclidean distance curve for the SCADA data acquired in real time; and issuing a fracture warning for the wind turbine blade of the wind turbine according to the Euclidean distance values on the third Euclidean distance curve and the first threshold, the second threshold and the third threshold.
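The way the three thresholds are read off the Euclidean distance curves can be sketched as follows. This is an illustration only: the function name `warning_thresholds` and the argument `break_index` (the position of the first sample at or after the break point) are assumptions, since the text does not say how the break point is located in the data.

```python
def warning_thresholds(validation_curve, fracture_curve, break_index):
    """Derive the first, second and third thresholds from two distance curves.

    validation_curve: Euclidean distances for the second validation set (normal operation).
    fracture_curve:   Euclidean distances for the first group of SCADA data (contains a fracture).
    break_index:      hypothetical index of the first sample at or after the break point.
    """
    if not 0 < break_index < len(fracture_curve):
        raise ValueError("break_index must split the curve into two non-empty parts")
    first = max(validation_curve)               # largest distance seen in normal operation
    second = max(fracture_curve[:break_index])  # largest distance before the break point
    third = max(fracture_curve[break_index:])   # largest distance after the break point
    return first, second, third
```

Recomputing these values whenever a new fracture record becomes available is one way the thresholds could be kept up to date.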

In this embodiment, the data initially collected by the sensor group in the SCADA system of the wind turbine include the data corresponding to parameters such as wind turbine power, generator speed, rotor speed, wind turbine blade angle, grid-side current and wind speed.

In a specific embodiment, the "period" covered by the first group of SCADA data before and after the wind turbine blade fracture may be one minute with a time step of 0.02 s, giving 3000 data points in total. The second NSET model is the finally established NSET model of the blade under normal working conditions.

In one embodiment of the present invention, issuing a fracture warning for the wind turbine blade of the wind turbine according to the Euclidean distance values on the third Euclidean distance curve and the first threshold, the second threshold and the third threshold specifically includes: determining the range in which each Euclidean distance value on the third Euclidean distance curve falls; when a Euclidean distance value on the third Euclidean distance curve is greater than or equal to 0 and less than or equal to the first threshold, judging that the wind turbine blade is in the normal operating state; when a Euclidean distance value on the third Euclidean distance curve is greater than the first threshold and less than or equal to the second threshold, judging that the wind turbine blade is in an abnormal state; and when a Euclidean distance value on the third Euclidean distance curve is greater than or equal to the third threshold, judging that the wind turbine blade is in a broken state.
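The three ranges above map directly onto a small classification function. The sketch below is hypothetical in its naming (`blade_state`) and in one respect goes beyond the text: distances strictly between the second and third thresholds are not assigned a state in the description, so the function returns `"unspecified"` for them rather than guessing.

```python
def blade_state(distance, first, second, third):
    """Classify one Euclidean distance value against the three thresholds.

    Assumes first <= second <= third. Values between the second and third
    thresholds are not covered by the description and are reported as such.
    """
    if distance < 0:
        raise ValueError("a Euclidean distance cannot be negative")
    if distance <= first:
        return "normal"
    if distance <= second:
        return "abnormal"
    if distance >= third:
        return "broken"
    return "unspecified"  # assumption: the source defines no state for this range

def classify_curve(curve, first, second, third):
    """Apply blade_state to every value on a Euclidean distance curve."""
    return [blade_state(d, first, second, third) for d in curve]
```

A deployment would have to decide how to treat the unassigned range, for example by handling it as abnormal.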

In one embodiment of the present invention, the parameters related to the operating state of the wind turbine blade screened out using chi-square verification are: wind turbine power, generator speed, rotor speed, wind turbine blade angle, first grid-side current, second grid-side current and third grid-side current; and the first-screened parameters related to the operating state of the wind turbine blade are: wind speed, wind turbine power, generator speed, rotor speed, wind turbine blade angle, first grid-side current, second grid-side current and third grid-side current;

The expression of the residual $E_n$ is:

$E_n = x_n - x_n^*$

Wherein $x_n$ is the observed value of the nth parameter in the first group of SCADA data; $x_n^*$ is the predicted value corresponding to the nth parameter in the first group of SCADA data; $E_n$ is the difference between the observed value of the nth parameter and the corresponding predicted value; $x_1$ to $x_7$ are the parameters related to the operating state of the wind turbine blade screened out using chi-square verification; and calculating, from the residual $E_n$, the cumulative fault contribution rate $e_i$ of each parameter screened out using chi-square verification specifically includes:

calculating, from the residual $E_n$, the number of faults $e_c$ of each parameter related to the operating state of the wind turbine blade screened out using chi-square verification;

The expression of the number of faults $e_c$ is:

letting a = (number of observation vector groups) / (number of observation groups input each time), and calculating, from the number of faults $e_c$, the cumulative fault contribution rate $e_i$ of each parameter related to the operating state of the wind turbine blade screened out using chi-square verification;

The expression of the cumulative fault contribution rate $e_i$ is:

and removing the parameters screened out using chi-square verification whose cumulative fault contribution rate $e_i$ is less than a set value, and adding the wind speed parameter to the remaining parameters, to obtain the second-screened parameters related to the operating state of the wind turbine blade, specifically includes: removing from $x_1$ to $x_7$ the parameters whose cumulative fault contribution rate $e_i$ is less than the set value, and adding the wind speed parameter to the remaining parameters; wherein the second-screened parameters related to the operating state of the wind turbine blade are: wind speed, wind turbine power, generator speed, rotor speed and wind turbine blade angle.

In one embodiment of the present invention, establishing the first NSET model of the wind turbine blade in the normal operating state from the first training set and the first validation set specifically includes: obtaining an original NSET model and inputting the first training set into the original NSET model to obtain a trained NSET model;

The overall observation matrix of the wind turbine is expressed as a matrix $M_{n \times b}$ of size $n \times b$, and the expression of $M_{n \times b}$ is:

Wherein $n$ corresponds to the time states and $b$ to the number of observed variables at each time; the row vector of the matrix $M_{n \times b}$ is $X_i = [x_i(t_1)\ x_i(t_2)\ \ldots\ x_i(t_b)]$, which contains all observed values of a given observation parameter $X_i$ within an observation period; the column vector of the matrix $M_{n \times b}$ is $X(t_j) = [x_1(t_j)\ x_2(t_j)\ \ldots\ x_b(t_j)]^T$, which contains the observed values of all observation parameters at time $t_j$; the parameters for a period of time selected from $M_{n \times b}$ are recorded as the historical observation matrix K, which represents the healthy state of each observation parameter, and the expression of the historical observation matrix K is:

A portion of the state data is selected from the historical observation matrix K to form the process memory matrix $D_n$. The process memory matrix $D_n$ can be expressed as:

The observation matrix $X_{obs}$ and the corresponding $D_n$ are input into the expression to obtain the prediction output matrix $X_{est}$; the first validation set is set as the observation matrix $X_{obs}$; the first validation set is input into the trained NSET model; the NSET model then outputs the corresponding prediction output matrix $X_{est}$; the residual between the observation matrix $X_{obs}$ of the first validation set and the corresponding prediction output matrix $X_{est}$ is calculated, and the Euclidean distance curve for the first validation set is fitted from $X_{obs}$ and $X_{est}$, to establish the first NSET model of the wind turbine blade in the normal operating state. Establishing the second NSET model of the wind turbine blade in the normal operating state from the second training set and the second validation set specifically includes: obtaining an original NSET model and inputting the second training set into the original NSET model to obtain a trained NSET model; setting the second validation set as the observation matrix $X_{obs}$; inputting the second validation set into the trained NSET model; the NSET model then outputs the corresponding prediction output matrix $X_{est}$; calculating the residual between the observation matrix $X_{obs}$ of the second validation set and the corresponding prediction output matrix $X_{est}$, and fitting the Euclidean distance curve for the second validation set from $X_{obs}$ and $X_{est}$, to establish the second NSET model of the wind turbine blade in the normal operating state.

In one embodiment of the present invention, screening the first group of SCADA data with chi-square validation (a chi-square test of independence), so as to screen out the parameters related to the operating state of the wind turbine blade, specifically includes: establishing a null hypothesis H0, namely that the operating state of the wind turbine blade is independent of each parameter in the first group of SCADA data; taking the data of the operating state of the wind turbine blade as the first variable threshold; each time taking one parameter in the first group of SCADA data as the second variable threshold; for the first variable threshold, recording the actual number of data points with the blade normal as a, the actual number with the blade faulty as b, and the actual total as a+b; for the y-th parameter in the SCADA data, recording the actual number of data points with the blade normal as cy, the actual number with the blade faulty as dy, and the actual total as cy+dy, where y is a positive integer; for the first and second variable thresholds together, recording the actual total with the blade normal as a+cy, the actual total with the blade faulty as b+dy, and the actual overall total as a+b+cy+dy; calculating, for the first variable threshold, the theoretical number of data points with the blade normal as (a+b)×(a+cy)/(a+b+cy+dy), the theoretical number with the blade faulty as (a+b)×(b+dy)/(a+b+cy+dy), and the theoretical total as a+b; calculating, for the second variable threshold, the theoretical number of data points with the blade normal as (cy+dy)×(a+cy)/(a+b+cy+dy), the theoretical number with the blade faulty as (cy+dy)×(b+dy)/(a+b+cy+dy), and the theoretical total as cy+dy; setting the degrees of freedom to 1; and calculating the chi-square value, which measures how far the actual values deviate from the theoretical values, according to the formula: χ² = ∑(A − T)²/T

where χ² is the chi-square value, A is each actual value, and T is each theoretical value. The P value corresponding to the degrees of freedom and the chi-square value is looked up in a chi-square table; the P value is the probability of a Type I error, that is, of rejecting a null hypothesis that is actually true. Each parameter in the first group of SCADA data whose P value is less than 0.05 is selected as a parameter related to the operating state of the wind turbine blade, so that the parameters related to the operating state of the wind turbine blade are screened out one by one.
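The screening step for a single parameter can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, the counts in the usage lines are made up, and the table lookup is replaced by the closed-form P value that holds for 1 degree of freedom, P = erfc(√(χ²/2)).

```python
import math

def chi_square_2x2(a, b, c, d):
    """Chi-square statistic and P value for the 2x2 table [[a, b], [c, d]].

    a, b: counts for the blade-state variable (normal, faulty).
    c, d: counts for one SCADA parameter (normal, faulty).
    Assumes every row and column total is non-zero.
    """
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    total = a + b + c + d
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Theoretical value T = row total * column total / grand total.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom the upper-tail probability has a closed form.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

def is_related(a, b, c, d, alpha=0.05):
    """True when H0 (independence) is rejected at significance level alpha."""
    _, p_value = chi_square_2x2(a, b, c, d)
    return p_value < alpha

# Illustrative counts only; they are not taken from the patent's tables.
chi2, p = chi_square_2x2(10, 20, 30, 40)
keep_parameter = is_related(90, 10, 10, 90)
```

Running `is_related` once per SCADA parameter and keeping the parameters for which it returns `True` mirrors the "P value less than 0.05" rule described above.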

在本发明的一个实施例中,所述采用DBSCAN聚类算法对所述风速-功率散点图进行数据清洗,具体包括:将所述风速-功率散点图中的所有点作为样本数据S,并将所述样本数据S中的每个数据点标记为未处理状态;对Eps和Minpts赋初始值,其中,Eps为所述样本数据S中的某一数据点p的邻域距离阈值,Minpts为所述样本数据S中的某一数据点p的半径为Eps的邻域中数据点个数的最小个数;将所述某一数据点p的半径为Eps的邻域设置为NEps(p);对Eps和Minpts形成的高密度区域进行聚类;所述对Eps和Minpts形成的高密度区域进行聚类,具体包括:判断所述某个数据点p是否已经加入某个簇或者已经被列为噪声;如果所述某个数据点p已经加入某个簇或者已经被列为噪声,则分类结束;如果所述某个数据点p没有加入某个簇且没有被列为噪声,则判断NEps(p)内是否至少有Minpts个对象;如果NEps(p)内至少有Minpts个对象,则构建新的类簇U,并在U中添加所述某个数据点p;如果NEps(p)内的对象个数少于Minpts,则将所述某个数据点p列为边界点或噪声。In one embodiment of the present invention, the DBSCAN clustering algorithm is used to clean the wind speed-power scatter plot, specifically including: taking all points in the wind speed-power scatter plot as sample data S, and marking each data point in the sample data S as unprocessed; assigning initial values to Eps and Minpts, wherein Eps is the neighborhood distance threshold of a certain data point p in the sample data S, and Minpts is the minimum number of data points in the neighborhood with a radius of Eps of a certain data point p in the sample data S; setting the neighborhood with a radius of Eps of the certain data point p to N Eps (p); clustering the high-density area formed by Eps and Minpts; the clustering of the high-density area formed by Eps and Minpts specifically includes: judging whether the certain data point p has been added to a cluster or has been listed as noise; if the certain data point p has been added to a cluster or has been listed as noise, the classification is terminated; if the certain data point p has not been added to a cluster and has not been listed as noise, judging N Eps (p) contains at least Minpts objects; if there are at least Minpts objects in N Eps (p), a new cluster U is constructed and the data point p is added to U; if the number of objects in N Eps (p) is less than Minpts, the data point p is listed as a boundary point or noise.

In this embodiment, a circle of radius Eps is drawn around point p and the number of points inside it is compared with Minpts: if the count is greater than Minpts the point is retained, otherwise it is treated as noise. NEps(p) is evaluated while traversing all N points. The parameters of the DBSCAN clustering algorithm are relatively difficult to set and must be tuned jointly and repeatedly. In a specific embodiment, after the data are normalized, Eps is set to 4.5 and Minpts to 18.5 to remove noise.
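The cleaning step can be sketched with a small pure-Python DBSCAN over (wind speed, power) pairs. This is a textbook version of the algorithm written under stated assumptions, not the patent's code: the function names are hypothetical, and the `eps` and `min_pts` values in the usage lines are illustrative rather than the 4.5 and 18.5 quoted above.

```python
import math

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # label for points not processed yet

def region_query(points, idx, eps):
    """Indices of all points within distance eps of points[idx] (itself included)."""
    px, py = points[idx]
    return [j for j, (qx, qy) in enumerate(points)
            if math.hypot(px - qx, py - qy) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [UNVISITED] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue  # already in a cluster or already marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster_id  # point i is a core point: start a new cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
        cluster_id += 1
    return labels

def remove_noise(points, eps, min_pts):
    """Keep only the points that DBSCAN assigns to some cluster."""
    labels = dbscan(points, eps, min_pts)
    return [p for p, label in zip(points, labels) if label != NOISE]

# Ten points along a line plus one isolated outlier (illustrative data).
samples = [(i * 0.1, i * 0.1) for i in range(10)] + [(5.0, 5.0)]
cleaned = remove_noise(samples, eps=0.3, min_pts=3)
```

In this toy data set the isolated point is dropped and the ten points along the line are kept, which is the behavior the cleaning step relies on to discard outliers and power-limited samples.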

图2示出了本发明的一个实施例的SCADA数据的参数筛选与建模的示意流程图。如图2所示,首先利用卡方验证对某一段时间的风电叶片断裂前后的SCADA数据进行初步筛选,记录风电叶片故障时叶片总体故障数据个数及SCADA中各参数数据异常个数,记录到下表中:FIG2 shows a schematic flow chart of parameter screening and modeling of SCADA data in one embodiment of the present invention. As shown in FIG2, firstly, the SCADA data before and after the wind turbine blade fracture in a certain period of time is preliminarily screened using chi-square verification, and the number of overall blade failure data and the number of abnormal parameter data in SCADA are recorded when the wind turbine blade fails, and recorded in the following table:

根据实际值计算理论值,记录到表中:Calculate the theoretical value based on the actual value and record it in the table:

假设H0:风电叶片的运行状态与其他参数之间是独立的;计算检验统计量来衡量实际值与理论值的差异程度,卡方值计算公式求得卡方值;根据自由度和卡方值查表,找到对应的P值,决定对原假设H0拒绝或接受,初步筛选出与叶片运行状态有关的SCADA参数;选择P值小于0.05的参数作为与风电叶片运行有关的参数,即拒绝原假设,风电叶片的运行状态与其他参数之间是有关联的。Assumption H0: The operating status of wind turbine blades is independent of other parameters; calculate the test statistic to measure the difference between the actual value and the theoretical value, and use the chi-square value calculation formula to obtain the chi-square value; look up the table based on the degrees of freedom and chi-square value to find the corresponding P value, and decide whether to reject or accept the null hypothesis H0, and preliminarily screen out the SCADA parameters related to the operating status of the blades; select parameters with a P value less than 0.05 as parameters related to the operation of wind turbine blades, that is, reject the null hypothesis, and there is a correlation between the operating status of wind turbine blades and other parameters.

具体实施例中,统计叶片断裂前后各SCADA参数正常及异常的个数,并通过以上算法筛选得到的与叶片运行状态有关的参数如下表所示,将原来的几十个参数筛选后,即使用卡方验证筛选出的与风电叶片运行状态有关的各个参数分别为:得到功率、发电机转速、转子转速、叶片角度、网侧电流1、网侧电流2、网侧电流3,共7个参数,实现了对数据的降维处理。In a specific embodiment, the number of normal and abnormal SCADA parameters before and after the blade breaks is counted, and the parameters related to the blade operating status obtained by filtering through the above algorithm are shown in the following table. After filtering the original dozens of parameters, the parameters related to the wind turbine blade operating status screened out using chi-square verification are: power, generator speed, rotor speed, blade angle, grid-side current 1, grid-side current 2, grid-side current 3, a total of 7 parameters, realizing dimensionality reduction processing of the data.

进一步地,如图2所示,因为卡方验证没有得到风速与叶片状态之间的关系,但为了对SCADA数据进行工况分类,加入风速这一参数。采用DBSCAN聚类算法对风速-功率散点图进行数据清洗,去除空值、奇点、限功率数据等。如图3所示,数据筛选前建立的模型在叶片断裂后的欧氏距离为0.42,如图4所示数据筛选后建立的模型在叶片断裂后的欧氏距离为8.24,较数据处理之前效果更佳显著,结果表明,卡方验证和DBSCAN数据清洗对SCADA数据的处理提高了模型的准确率,解决了SCADA数据冗余及噪声问题导致的模型建模时间长及准确率低问题。Furthermore, as shown in Figure 2, because the chi-square validation did not obtain the relationship between wind speed and blade status, in order to classify the working conditions of SCADA data, the wind speed parameter was added. The DBSCAN clustering algorithm was used to clean the wind speed-power scatter plot to remove null values, singular points, power-limited data, etc. As shown in Figure 3, the Euclidean distance of the model established before data screening after the blade broke was 0.42, and the Euclidean distance of the model established after data screening after the blade broke was 8.24 as shown in Figure 4, which was significantly better than before data processing. The results show that the chi-square validation and DBSCAN data cleaning process for SCADA data improves the accuracy of the model and solves the problems of long modeling time and low accuracy caused by SCADA data redundancy and noise.

进一步地,如图2所示,将初步筛选出的风电叶片正常运行状态下的SCADA数据按照7:3分为训练集和验证集搭建NSET模型;求得各个方案过程记忆矩阵Dn,n为正整数;NSET建模是否成功的核心为过程记忆矩阵的构建是否成功。Furthermore, as shown in Figure 2, the SCADA data of wind turbine blades under normal operating conditions that were initially screened out were divided into training set and verification set according to a ratio of 7:3 to build the NSET model; the process memory matrix Dn of each scheme was obtained, where n is a positive integer; the key to whether the NSET modeling is successful is whether the process memory matrix is successfully constructed.

将风电机组的整体观测矩阵表示为一个n×b大小的Mn×b,所述Mn×b的表达式为:The overall observation matrix of the wind turbine is represented as an n×b-sized M n×b , and the expression of M n×b is:

其中,n为时间状态,b为每个时间的观测变量数;矩阵Mn×b的行向量为Xi=[xi(t1)xi(t2)...xi(tb)],矩阵Mn×b的行向量为某一给定观测参数Xi在某个观测时间段内的所有观测值;矩阵Mn×b的列向量为X(tj)=[x1(tj) x2(tj)...xb(tj)]T,矩阵Mn×b的列向量为tj时刻所有观测参数的观测值;从所述Mn×b中选取一段时间的参数记为历史观测矩阵K,历史观测矩阵K为各个观测参数的健康状态,则历史观测矩阵K的表达式为:Wherein, n is the time state, b is the number of observed variables at each time; the row vector of the matrix Mn ×b is Xi = [ xi ( t1 ) xi ( t2 ) ... xi ( tb )], and the row vector of the matrix Mn ×b is all the observed values of a given observation parameter Xi within a certain observation time period; the column vector of the matrix Mn ×b is X( tj ) = [ x1 ( tj ) x2 ( tj ) ... xb ( tj )] T , and the column vector of the matrix Mn ×b is the observed values of all observed parameters at time tj ; the parameters of a period of time selected from the Mn ×b are recorded as the historical observation matrix K, and the historical observation matrix K is the health status of each observation parameter, then the expression of the historical observation matrix K is:

从历史观测矩阵K中选出一部分状态数据,用选出的所述一部分状态数据构成过程记忆矩阵Dn。过程矩阵Dn可表示为:A portion of state data is selected from the historical observation matrix K, and the selected portion of state data is used to form a process memory matrix D n . The process matrix D n can be expressed as:

The observation matrix Xobs and the Dn corresponding to it are input into the NSET estimation expression to obtain the prediction output matrix Xest;
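The expression itself is not reproduced in this text. For orientation only, the estimate usually given in the NSET literature has the form below. Whether the patent uses exactly this expression, and which nonlinear operator ⊗ it chooses (the Euclidean distance between vectors is a common choice), are assumptions.

```latex
X_{est} = D_n \left( D_n^{T} \otimes D_n \right)^{-1} \left( D_n^{T} \otimes X_{obs} \right)
```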

将初步筛选出的风电叶片正常运行状态下的SCADA数据中7份训练集设置为观测矩阵Xobs;将3份验证集输入到训练好的NSET模型;输入验证集后的NSET模型输出对应的预测输出矩阵Xest;根据观测矩阵Xobs和预测输出矩阵Xest拟合对应的欧氏距离曲线。The seven training sets of the SCADA data of wind turbine blades in normal operation that were initially screened were set as the observation matrix Xobs ; the three validation sets were input into the trained NSET model; the NSET model after inputting the validation set output the corresponding prediction output matrix Xest ; the corresponding Euclidean distance curve was fitted according to the observation matrix Xobs and the prediction output matrix Xest .

Further, the validity of the model is verified with the Euclidean distance curve. The Euclidean distance reflects the absolute difference between individuals: for two points with coordinates (xi, yi) and (xj, yj), the Euclidean distance is d = √((xi − xj)² + (yi − yj)²). Fracture data (one minute of SCADA data spanning the blade fracture, sampled every 0.02 s, 3000 data points in total) are input to verify the validity of the model. As shown in Figure 4, the 3000 samples before and after the blade fracture are input into the normal-state blade model built with NSET. The blade fractures at sample 1500, and after sample 1500 the Euclidean distance curve rises sharply, so the model can diagnose and monitor blade fracture.
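The distance curve can be sketched as follows: one Euclidean distance per sample between the observed vector and the model's estimate, then a search for the first sample that exceeds a threshold. The function names and sample values are hypothetical, and the threshold is passed in rather than fixed.

```python
import math

def euclidean_distance(observed, estimated):
    """Euclidean distance between one observed vector and its estimate."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)))

def distance_curve(x_obs, x_est):
    """One distance per sample; x_obs and x_est are equal-length lists of vectors."""
    return [euclidean_distance(o, e) for o, e in zip(x_obs, x_est)]

def first_exceedance(curve, threshold):
    """Index of the first sample whose distance exceeds threshold, or None."""
    for index, distance in enumerate(curve):
        if distance > threshold:
            return index
    return None

# Illustrative values: the third sample departs from the model's estimate.
observed = [[1.0, 1.0], [1.0, 1.0], [4.0, 5.0]]
estimated = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
curve = distance_curve(observed, estimated)   # [0.0, 0.0, 5.0]
onset = first_exceedance(curve, threshold=0.13)  # 2
```

Comparing `onset` with the known fracture sample is one simple way to quantify the detection delay discussed in this embodiment.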

Further, owing to the limitations of chi-square validation, the change in the Euclidean distance curve that should have occurred at the fracture point, sample 1500, is delayed to sample 1664, so further screening is required. The residual between the observation matrix and the prediction output matrix, En = xn − xn*, is then analyzed to identify the abnormal parameters, where xn is the observed value of the n-th parameter in the SCADA data before and after the blade fracture; xn* is the predicted value corresponding to that n-th parameter; En is the difference between the observed value and the predicted value of the n-th parameter; and x1 to x7 are the parameters related to the operating state of the wind turbine blade that were screened out by chi-square validation;

The number of failures ec of each parameter related to the operating state of the wind turbine blade that was screened out by chi-square validation is calculated from the residual En;

故障次数ec的表达式为:The expression of the number of failures e c is:

Let a = (number of observation vector groups) / (number of observation groups input each time), and, from the number of failures ec, calculate the cumulative fault contribution rate ei of each parameter related to the operating state of the wind turbine blade that was screened out by chi-square validation;

故障累计贡献率ei的表达式为:The expression of the cumulative contribution rate of failure ei is:

去除x1至x7中的所述故障累计贡献率ei小于某一设定值的参数,并将风速这一参数加入到去除之后的与风电叶片运行状态有关的各个参数中,即去除故障贡献率低的参数,留下对叶片断裂敏感性更高的参数,作为最终建模参数,以最终筛选出与风电叶片运行状态有关的各个参数;其中,最终筛选出与风电叶片运行状态有关的各个参数分别为:风速、风电机组的功率、发电机转速、转子转速和风电叶片角度。The parameters whose cumulative fault contribution rate e i is less than a certain set value among x 1 to x 7 are removed, and the wind speed parameter is added to the parameters related to the operating state of the wind turbine blades after the removal, that is, the parameters with low fault contribution rate are removed, and the parameters with higher sensitivity to blade breakage are left as the final modeling parameters, so as to finally screen out the parameters related to the operating state of the wind turbine blades; wherein the parameters related to the operating state of the wind turbine blades finally screened out are: wind speed, wind turbine power, generator speed, rotor speed and wind turbine blade angle.

The blade fracture data (one minute of SCADA data spanning the blade fracture, sampled every 0.02 s, 3000 data points in total) are input into the model; the resulting overall Euclidean distance is shown in Figure 5. Comparing Figure 4 with Figure 5 shows that the Euclidean distance curve in Figure 5 fluctuates considerably before the fracture and begins to change at sample 1500, which matches the actual operating condition more closely and is more accurate. Reliability analysis thus provides a principled basis for screening the parameters and solves the problem of modeling-parameter reliability.

图6示出了本发明的一个实施例的过程记忆矩阵构造程序的示意流程图。从历史观测矩阵K中选出一部分状态数据,用选出的所述一部分状态数据构成过程记忆矩阵Dn。过程记忆矩阵D的构造需要使其内部的k个观测向量X(1) X(2)...X(k)能够尽量覆盖设备正常工作空间。FIG6 shows a schematic flow chart of a process memory matrix construction procedure of an embodiment of the present invention. A portion of state data is selected from the historical observation matrix K, and the selected portion of state data is used to form a process memory matrix Dn . The process memory matrix D is constructed so that its internal k observation vectors X(1) X(2)...X(k) can cover the normal working space of the device as much as possible.

所述从历史观测矩阵K中选出一部分状态数据,用选出的所述一部分状态数据构成过程记忆矩阵Dn,具体包括:将历史观测矩阵K的每一个观测向量设置为由n个变量组成;对所述n个变量中的每一个变量,将[0,1]之间等分为h份,以1/h为步距从所述历史观测矩阵K中查找出若干个观测向量X(1) X(2)...X(k)加入所述过程记忆矩阵Dn中;The method of selecting a part of state data from the historical observation matrix K and forming a process memory matrix Dn with the selected part of state data specifically includes: setting each observation vector of the historical observation matrix K to be composed of n variables; for each of the n variables, dividing the interval [0, 1] into h equal parts, and finding a number of observation vectors X(1) X(2) ... X(k) from the historical observation matrix K with a step size of 1/h and adding them to the process memory matrix Dn ;

As shown in Figure 6, for each of the n variables, dividing [0, 1] into h equal parts and, with a step of 1/h, finding observation vectors X(1) X(2)...X(k) in the historical observation matrix K to add to the process memory matrix Dn specifically includes: setting i=1, where i is a positive integer; computing A=1/h*i, where h is a positive integer; setting k=1, where k is a positive integer; determining whether |X(k)-A| is less than δ, where δ is a positive number; when |X(k)-A| is less than δ, adding X(k) to the process memory matrix Dn; when |X(k)-A| is greater than or equal to δ, determining whether k is greater than M, where M is the number of columns of the historical observation matrix K; when k is less than or equal to M, setting k=k+1 and returning to the step of determining whether |X(k)-A| is less than δ; when k is greater than M, determining whether i is greater than h; when i is less than or equal to h, setting i=i+1 and returning to the step of computing A=1/h*i; when i is greater than h, ending execution.

具体实施例中,δ为0.001。In a specific embodiment, δ is 0.001.

设备正常工作空间的每一个观测向量由n个变量组成,且其观测值已被归一化。对每一个变量,将[0,1]之间等分为h份,以1/h为步距从集合K中查找出若干个观测向量加入矩阵D中,图中δ为一小的正数。对剩余的n-1个变量,均采用与图示相同的流程以1/h为步距从集合K中选择观测向量添加到D中,采用此方法构造过程记忆矩阵,能够将组成观测向量的n个变量的不同测量值对应的历史记录选入矩阵D中,从而使其能较好地覆盖设备正常工作空间。Each observation vector of the normal working space of the equipment is composed of n variables, and its observation values have been normalized. For each variable, [0,1] is divided into h equal parts, and several observation vectors are found from the set K with a step size of 1/h and added to the matrix D. In the figure, δ is a small positive number. For the remaining n-1 variables, the same process as the diagram is used to select observation vectors from the set K with a step size of 1/h and add them to D. This method is used to construct a process memory matrix, which can select the historical records corresponding to the different measurement values of the n variables that make up the observation vector into the matrix D, so that it can better cover the normal working space of the equipment.
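The selection procedure can be sketched as nested loops over variables, grid values and historical vectors. This is one reading of the flow chart, under stated assumptions: every historical vector whose value lies within δ of a grid point is added, and a vector already selected is not added a second time. The function name and the sample data are hypothetical.

```python
def build_memory_matrix(history, h, delta=0.001):
    """Select observation vectors from history to form the process memory matrix.

    history: list of observation vectors, each already normalized to [0, 1].
    h:       number of equal parts into which [0, 1] is divided.
    delta:   small positive tolerance around each grid value.
    """
    n_vars = len(history[0])
    selected = []  # indices of chosen vectors, in the order they were found
    seen = set()   # assumption: skip vectors that were already selected
    for var in range(n_vars):              # repeat the flow chart per variable
        for i in range(1, h + 1):          # grid values 1/h, 2/h, ..., 1
            target = i / h
            for k, vector in enumerate(history):
                if abs(vector[var] - target) < delta and k not in seen:
                    seen.add(k)
                    selected.append(k)
    return [history[k] for k in selected]

# Illustrative history of four normalized two-variable observation vectors.
history = [[0.5, 0.2], [1.0, 0.9], [0.3, 0.5], [0.7, 0.7]]
memory = build_memory_matrix(history, h=2)
```

With h=2 the grid values are 0.5 and 1.0, so the first three vectors are selected and [0.7, 0.7] is not. A larger h gives a denser grid and a larger memory matrix, which trades computation time for coverage of the normal working space.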

Figure 7 shows a schematic flow chart of real-time monitoring of wind turbine blade fracture in one embodiment of the present invention. As shown in Figure 7, the cleaned data for the finally screened parameters are split 7:3 into a training set and a validation set for the second NSET modeling; the resulting model is taken as the best model and used for data fusion of the screened parameters. The validation set is input to obtain and record the maximum normal Euclidean distance threshold, and the fracture data are input to obtain and record the blade-fracture Euclidean distance. Real-time SCADA parameters are then input, the Euclidean distance curve of the real-time data against the historical model is checked against the blade fracture warning thresholds, and a graded blade fracture warning is issued.

在本实施例中,如图8所示,该模型最大正常欧氏距离阈值为0.13。如图5所示,在叶片断裂处欧氏距离曲线大幅上升,叶片断裂后的欧氏距离值为8.24,做到了对叶片损伤进行量化,解决了对叶片不同程度损伤的量化问题。In this embodiment, as shown in Figure 8, the maximum normal Euclidean distance threshold of the model is 0.13. As shown in Figure 5, the Euclidean distance curve rises sharply at the blade break, and the Euclidean distance value after the blade break is 8.24, which quantifies the blade damage and solves the problem of quantifying different degrees of blade damage.

Figure 5 shows the division of blade warning thresholds in one embodiment of the present invention. As shown in Figure 5, several months of data before and after the blade fracture are input into the finally established NSET model of the blade under normal operating conditions, and the fracture thresholds are divided according to the change in Euclidean distance. Because units of different models and blades of different lengths do not all yield the same warning thresholds, the Euclidean distance curve before fracture of the monitored blade must be examined to obtain the best warning threshold. Specifically, Euclidean distances from 0 to the validation-set maximum are taken as the range of normal blade operation; values between the validation-set maximum and the maximum Euclidean distance before fracture are taken as the range in which the blade condition should be checked; the maximum Euclidean distance before fracture is taken as the basis for emergency inspection; and the Euclidean distance after fracture is taken as the signal that the blade has fractured. Warning thresholds may differ between models. In the specific embodiment, data from January to March 12 are input into the best model; because the blade fractured on March 12, the analysis focuses on the period before the fracture. As shown in Figures 5, 8, 9 and 10 to 13, the maximum Euclidean distance during normal blade operation is 0.13, the maximum Euclidean distance before fracture is 2.21, and the Euclidean distance after fracture is 8.24. The Euclidean distance 2.21 can therefore be used as the blade fracture warning threshold: when the distance is between 0.13 and 2.21 the blade should be inspected at a suitable time, and when it reaches 2.21 the unit should be shut down promptly for inspection, which solves the problem of setting the blade fracture warning threshold.
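The graded warning can be sketched as a simple comparison against the three thresholds. The default values are the ones reported for this embodiment (0.13, 2.21, 8.24). The function name, the level names and the handling of values exactly on a boundary are assumptions, and other turbine models would need their own thresholds.

```python
def warning_level(distance, normal_max=0.13, pre_fracture_max=2.21, fractured=8.24):
    """Map one Euclidean distance value to a warning level.

    normal_max:       maximum distance on the validation set (first threshold).
    pre_fracture_max: maximum distance before the fracture (second threshold).
    fractured:        distance after the fracture (third threshold).
    """
    if distance < 0:
        raise ValueError("a Euclidean distance cannot be negative")
    if distance <= normal_max:
        return "normal"            # blade operating normally
    if distance <= pre_fracture_max:
        return "inspect"           # abnormal: inspect the blade at a suitable time
    if distance < fractured:
        return "urgent"            # beyond the pre-fracture maximum: stop and inspect
    return "fractured"             # consistent with a blade that has already broken

# Classify a short, made-up stretch of a real-time distance curve.
levels = [warning_level(d) for d in (0.05, 1.0, 3.0, 9.0)]
```

Applying `warning_level` to each point of the real-time curve yields a per-sample warning grade; in practice one would likely require several consecutive exceedances before raising an alarm, which is a design choice the text does not specify.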

如图14所示,一种计算机装置1400包括:存储器1402、处理器1404及存储在存储器1402上并可在处理器1404上运行的计算机程序,处理器1404执行计算机程序时实现如上述任一实施例中的方法的步骤。As shown in FIG. 14 , a computer device 1400 includes: a memory 1402 , a processor 1404 , and a computer program stored in the memory 1402 and executable on the processor 1404 . When the processor 1404 executes the computer program, the steps of the method in any of the above embodiments are implemented.

本发明提供的计算机装置1400,处理器1404执行计算机程序时,通过卡方验证将叶片与SCADA系统中的传感器信号进行关联量化初步筛选,再利用故障数据对模型的输入参数进行进一步验证筛选,最后将筛选出来的SCADA参数建立一个风电叶片正常状态下的NSET模型作为半监督模型对叶片运行状态进行监测,利用已知风电叶片断裂数据集验证建模结果并设置预警阈值,可以在已有的SCADA数据基础上充分挖掘其叶片断裂预警的阈值,从而达到在线实时高效监测的目的。The computer device 1400 provided by the present invention, when the processor 1404 executes the computer program, associates and quantifies the blades with the sensor signals in the SCADA system through chi-square verification for preliminary screening, and then uses the fault data to further verify and screen the input parameters of the model. Finally, the screened SCADA parameters are used to establish an NSET model of a wind turbine blade in a normal state as a semi-supervised model to monitor the blade operation state. The modeling results are verified using a known wind turbine blade fracture data set and an early warning threshold is set. The threshold of the blade fracture early warning can be fully mined based on the existing SCADA data, thereby achieving the purpose of online real-time and efficient monitoring.

The present invention also proposes a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, the steps of the wind turbine blade fracture early warning method described in any of the above embodiments are implemented.

本发明提供的计算机可读存储介质,计算机程序被处理器执行时,通过卡方验证将叶片与SCADA系统中的传感器信号进行关联量化初步筛选,再利用故障数据对模型的输入参数进行进一步验证筛选,最后将筛选出来的SCADA参数建立一个风电叶片正常状态下的NSET模型作为半监督模型对叶片运行状态进行监测,利用已知风电叶片断裂数据集验证建模结果并设置预警阈值,可以在已有的SCADA数据基础上充分挖掘其叶片断裂预警的阈值,从而达到在线实时高效监测的目的。The computer-readable storage medium provided by the present invention, when the computer program is executed by the processor, performs preliminary screening of the association quantification of blades and sensor signals in the SCADA system through chi-square verification, and then uses fault data to further verify and screen the input parameters of the model, and finally uses the screened SCADA parameters to establish an NSET model of a wind turbine blade in a normal state as a semi-supervised model to monitor the blade operation state, and uses a known wind turbine blade fracture data set to verify the modeling results and set the warning threshold, so that the threshold of the blade fracture warning can be fully mined on the basis of the existing SCADA data, thereby achieving the purpose of online real-time and efficient monitoring.

以上所述仅为本发明的优选实施例而已,并不用于限制本发明,对于本领域的技术人员来说,本发明可以有各种更改和变化。凡在本发明的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本发明的保护范围之内。The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention shall be included in the protection scope of the present invention.

Claims (10)

1.一种基于SCADA数据的关联分析的风电叶片断裂预警方法,所述SCADA数据是风电机组的SCADA系统中的传感器组采集到的数据,所述SCADA数据包括的各个参数为:风电机组的功率、发电机转速、转子转速、风电叶片角度、网侧电流和风速;其特征在于,所述预警方法包括:1. A wind turbine blade fracture early warning method based on correlation analysis of SCADA data, wherein the SCADA data is data collected by a sensor group in a SCADA system of a wind turbine generator set, and the various parameters included in the SCADA data are: wind turbine generator set power, generator speed, rotor speed, wind turbine blade angle, grid-side current and wind speed; characterized in that the early warning method comprises: 获取某一段时间的风电叶片断裂前后的第一组SCADA数据;Obtain the first set of SCADA data before and after the wind turbine blade breaks over a certain period of time; 使用卡方验证对所述第一组SCADA数据进行筛选,以使用卡方验证筛选出与风电叶片运行状态有关的各个参数;Using chi-square validation to screen the first group of SCADA data, so as to screen out various parameters related to the operating status of the wind turbine blades using chi-square validation; 将风速这一参数加入到使用卡方验证筛选出的与风电叶片运行状态有关的各个参数中,以第一次筛选出与风电叶片运行状态有关的各个参数;The wind speed parameter is added to various parameters related to the operating state of wind turbine blades screened out using chi-square validation, so as to screen out various parameters related to the operating state of wind turbine blades for the first time; 获取某一段时间的风电叶片正常运行状态下的第二组SCADA数据;Obtain a second set of SCADA data of wind turbine blades in normal operating state for a certain period of time; 对所述第二组SCADA数据中的所述第一次筛选出与风电叶片运行状态有关的各个参数对应的原始数据进行提取;Extracting original data corresponding to each parameter related to the operating state of the wind turbine blade screened out for the first time from the second group of SCADA data; 根据数据提取后的第二组SCADA数据中的风速的数据和风电机组的功率的数据,拟合风速-功率散点图;According to the wind speed data and the power data of the wind turbine generator set in the second group of SCADA data after data extraction, a wind speed-power scatter plot is fitted; 采用DBSCAN聚类算法对所述风速-功率散点图进行数据清洗;The DBSCAN clustering algorithm is used to perform data 
cleaning on the wind speed-power scatter plot; 将数据清洗后的第二组SCADA数据划分为第一训练集和第一验证集;The second group of SCADA data after data cleaning is divided into a first training set and a first validation set; 根据所述第一训练集和第一验证集,搭建风电叶片正常运行状态下的第一个NSET模型;According to the first training set and the first validation set, a first NSET model is constructed under normal operating conditions of the wind turbine blade; 将原始的第一组SCADA数据输入到所述第一个NSET模型,计算所述第一个NSET模型输出的预测值和对应的观测值之间的残差En,以及根据所述预测值和观测值去拟合所述第一组SCADA数据对应的第一欧氏距离曲线,以观察到所述第一欧氏距离曲线在断裂点延迟的某个位置后开始处于上升趋势;Inputting the original first set of SCADA data into the first NSET model, calculating the residual E n between the predicted value output by the first NSET model and the corresponding observed value, and fitting the first Euclidean distance curve corresponding to the first set of SCADA data according to the predicted value and the observed value, so as to observe that the first Euclidean distance curve begins to be in an upward trend after a certain position of the breakpoint delay; 根据所述残差En计算所述使用卡方验证筛选出的与风电叶片运行状态有关的各个参数的故障累计贡献率eiCalculating the cumulative fault contribution rate e i of each parameter related to the wind turbine blade operation state screened out using chi-square validation according to the residual E n ; 去除所述使用卡方验证筛选出的与风电叶片运行状态有关的各个参数中的故障累计贡献率ei小于某一设定值的参数,并将风速这一参数加入到去除之后的与风电叶片运行状态有关的各个参数中,以第二次筛选出与风电叶片运行状态有关的各个参数;Removing the parameters whose cumulative contribution rate e of failure is less than a certain set value from among the parameters related to the operating state of the wind turbine blades screened out by using the chi-square verification, and adding the wind speed parameter to the parameters related to the operating state of the wind turbine blades after the removal, so as to screen out the parameters related to the operating state of the wind turbine blades for the second time; 对数据清洗后的第二组SCADA数据中的所述第二次筛选出与风电叶片运行状态有关的各个参数对应的数据进行提取,并将提取到的数据划分为第二训练集和第二验证集;Extracting data corresponding to the parameters related to the 
operating state of the wind turbine blades screened out for the second time from the cleaned second set of SCADA data, and dividing the extracted data into a second training set and a second validation set;
Establishing, from the second training set and the second validation set, a second NSET model of the wind turbine blade in its normal operating state;
Obtaining the Euclidean distance curve corresponding to the second validation set, and taking the maximum Euclidean distance on that curve as a first threshold;
Inputting the original first set of SCADA data into the second NSET model, calculating the residual $E_n'$ between the predicted values output by the second NSET model and the corresponding observed values, and fitting a second Euclidean distance curve for the first set of SCADA data from the predicted and observed values, so as to observe that the second Euclidean distance curve begins to rise at the fracture point;
Taking the maximum Euclidean distance on the second Euclidean distance curve before the fracture point as a second threshold;
Taking the maximum Euclidean distance on the second Euclidean distance curve after the fracture point as a third threshold;
Acquiring, in real time, the SCADA data collected by the sensor group in the SCADA system of the wind turbine:
Inputting the SCADA data acquired in real time into the second NSET model,
calculating the residual $E_n''$ between the predicted values output by the second NSET model and the corresponding observed values, and fitting a third Euclidean distance curve for the SCADA data acquired in real time from the predicted and observed values;
Issuing a fracture early warning for the wind turbine blades of the wind turbine according to the Euclidean distance values on the third Euclidean distance curve and the first threshold, the second threshold, and the third threshold.
2. The wind turbine blade fracture early warning method based on association analysis of SCADA data according to claim 1, characterized in that issuing the fracture early warning for the wind turbine blades of the wind turbine according to the Euclidean distance values on the third Euclidean distance curve and the first threshold, the second threshold, and the third threshold specifically comprises:
Determining the range in which each Euclidean distance value on the third Euclidean distance curve falls;
When a Euclidean distance value on the third Euclidean distance curve is greater than or equal to 0 and less than or equal to the first threshold, determining that the wind turbine blades of the wind turbine are in a normal operating state;
When a Euclidean distance value on the third Euclidean distance curve is greater than the first threshold and less than or equal to the second threshold, determining that the wind turbine blades of the wind turbine are in an abnormal
state;
When a Euclidean distance value on the third Euclidean distance curve is greater than or equal to the third threshold, determining that the wind turbine blades of the wind turbine are in a fractured state.
3. The wind turbine blade fracture early warning method based on association analysis of SCADA data according to claim 1, characterized in that:
The parameters related to the wind turbine blade operating state selected by the chi-square test are: wind turbine power, generator speed, rotor speed, wind turbine blade angle, first grid-side current, second grid-side current, and third grid-side current;
The parameters related to the wind turbine blade operating state obtained by the first screening are: wind speed, wind turbine power, generator speed, rotor speed, wind turbine blade angle, first grid-side current, second grid-side current, and third grid-side current;
The expression of the residual $E_n$ is:
$$E_n = x_n - x_n^*$$
where $x_n$ is the observed value of the $n$-th parameter in the first set of SCADA data; $x_n^*$ is the predicted value corresponding to the $n$-th parameter in the first set of SCADA data; $E_n$ is the difference between the observed value of the $n$-th parameter and its corresponding predicted value; $x_1$ to $x_7$ are the parameters related to the wind turbine blade operating state selected by the chi-square test; and
The calculating, from the residual $E_n$, of the cumulative fault contribution rate $e_i$ of
each parameter related to the wind turbine blade operating state selected by the chi-square test specifically comprises:
Calculating, from the residual $E_n$, the fault count $e_c$ of each parameter related to the wind turbine blade operating state selected by the chi-square test;
The expression of the fault count $e_c$ is:
Letting $a$ = (number of observation vector groups) / (number of observation groups input each time), and calculating, from the fault count $e_c$, the cumulative fault contribution rate $e_i$ of each parameter related to the wind turbine blade operating state selected by the chi-square test;
The expression of the cumulative fault contribution rate $e_i$ is:
And the removing of the parameters whose cumulative fault contribution rate $e_i$ is less than a set value from the parameters selected by the chi-square test, and the adding of the wind speed parameter to the remaining parameters so as to obtain the second screening of parameters related to the wind turbine blade operating state, specifically comprises:
Removing, from $x_1$ to $x_7$, the parameters whose cumulative fault contribution rate $e_i$ is less than a set value, and adding the wind speed parameter to the remaining parameters related to the operating state of the wind turbine blades, so as to screen out the parameters related to the operating state of the
wind turbine blades for the second time; wherein the parameters related to the operating state of the wind turbine blades screened out for the second time are: wind speed, wind turbine power, generator speed, rotor speed, and wind turbine blade angle.
4. The wind turbine blade fracture early warning method based on association analysis of SCADA data according to any one of claims 1 to 3, characterized in that establishing, from the first training set and the first validation set, the first NSET model of the wind turbine blade in its normal operating state specifically comprises:
Obtaining an original NSET model, and inputting the first training set into the original NSET model to obtain a trained NSET model;
Representing the overall observation matrix of the wind turbine as a matrix $M_{n \times b}$ of size $n \times b$, the expression of $M_{n \times b}$ being:
where $n$ is the time state and $b$ is the number of observed variables at each time; the row vector of $M_{n \times b}$ is $X_i = [x_i(t_1)\ x_i(t_2)\ \dots\ x_i(t_b)]$, which holds all observed values of a given observation parameter $X_i$ within an observation period; the column vector of $M_{n \times b}$ is $X(t_j) = [x_1(t_j)\ x_2(t_j)\ \dots\ x_b(t_j)]^T$, which holds the observed values of all observation parameters at time $t_j$;
Selecting the parameters over a period of time from $M_{n \times b}$ as the historical observation matrix $K$. The historical observation matrix $K$ represents the healthy state of each observation parameter.
The expression of the historical observation matrix $K$ is:
Selecting a portion of the state data from the historical observation matrix $K$, and using the selected state data to form a process memory matrix $D_n$. The process memory matrix $D_n$ can be expressed as:
Inputting the observation matrix $X_{obs}$ and the $D_n$ corresponding to $X_{obs}$ into the expression to obtain the prediction output matrix $X_{est}$;
Setting the first validation set as the observation matrix $X_{obs}$;
Inputting the first validation set into the trained NSET model;
Outputting, from the NSET model that has received the first validation set, the corresponding prediction output matrix $X_{est}$;
Calculating the residual between the observation matrix $X_{obs}$ corresponding to the first validation set and the corresponding prediction output matrix $X_{est}$, and fitting the Euclidean distance curve corresponding to the first validation set from that $X_{obs}$ and $X_{est}$, so as to establish the first NSET model of the wind turbine blade in its normal operating state; and
The establishing, from the second training set and the second validation set, of the second NSET model of the wind turbine blade in its normal operating state specifically comprises:
Obtaining an original NSET model, and inputting the second training set into the original NSET model to obtain a trained NSET model;
Setting the second validation set as the observation matrix $X_{obs}$;
Inputting
the second validation set into the trained NSET model;
Outputting, from the NSET model that has received the second validation set, the corresponding prediction output matrix $X_{est}$;
Calculating the residual between the observation matrix $X_{obs}$ corresponding to the second validation set and the corresponding prediction output matrix $X_{est}$, and fitting the Euclidean distance curve corresponding to the second validation set from that $X_{obs}$ and $X_{est}$, so as to establish the second NSET model of the wind turbine blade in its normal operating state.
5. The wind turbine blade fracture early warning method based on association analysis of SCADA data according to claim 4, characterized in that selecting a portion of the state data from the historical observation matrix $K$ and using the selected state data to form the process memory matrix $D_n$ specifically comprises:
Setting each observation vector of the historical observation matrix $K$ to consist of $n$ variables;
For each of the $n$ variables, dividing the interval [0, 1] into $h$ equal parts, and, with a step of $1/h$, finding a number of observation vectors X(1), X(2), ..., X(k) in the historical observation matrix $K$ and adding them to the process memory matrix $D_n$;
For each of the $n$ variables, the interval [0, 1] is divided into $h$ equal parts, and a number of observation vectors X(1), X(2), ...,
X(k) are found from the historical observation matrix $K$ with a step of $1/h$ and added to the process memory matrix $D_n$; this specifically comprises:
Setting $i = 1$, where $i$ is a positive integer;
Executing $A = (1/h) \times i$, where $h$ is a positive integer;
Setting $k = 1$, where $k$ is a positive integer;
Determining whether $|X(k) - A|$ is less than $\delta$, where $\delta$ is a positive number;
When $|X(k) - A|$ is less than $\delta$, adding X(k) to the process memory matrix $D_n$;
When $|X(k) - A|$ is greater than or equal to $\delta$, determining whether $k$ is greater than $M$, where $M$ is the number of columns of the historical observation matrix $K$;
When $k$ is less than or equal to $M$, executing $k = k + 1$ and returning to the step of determining whether $|X(k) - A|$ is less than $\delta$;
When $k$ is greater than $M$, determining whether $i$ is greater than $h$;
When $i$ is less than or equal to $h$, executing $i = i + 1$ and returning to the step of executing $A = (1/h) \times i$;
When $i$ is greater than $h$, ending execution.
6.
The wind turbine blade fracture early warning method based on association analysis of SCADA data according to claim 1, characterized in that screening the first set of SCADA data using the chi-square test, so as to select the parameters related to the wind turbine blade operating state, specifically comprises:
Establishing a null hypothesis $H_0$ that the wind turbine blade operating state is independent of each parameter in the first set of SCADA data;
Using the data of the wind turbine blade operating state as a first variable threshold;
Using, each time, one parameter in the first set of SCADA data as a second variable threshold;
Recording, for the first variable threshold, the actual number of data points when the wind turbine blade is normal as $a$, the actual number of data points when the wind turbine blade is faulty as $b$, and the actual total number of data points as $a + b$;
Recording, for the $y$-th parameter in the SCADA data, the actual number of data points when the wind turbine blade is normal as $c_y$, the actual number of data points when the wind turbine blade is faulty as $d_y$, and the actual total number of data points as $c_y + d_y$, where $y$ is a positive integer;
Recording, for the first variable threshold and the second variable threshold together, the actual value of the total number of data when the wind turbine blade is normal as $a + c_y$, the actual value of the total number of data of the
first variable threshold and the second variable threshold when the wind turbine blade is faulty as $b + d_y$, and the actual value of the overall total number of data as $a + b + c_y + d_y$;
Calculating, for the first variable threshold, the theoretical number of data points when the wind turbine blade is normal as $(a + b)(a + c_y)/(a + b + c_y + d_y)$, the theoretical number of data points when the wind turbine blade is faulty as $(a + b)(b + d_y)/(a + b + c_y + d_y)$, and the theoretical total number of data points as $a + b$;
Calculating, for the second variable threshold, the theoretical number of data points when the wind turbine blade is normal as $(c_y + d_y)(a + c_y)/(a + b + c_y + d_y)$, the theoretical number of data points when the wind turbine blade is faulty as $(c_y + d_y)(b + d_y)/(a + b + c_y + d_y)$, and the theoretical total number of data points as $c_y + d_y$;
Setting the degrees of freedom to 1;
Calculating the chi-square value according to the chi-square formula, the chi-square value measuring the degree of difference between each actual value and the corresponding theoretical value.
The chi-square value calculation formula is:
$$\chi^2 = \sum \frac{(A - T)^2}{T}$$
where $\chi^2$ is the chi-square value, $A$ is each actual value, and $T$ is each theoretical value;
Finding the corresponding P value from the degrees of freedom and the chi-square table, the P value being the probability of a Type I error (rejecting a true null hypothesis);
Selecting each parameter in the first set of SCADA data whose P value is less than 0.05 as a parameter related to the wind turbine blade operating state, so as to screen out in turn the parameters related to the wind turbine blade operating state.
7. The wind turbine blade fracture early warning method based on association analysis of SCADA data according to claim 1, characterized in that cleaning the wind speed-power scatter plot using the DBSCAN clustering algorithm specifically comprises:
Taking all points in the wind speed-power scatter plot as sample data $S$, and marking each data point in $S$ as unprocessed;
Assigning initial values to Eps and Minpts, where Eps is the neighborhood distance threshold of a data point $p$ in the sample data $S$, and Minpts is the minimum number of data points in the neighborhood of radius Eps around $p$;
Denoting the neighborhood of radius Eps around the data point $p$ as $N_{Eps}(p)$;
Clustering the high-density regions formed by Eps and Minpts;
The clustering of the high-density regions formed by Eps and Minpts specifically comprises:
Determining whether the data point $p$ has already been added to a cluster or marked as noise;
If the data point $p$ has already been added to a cluster or marked as noise, ending the classification;
If the data point $p$ has not been added to a cluster and has not been marked as noise, determining whether $N_{Eps}(p)$ contains at least Minpts objects;
If $N_{Eps}(p)$ contains at least Minpts objects, constructing a new cluster $U$ and adding the data point $p$ to $U$;
If $N_{Eps}(p)$ contains fewer than Minpts objects, marking the data point $p$ as a boundary point or noise.
8. The wind turbine blade fracture early warning method based on association analysis of SCADA data according to claim 7, characterized in that Eps is set to 4.5 and Minpts is set to 18.5.
9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the wind turbine blade fracture early warning method based on association analysis of SCADA data according to any one of claims 1 to 8.
10. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the steps of the wind turbine blade fracture early warning method based on association analysis of SCADA data according to any one of claims 1 to 8.
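The chi-square screening recited in claim 6 can be illustrated with a minimal Python sketch. It assumes the four counts (a, b, c_y, d_y) form a 2×2 table of actual values, with theoretical values computed as row total × column total / grand total, as the claim describes. The function name `chi_square_2x2` is hypothetical, and the closed-form P value via `math.erfc` (valid only for one degree of freedom) is a stand-in for the chi-square table lookup mentioned in the claim.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Return (chi-square value, P value) for a 2x2 table of actual counts.

    Rows: first variable (a normal, b faulty), second variable (c normal, d faulty).
    Hypothetical helper for illustration; not taken from the patent text.
    """
    total = a + b + c + d
    col_totals = (a + c, b + d)
    if min(a + b, c + d, *col_totals) <= 0:
        raise ValueError("every row and column total must be positive")

    chi2 = 0.0
    for row in ((a, b), (c, d)):
        row_total = sum(row)
        for actual, col_total in zip(row, col_totals):
            # Theoretical value T = row total * column total / grand total.
            theoretical = row_total * col_total / total
            # Accumulate (A - T)^2 / T over all four cells.
            chi2 += (actual - theoretical) ** 2 / theoretical

    # Assumption: with 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


chi2, p = chi_square_2x2(20, 5, 5, 20)
print(f"chi2={chi2:.3f}, p={p:.6f}, related={p < 0.05}")
```

Under this sketch a parameter would be kept when the returned P value is below 0.05, matching the selection rule in claim 6.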
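The DBSCAN cleaning step of claim 7 can be sketched as follows. This is the standard DBSCAN procedure written in plain Python, not a reproduction of the exact flow in the claim. The function names `region_query` and `dbscan` are hypothetical, and the points and parameter values in the usage lines are toy data, not wind speed-power measurements or the Eps and Minpts values of claim 8.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]
NOISE = -1  # label used for points that belong to no cluster


def region_query(points: Sequence[Point], idx: int, eps: float) -> List[int]:
    """Indices of all points within distance eps of points[idx] (itself included)."""
    px, py = points[idx]
    return [j for j, (qx, qy) in enumerate(points)
            if math.hypot(px - qx, py - qy) <= eps]


def dbscan(points: Sequence[Point], eps: float, min_pts: float) -> List[int]:
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels: List[Optional[int]] = [None] * len(points)  # None = unprocessed
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already in a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1  # start a new cluster from this core point
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return [label if label is not None else NOISE for label in labels]


toy_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
print(dbscan(toy_points, eps=1.5, min_pts=3))
```

In a cleaning step of this kind, points labeled as noise would be the candidates for removal from the scatter plot.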
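The construction of the process memory matrix in claim 5 can be sketched in Python under several assumptions that the claim text leaves open: the observation vectors are already normalized to [0, 1]; the test $|X(k) - A| < \delta$ is applied to one variable of X(k) at a time; scanning continues after a match; and a vector is added at most once. The function name `build_memory_matrix` is hypothetical.

```python
from typing import List, Sequence


def build_memory_matrix(history: Sequence[Sequence[float]],
                        h: int, delta: float) -> List[List[float]]:
    """Select observation vectors whose components lie near the grid i/h.

    history: observation vectors (one per time point), each normalized to [0, 1].
    h:       number of equal parts the interval [0, 1] is divided into.
    delta:   positive tolerance around each grid value.
    """
    if h <= 0 or delta <= 0:
        raise ValueError("h and delta must be positive")
    if not history:
        return []

    selected: List[int] = []
    seen = set()
    n_vars = len(history[0])
    for var in range(n_vars):            # each of the n variables
        for i in range(1, h + 1):        # grid values A = i / h
            target = i / h
            for k, vector in enumerate(history):
                # Assumption: compare only the current variable with A.
                if abs(vector[var] - target) < delta and k not in seen:
                    seen.add(k)          # assumption: no duplicate vectors
                    selected.append(k)
    return [list(history[k]) for k in selected]


history = [[0.1, 0.9], [0.5, 0.5], [1.0, 0.2], [0.3, 0.3]]
print(build_memory_matrix(history, h=2, delta=0.05))
```

A smaller step $1/h$ or a larger $\delta$ admits more vectors, which enlarges the memory matrix and raises the cost of each NSET estimate.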
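The threshold logic of claim 2 can be sketched in Python. The sketch assumes that a point on the Euclidean distance curve is the Euclidean norm of the residual between one observation vector and its NSET prediction, and that the three thresholds are ordered from smallest to largest. Claim 2 does not assign a state to distances between the second and third thresholds, so the sketch returns "unclassified" there; that label and the function names are hypothetical.

```python
import math
from typing import Sequence


def euclidean_distance(observed: Sequence[float],
                       predicted: Sequence[float]) -> float:
    """Euclidean norm of the residual between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)))


def classify_blade_state(distance: float, first: float,
                         second: float, third: float) -> str:
    """Map one Euclidean distance value to a blade state per claim 2."""
    if distance < 0:
        raise ValueError("a Euclidean distance cannot be negative")
    if not first <= second <= third:
        # Assumption: thresholds are ordered first <= second <= third.
        raise ValueError("thresholds must be in ascending order")
    if distance >= third:
        return "fractured"
    if distance <= first:
        return "normal"
    if distance <= second:
        return "abnormal"
    # Assumption: the claim leaves (second, third) unassigned.
    return "unclassified"


d = euclidean_distance([3.0, 0.0], [0.0, 4.0])
print(d, classify_blade_state(d, first=1.0, second=2.0, third=3.0))
```

In a monitoring loop, each newly acquired SCADA sample would yield one distance value, and an "abnormal" result would be the early warning that precedes the "fractured" state.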
CN202311540515.6A 2023-11-17 2023-11-17 Wind power blade fracture early warning method and device based on SCADA data association analysis and readable storage medium Pending CN117851877A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311540515.6A CN117851877A (en) 2023-11-17 2023-11-17 Wind power blade fracture early warning method and device based on SCADA data association analysis and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311540515.6A CN117851877A (en) 2023-11-17 2023-11-17 Wind power blade fracture early warning method and device based on SCADA data association analysis and readable storage medium

Publications (1)

Publication Number Publication Date
CN117851877A (en) 2024-04-09

Family

ID=90527923

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311540515.6A Pending CN117851877A (en) 2023-11-17 2023-11-17 Wind power blade fracture early warning method and device based on SCADA data association analysis and readable storage medium

Country Status (1)

Country Link
CN (1) CN117851877A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118855647A (en) * 2024-09-24 2024-10-29 山东特检科技有限公司 A wind power equipment multi-parameter monitoring method and system based on AI
CN118855647B (en) * 2024-09-24 2024-12-03 山东特检科技有限公司 AI-based multi-parameter monitoring method and system for wind power equipment

Similar Documents

Publication Publication Date Title
CN113255848B (en) Identification method of hydraulic turbine cavitation acoustic signal based on big data learning
CN111597651B (en) A method for evaluating performance degradation of rolling bearings based on HWPSO-SVDD model
CN106779200A (en) Based on the Wind turbines trend prediction method for carrying out similarity in the historical data
CN103323772B (en) Based on the running status of wind generator analytical approach of neural network model
CN108376298A (en) A kind of Wind turbines generator-temperature detection fault pre-alarming diagnostic method
CN109492790A (en) Wind turbines health control method based on neural network and data mining
CN116085212B (en) Method and system for monitoring running state of new energy wind turbine generator in real time
CN114215706A (en) Wind turbine generator blade cracking fault early warning method and device
CN108953071A (en) A kind of fault early warning method and system of paddle change system of wind turbines
CN117391499A (en) Photovoltaic power station reliability evaluation method and device
CN118760013A (en) Intelligent building equipment monitoring and early warning system and method
CN117851877A (en) Wind power blade fracture early warning method and device based on SCADA data association analysis and readable storage medium
CN115238573A (en) Method and system for predicting performance degradation trend of hydroelectric unit considering working parameters
CN114577470A (en) Fault diagnosis method and system for fan main bearing
CN115906437A (en) Fan state determination method, device, equipment and storage medium
CN114282338A (en) Method for identifying operating state of wind turbine and wind turbine
CN119004303A (en) Power plant equipment fault diagnosis method and system based on artificial intelligence and automation
EP3303835B1 (en) Method for windmill farm monitoring
CN113404651B (en) Data anomaly detection method and device for wind generating set
CN117310480A (en) Unit fault diagnosis system of power plant
Yu et al. Wind Power Data Cleaning Based on Autoencoder-Isolation Forest
CN115013254A (en) Fault early warning method for wind driven generator
Ou et al. Fault prediction model of wind power pitch system based on BP neural network
CN112017793A (en) Molecular pump maintenance decision management system and method for fusion device
Shi et al. Wind Turbine Condition Monitoring Based on Variable Importance of Random Forest

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination