CN110795690A - Wind power plant operation abnormal data detection method

Info

Publication number: CN110795690A
Application number: CN201911018724.8A
Authority: CN
Inventors: 陈子新; 刘永前; 王金山; 丛智慧; 李硕; 马亮; 韩爽; 阎洁; 李莉; 张�浩
Original assignee: DATANG (CHIFENG) NEW ENERGY Co Ltd
Current assignee: DATANG (CHIFENG) NEW ENERGY Co Ltd
Priority date: 2019-10-24
Filing date: 2019-10-24
Publication date: 2020-02-14

Abstract

A method for detecting abnormal operation data of a wind power plant, belonging to the technical field of wind power, comprises the following steps: collecting operating data, operating environment parameters and unit parameters of a wind turbine generator; preliminarily processing the data according to the power characteristics of the wind turbine generator; removing scattered abnormal operating data points of the wind turbine generator by a quartile method; adding the pitch angle, wind direction, temperature and humidity data at the corresponding moments to the original wind speed and power operation data of the unit to form multidimensional data, sorting the multidimensional data by wind speed, and dividing them into per-interval data sets at a wind speed interval equal to a first threshold; normalizing the data set of each wind speed interval; and taking the normalized data of each wind speed interval as the input of a multidimensional data clustering model, which detects the abnormal data. The method addresses the technical problem of low accuracy in identifying abnormal data.

Description

Wind power plant operation abnormal data detection method

Technical Field

The invention relates to the technical field of wind power, in particular to a method for detecting abnormal operation data of a wind turbine generator.

Background

The wind power curve is one of the main curves describing the relation between wind speed and power output, and it reflects the performance of a wind turbine generator. The equivalent power curve of a wind turbine generator can be used to evaluate the performance and operating condition of the unit and of the wind power plant. The underlying data are also important for wind power prediction: they directly affect the final accuracy of the prediction and therefore the formulation of power system scheduling plans. Determining the equivalent power curve places high demands on the quality of the wind speed and power data. However, because of limited sensor precision, electromagnetic interference, information processing errors, storage or communication faults, and wind curtailment and power limiting at the wind power plant, the originally recorded data contain a large amount of abnormal data, which seriously distorts the overall distribution of wind speed and power and the relation between them. If the data are used directly without processing, the analysis of the operating behavior of the wind turbine generator is distorted, which in turn degrades the performance analysis of the unit and the accuracy of its power prediction.

Identifying and cleaning the operating power data of wind turbine generators has therefore become a necessary step in wind turbine research. Existing methods for cleaning abnormal operating data fall, in principle, into three types: identifying abnormal data from the density or distance of the data points; building a mathematical model of the wind power curve to identify abnormal data; and identifying abnormal data from the positional distribution characteristics of the abnormal data. Studies of the existing methods show that, in practice, abnormal data are most commonly identified from data point density or distance, mainly by the quartile method, clustering methods, the optimal intra-group variance method, the 3σ method and the like. These methods are simple to operate and generally work well for wind turbine generator data, but their effect is limited when the amounts of normal and abnormal data are comparable, that is, for densely stacked abnormal data.

Disclosure of Invention

Aiming at the problems in the prior art, the invention provides a method for detecting abnormal operation data of a wind power plant, which comprises the following steps:

step 1: collecting operating data, operating environment parameters and unit parameters of a wind turbine generator;

step 2: performing preliminary processing on the power and wind speed data according to the power characteristics of the wind turbine generator;

step 3: identifying and processing abnormal data in the preliminarily processed wind speed and power data by a quartile method, and clearing the scattered abnormal data points of the wind turbine generator;

step 4: after the scattered abnormal operating data points of the wind turbine generator are cleared, adding the pitch angle, wind direction, temperature and humidity data at the corresponding moments to the original wind speed and power operation data of the unit to form multidimensional data, sorting the multidimensional data by wind speed, and dividing them into per-interval data sets at a wind speed interval equal to a first threshold;

step 5: normalizing the data set of each wind speed interval;

step 6: taking the normalized data of each wind speed interval as the input of a multidimensional data clustering model, which detects the abnormal data.

According to one aspect of the invention, the wind turbine operating data in step 1 includes wind speed, wind direction, power and pitch angle, the operating environment parameters include temperature and humidity, and the wind turbine parameters include cut-in wind speed, cut-out wind speed and turbine rated power.

According to an aspect of the invention, the step 2 comprises:

step 21: clearing data in which the output power of the wind turbine generator is greater than zero while the wind speed is below the cut-in wind speed; and

step 22: clearing data in which the power is less than or equal to zero while the wind speed is above the cut-in wind speed and below the cut-out wind speed.

According to an aspect of the invention, said step 3 comprises:

step 31: clearing abnormal wind speed data by the quartile method, comprising: sorting the data by power from high to low, dividing the data into small intervals at a power interval equal to a second threshold, and removing abnormal wind speed values within each interval by the quartile method;

step 32: clearing abnormal power data by the quartile method, comprising: sorting the data processed in step 31 by wind speed, dividing the data into small intervals at a wind speed interval equal to a third threshold, and removing abnormal power values within each interval by the quartile method.

According to an aspect of the invention, the first threshold, the second threshold and the third threshold are respectively 0.5 m/s, 25 kW and 0.5 m/s.

According to an aspect of the present invention, the normalization in step 5 is performed by:

$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$

where $X_i$ is the data set of the i-th wind speed interval; $x^{*}$ is the normalized data; $x$ is the original data of the selected variable in the interval; $x_{\min}$ is the minimum value of the selected variable within the interval; and $x_{\max}$ is the maximum value of the selected variable within the interval.
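The normalization of step 5 can be illustrated with a minimal sketch. It assumes ordinary min-max scaling applied column by column, which is what the definitions of x_min and x_max suggest; the function name `min_max_normalize`, the sample values and the handling of constant columns are illustrative assumptions, not part of the disclosure.

```python
import numpy as np

def min_max_normalize(X):
    """Scale each column (variable) of X to [0, 1] within one wind speed interval."""
    X = np.asarray(X, dtype=float)
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    span = x_max - x_min
    # Assumption: a constant column (span == 0) is mapped to 0 instead of dividing by zero.
    safe_span = np.where(span == 0, 1.0, span)
    return (X - x_min) / safe_span

# Hypothetical interval; columns: wind speed (m/s), power (kW), temperature (deg C)
interval = [[5.0, 300.0, 10.0],
            [5.2, 340.0, 10.0],
            [5.4, 380.0, 10.0]]
normalized = min_max_normalize(interval)
print(normalized)
```

Normalizing per interval rather than over the whole record keeps variables with large numeric ranges, such as power, from dominating the distances used later in clustering.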

According to one aspect of the invention, said step 6 comprises:

step 61: given a wind speed interval data set $X_i = \{x_1, x_2, \dots, x_n\}$ containing n samples, each sample being a d-dimensional real feature vector;

step 62: constructing a similarity matrix $S \in \mathbb{R}^{n \times n}$ from the input sample data, wherein $S_{ij} = W_{ij}$ and $W_{ij}$ is the element of the adjacency matrix W representing the weight between the i-th sample and the j-th sample;

the weights depend on a parameter σ that controls the width of the neighborhood, and the matrix is symmetric, $S_{ij} = S_{ji}$;

step 63: constructing a degree matrix D from the adjacency matrix W:

$D = \mathrm{diag}(d_1, d_2, \dots, d_n)$

wherein:

$d_i = \sum_{j=1}^{n} W_{ij}$

step 64: calculating the Laplacian matrix:

$L = D - W$

constructing the normalized Laplacian matrix:

$L = D^{-1/2} L D^{-1/2}$

step 65: calculating all eigenvalues and eigenvectors of the Laplacian matrix L, wherein L has n real eigenvalues greater than or equal to 0, that is, $0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n$, with corresponding eigenvectors $\alpha_1, \alpha_2, \dots, \alpha_n$;

step 66: calculating the eigenvalue gap values $\{C_1, C_2, \dots, C_{n-2}\}$, taking $C_{i\max} = \max\{C_1, C_2, \dots, C_{n-2}\}$, and letting $p = \arg C_{i\max}$;

step 67: selecting the first k eigenvectors, with $k = n - p - 1$, that is, $f = \{\alpha_1, \alpha_2, \dots, \alpha_{n-p-1}\}$;

step 68: normalizing by rows the matrix formed by the selected eigenvectors f, finally forming an n × k feature matrix F;

step 69: clustering the feature matrix F, each row of which is one data sample, into k clusters with the K-means algorithm to obtain the cluster division $\{U_1, U_2, \dots, U_k\}$ with cluster centers $\{w_1, w_2, \dots, w_k\}$;

step 610: clustering the data of each wind speed interval by the same steps, and finding among $\{w_1, w_2, \dots, w_k\}$ the center $w_{\max}$ of the largest cluster and its cluster $U_{\max}$; the data in the remaining clusters are abnormal data and are cleared.
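The matrix constructions of steps 62 to 65 can be sketched as follows. The text does not reproduce the weight formula, so this sketch assumes the commonly used Gaussian kernel exp(-||x_i - x_j||² / (2σ²)) with no self-loops; the function name `laplacian_spectrum`, the value of `sigma` and the sample points are likewise illustrative assumptions.

```python
import numpy as np

def laplacian_spectrum(X, sigma=1.0):
    """Return adjacency matrix W and the eigenpairs of the normalized Laplacian."""
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances between samples.
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    # Assumption: Gaussian kernel weights; the source does not print its formula.
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # assumption: no self-loops
    # Degree of each sample is the row sum of W; D = diag(d).
    d = W.sum(axis=1)
    L = np.diag(d) - W
    # Normalized Laplacian D^{-1/2} L D^{-1/2}; tiny floor guards isolated points.
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L_norm = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    # eigh returns real eigenvalues of a symmetric matrix in ascending order.
    eigvals, eigvecs = np.linalg.eigh(L_norm)
    return W, eigvals, eigvecs

# Hypothetical normalized samples forming two well-separated groups.
X = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
     [1.0, 1.0], [1.1, 1.0], [1.0, 1.1]]
W, eigvals, eigvecs = laplacian_spectrum(X, sigma=0.2)
print(np.round(eigvals, 4))
```

With two nearly disconnected groups, the two smallest eigenvalues are close to zero and the rest are clearly larger, which is the structure the later eigenvalue-gap step relies on.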

The clustering method of the multidimensional data clustering model in step 6 is spectral clustering. By calculating the gaps between the eigenvalues of the Laplacian matrix during spectral clustering, the method overcomes the drawback that the number of clusters must be supplied in advance: the clustering parameter k is determined automatically during clustering, which improves clustering accuracy and simplifies operation.
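Choosing k from eigenvalue gaps can be illustrated with the common eigengap heuristic. This is a sketch under assumptions: the text does not define the gap values C_i, and its indexing (k = n - p - 1) is not reproduced here; instead the sketch sorts the eigenvalues in ascending order, includes the zero eigenvalue, and takes k as the position of the largest gap. The function name `choose_k_by_eigengap` and the sample eigenvalues are hypothetical.

```python
import numpy as np

def choose_k_by_eigengap(eigvals, max_k=None):
    """Pick the number of clusters as the position of the largest eigenvalue gap."""
    lam = np.sort(np.asarray(eigvals, dtype=float))
    if max_k is None:
        max_k = len(lam) - 1
    # gaps[i] = lam[i + 1] - lam[i], considered for the first max_k gaps only.
    gaps = np.diff(lam[: max_k + 1])
    # A large gap after the k-th smallest eigenvalue suggests k clusters.
    return int(np.argmax(gaps)) + 1

# Hypothetical spectrum: three small eigenvalues, then a jump.
eigvals = [0.0, 0.02, 0.05, 0.9, 1.0, 1.1]
k = choose_k_by_eigengap(eigvals)
print(k)  # 3
```

The optional `max_k` bound is a practical safeguard for noisy spectra, where a large gap near the top of the spectrum would otherwise suggest an unreasonably large number of clusters.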

Compared with the prior art, the invention has the following effects:

1. The principle is sound and the identification efficiency is high. Because of wind curtailment and power limiting or communication faults, large amounts of stacked abnormal data often appear in certain wind speed intervals of the operating data of a wind turbine generator, in amounts comparable to the normal data, and existing methods identify such data poorly. To address this, after the scattered abnormal data are removed by the quartile method, environmental factors and other operating data are introduced within each wind speed interval. This provides more information about the abnormal data and increases the number of dimensions available for identification. A model for identifying abnormal wind power plant operation data based on multidimensional clustering is then established, which improves the identification of stacked abnormal data. The method can therefore both clear conventional scattered abnormal data points and effectively identify clusters of abnormal data.

2. The process is simple and broadly applicable. The method does not rely on training with a normal data set, offers a high degree of automation and strong generality, and can handle abnormal values of different types and data volumes for wind turbine generators and wind power plants. In addition, to address the drawback of existing methods that the input parameter k of the clustering method cannot be determined in advance for each wind turbine generator, the invention adopts a multidimensional spectral clustering method that determines the clustering parameter k automatically, which simplifies the operating steps.

Drawings

FIG. 1 is a block diagram of the overall steps of a wind farm abnormal operation data detection method based on multidimensional clustering according to an embodiment.

FIG. 2 is a detailed flowchart of the multidimensional clustering model in the step of identifying abnormal data of the wind turbine generator according to one embodiment.

Detailed Description

The technical solution of the present invention is described in further detail below with reference to FIGS. 1 and 2 and the specific embodiments.

As shown in FIG. 1, the invention discloses a method for detecting abnormal operation data of a wind power plant.

Step 1: collecting operating data, operating environment parameters and unit parameters of the wind turbine generator. The operating data comprise wind speed, wind direction, power and pitch angle; the operating environment parameters comprise temperature and humidity; and the unit parameters comprise cut-in wind speed, cut-out wind speed and rated power of the unit.

Step 2: performing preliminary processing on the power and wind speed data according to the power characteristics of the wind turbine generator.

Step 3: identifying and processing abnormal data in the preliminarily processed wind speed and power data by the quartile method, and clearing the scattered abnormal data points of the wind turbine generator. This first pass removes the conventional scattered abnormal data points of unit operation.

Step 4: after the scattered abnormal operating data points of the wind turbine generator are cleared, adding the pitch angle, wind direction, temperature and humidity data at the corresponding moments to the original wind speed and power operation data of the unit to form multidimensional data, sorting the multidimensional data by wind speed, and dividing them into per-interval data sets at a wind speed interval equal to the first threshold.

According to one embodiment, as shown in FIG. 1, the data are divided into per-interval data sets at a wind speed interval equal to the first threshold, 0.5 m/s, and the resulting data sets are as follows:

where $X_i$ is the data set of the i-th wind speed interval; v is the wind speed, d the wind direction, p the power, α the pitch angle, and t and h the ambient temperature and humidity at the corresponding moment; m is the number of wind speed intervals; and n is the number of data points in the i-th wind speed interval.
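The sorting and partitioning of step 4 can be sketched as below. The record layout (dictionaries with keys `v`, `d`, `p`, `alpha`, `t`, `h`), the function name `bin_by_wind_speed` and the sample values are hypothetical; only the 0.5 m/s interval width comes from the text.

```python
import math
from collections import defaultdict

def bin_by_wind_speed(records, width=0.5):
    """Sort records by wind speed and group them into intervals of the given width."""
    bins = defaultdict(list)
    for rec in sorted(records, key=lambda r: r["v"]):
        # Interval index i covers wind speeds in [i * width, (i + 1) * width).
        index = math.floor(rec["v"] / width)
        bins[index].append(rec)
    return dict(bins)

# Hypothetical multidimensional records: wind speed, direction, power,
# pitch angle, temperature, humidity.
records = [
    {"v": 5.6, "d": 180.0, "p": 420.0, "alpha": 0.5, "t": 11.0, "h": 40.0},
    {"v": 5.1, "d": 175.0, "p": 310.0, "alpha": 0.4, "t": 10.5, "h": 42.0},
    {"v": 6.2, "d": 190.0, "p": 560.0, "alpha": 0.6, "t": 11.2, "h": 39.0},
    {"v": 5.4, "d": 178.0, "p": 350.0, "alpha": 0.4, "t": 10.8, "h": 41.0},
]
bins = bin_by_wind_speed(records)
for index in sorted(bins):
    print(index, [rec["v"] for rec in bins[index]])
```

Each value of `bins` plays the role of one data set X_i and would be normalized and clustered on its own.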

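Step 5: normalizing the data set of each wind speed interval.

According to one embodiment, the normalization is performed by:

$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$

where $x^{*}$ is the normalized data, $x$ is the original data of the selected variable in the interval, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the selected variable within the interval.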

Step 6: taking the normalized data of each wind speed interval as the input of a multidimensional data clustering model, which detects the abnormal data. Because the multidimensional clustering method applies dimensionality reduction, it handles the complexity of high-dimensional clustering better than conventional clustering methods; according to one embodiment, the clustering method in the present application is a spectral clustering model.

The preliminary processing in step 2 specifically comprises:

Step 21: clearing data in which the output power of the unit is greater than zero while the wind speed is below the cut-in wind speed. Below the cut-in wind speed the output power of the unit is less than or equal to zero, so data with power greater than zero are abnormal; such data are usually caused by communication faults and are cleared directly. So that they do not affect the overall characteristics of the v-p data and do not complicate the later handling of abnormal data, they are excluded from subsequent processing.

Step 22: clearing data in which the power is less than or equal to zero while the wind speed is above the cut-in wind speed and below the cut-out wind speed. Such data are regarded as abnormal data caused by unit shutdown and are cleared directly.
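The two rules of steps 21 and 22 translate directly into a filter. In this minimal sketch the record layout (dictionaries with keys `v` and `p`), the function name `preliminary_filter` and the cut-in and cut-out values of 3 m/s and 25 m/s are illustrative assumptions.

```python
def preliminary_filter(records, v_in, v_out):
    """Drop records that violate the basic power characteristics of the unit."""
    kept = []
    for rec in records:
        v, p = rec["v"], rec["p"]
        # Step 21: positive power below the cut-in wind speed is abnormal.
        if v < v_in and p > 0:
            continue
        # Step 22: zero or negative power between cut-in and cut-out is abnormal.
        if v_in < v < v_out and p <= 0:
            continue
        kept.append(rec)
    return kept

# Hypothetical records: wind speed in m/s, power in kW.
records = [
    {"v": 2.0, "p": 50.0},   # removed by step 21
    {"v": 2.0, "p": 0.0},    # kept
    {"v": 8.0, "p": 0.0},    # removed by step 22
    {"v": 8.0, "p": 900.0},  # kept
    {"v": 8.0, "p": -5.0},   # removed by step 22
]
kept = preliminary_filter(records, v_in=3.0, v_out=25.0)
print(kept)
```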

The step 3 specifically comprises the following steps:

Step 31: clearing abnormal wind speed data by the quartile method. Specifically: the data are sorted by power from high to low and divided into small intervals at a power interval equal to the second threshold, for example 25 kW, and abnormal wind speed values within each interval are removed by the quartile method. Where the power is higher than the rated power, the wind speed data are concentrated and few, and the normal data lie mainly on the right side, that is, in the high wind speed region, so no upper limit is set there when identifying abnormal values.

In the identification rule, $Q_3$ is the third quartile of the wind speed, $Q_1$ is the first quartile, and $p_r$ is the rated power of the wind turbine generator. $I_{QR}$ is the interquartile range, calculated as:

$I_{QR} = Q_3 - Q_1$

Step 32: clearing abnormal power data by the quartile method. Specifically: the data processed in step 31 are sorted by wind speed and divided into small intervals at a wind speed interval equal to the third threshold, for example 0.5 m/s, and abnormal power values within each interval are removed by the quartile method. Because most abnormal data lie below the power curve and the amounts of normal and abnormal data may be comparable, no upper limit is set when identifying abnormal values, so that normal data are not deleted by mistake. A power value is identified as abnormal when:

$P < Q_3 - 1.5\,I_{QR}, \quad v > v_{in}$
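A lower-bound-only quartile test of the kind used in step 32 can be sketched as follows. The printed inequality anchors the fence at Q₃, whereas the conventional Tukey fence is anchored at Q₁; the hypothetical `anchor` argument lets the sketch cover both without asserting which the inventors intended. The function name `lower_fence_mask`, the use of NumPy's default percentile interpolation and the sample values are assumptions, and the wind speed rule of step 31 is not sketched because its formula is not reproduced in the text.

```python
import numpy as np

def lower_fence_mask(values, anchor="q3", whisker=1.5):
    """Return True for values kept by a lower-bound-only quartile test."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1  # interquartile range I_QR = Q3 - Q1
    # anchor="q3" follows the inequality as printed; anchor="q1" is the
    # conventional Tukey lower fence.
    base = q3 if anchor == "q3" else q1
    # No upper limit: only values below the fence are flagged as abnormal.
    return values >= base - whisker * iqr

# Hypothetical power values (kW) within one 0.5 m/s wind speed interval.
power = [400.0, 410.0, 420.0, 430.0, 440.0, 450.0, 20.0]
mask = lower_fence_mask(power)
print(mask.tolist())
```

For these values Q₁ = 405 and Q₃ = 435, so the fence is 390 with the Q₃ anchor and 360 with the Q₁ anchor; the 20 kW point falls below both.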

The step 6 specifically comprises the following steps:

Step 61: given a wind speed interval data set $X_i = \{x_1, x_2, \dots, x_n\}$ containing n samples, each sample is a d-dimensional real feature vector, where d is 5.

Step 62: constructing a similarity matrix $S \in \mathbb{R}^{n \times n}$ from the input sample data, wherein $S_{ij} = W_{ij}$ and $W_{ij}$ is the element of the adjacency matrix W representing the weight between the i-th sample and the j-th sample.

The weights depend on a parameter σ that controls the width of the neighborhood, and the matrix is symmetric, $S_{ij} = S_{ji}$.

Step 63: constructing a degree matrix D from the adjacency matrix W, wherein:

$D = \mathrm{diag}(d_1, d_2, \dots, d_n), \quad d_i = \sum_{j=1}^{n} W_{ij}$

Step 64: calculating the Laplacian matrix:

$L = D - W$

constructing the normalized Laplacian matrix:

$L = D^{-1/2} L D^{-1/2}$

Step 65: calculating all eigenvalues and eigenvectors of the Laplacian matrix L, wherein L has n real eigenvalues greater than or equal to 0, that is, $0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n$, with corresponding eigenvectors $\alpha_1, \alpha_2, \dots, \alpha_n$. In the later steps the eigenvalue 0 and its eigenvector are ignored, since they carry no practical meaning for the calculation.

Step 66: calculating the eigenvalue gap values $\{C_1, C_2, \dots, C_{n-2}\}$, taking $C_{i\max} = \max\{C_1, C_2, \dots, C_{n-2}\}$, and letting $p = \arg C_{i\max}$. Here p does not denote power; it is an index value.

Step 67: selecting the first k eigenvectors, with $k = n - p - 1$, that is, $f = \{\alpha_1, \alpha_2, \dots, \alpha_{n-p-1}\}$. Here again p does not denote power; it is the value obtained in step 66.

Step 68: normalizing by rows the matrix formed by the selected eigenvectors f, finally forming the n × k feature matrix F.
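The disclosure's final steps, clustering the rows of F with K-means and keeping only the largest cluster, can be sketched together with the row normalization of step 68. This is a minimal sketch under assumptions: "largest" is read as the cluster with the most samples, the deterministic farthest-point initialization is chosen only to make the example reproducible, and the names `row_normalize`, `kmeans`, `normal_mask` and the sample matrix are hypothetical.

```python
import numpy as np

def row_normalize(F):
    """Scale every row of F to unit Euclidean length (zero rows stay zero)."""
    F = np.asarray(F, dtype=float)
    norms = np.linalg.norm(F, axis=1, keepdims=True)
    return F / np.where(norms == 0, 1.0, norms)

def kmeans(F, k, n_iter=100):
    """Plain Lloyd's algorithm with farthest-point initialization (an assumption)."""
    centers = [F[0]]
    for _ in range(1, k):
        # Next center: the sample farthest from all centers chosen so far.
        dist = np.min([np.sum((F - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(F[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((F[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = np.array([
            F[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def normal_mask(F, k):
    """True for samples in the cluster with the most members; others are abnormal."""
    labels, _ = kmeans(row_normalize(F), k)
    largest = np.bincount(labels, minlength=k).argmax()
    return labels == largest

# Hypothetical n x k feature matrix (n = 6, k = 2) built from selected eigenvectors.
F = [[1.0, 0.05], [0.9, 0.0], [1.0, -0.05], [0.95, 0.02],
     [0.0, 1.0], [0.05, 0.9]]
mask = normal_mask(F, k=2)
print(mask.tolist())
```

Samples where the mask is False would be cleared from the wind speed interval as abnormal data.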

The invention introduces environmental factors and other unit operating parameters alongside the existing wind speed and power data, increasing the modeling dimensions, and establishes an abnormal data identification model based on multidimensional clustering. By optimizing the clustering algorithm, the clustering parameter k is determined automatically, which simplifies operation and improves the identification of stacked abnormal data. The method takes into account the influence of multiple factors on the power of the wind turbine generator and combines the quartile method with a multidimensional clustering model to identify abnormal operating data, whereas traditional clustering methods work on low-dimensional data and identify abnormal data with low accuracy. The proposed model determines the number of clusters automatically, whereas the vast majority of clustering methods require the clustering parameter k to be supplied in advance for each wind turbine generator. The method is therefore simple to operate and computationally efficient in practice.

The embodiments set forth in the foregoing description are exemplary only, and modifications may be made therein by those skilled in the art without departing from the spirit of the present application, and such modifications are intended to be within the scope of the present application.

Claims

1. A wind power plant operation abnormal data detection method is characterized by comprising the following steps:

step 5: normalizing the data set of each wind speed cell;

2. The method of claim 1, wherein:

the wind turbine generator operation data in the step 1 comprise wind speed, wind direction, power and pitch angle, the operation environment parameters comprise temperature and humidity, and the wind turbine generator parameters comprise cut-in wind speed, cut-out wind speed and rated power of the unit.

3. The method of claim 1, wherein:

the step 2 comprises the following steps:

4. The method of claim 1, wherein:

the step 3 comprises the following steps:

5. The method of claim 4, wherein:

the first threshold, the second threshold and the third threshold are respectively 0.5 m/s, 25 kW and 0.5 m/s.

6. The method of claim 1, wherein:

the normalization processing mode in the step 5 is as follows:

wherein $X_i$ is the data set of the i-th wind speed cell,

7. The method of claim 1, wherein:

the step 6 comprises the following steps:

step 62: constructing a similarity matrix $S \in \mathbb{R}^{n \times n}$ from the input sample data, wherein $S_{ij} = W_{ij}$ and $W_{ij}$ is the element of the adjacency matrix representing the weight between the i-th sample and the j-th sample;

wherein:

step 64: calculating the Laplacian matrix:

$L = D - W$

constructing the normalized Laplacian matrix:

$L = D^{-1/2} L D^{-1/2}$

step 65: calculating all eigenvalues and eigenvectors of the Laplacian matrix L, wherein L has n real eigenvalues greater than or equal to 0, that is, $0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n$, with corresponding eigenvectors $\alpha_1, \alpha_2, \dots, \alpha_n$;