CN108412710B - Wind power data cleaning method for wind turbines - Google Patents

Wind power data cleaning method for wind turbines

Info

Publication number
CN108412710B
CN108412710B CN201810091183.0A CN201810091183A CN108412710B CN 108412710 B CN108412710 B CN 108412710B CN 201810091183 A CN201810091183 A CN 201810091183A CN 108412710 B CN108412710 B CN 108412710B
Authority
CN
China
Prior art keywords
data
power
wind
abnormal
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810091183.0A
Other languages
Chinese (zh)
Other versions
CN108412710A (en)
Inventor
沈小军
付雪姣
周冲成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University
Priority to CN201810091183.0A
Publication of CN108412710A
Application granted
Publication of CN108412710B
Legal status: Active
Anticipated expiration

Classifications

    • F MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
    • F03 MACHINES OR ENGINES FOR LIQUIDS; WIND, SPRING, OR WEIGHT MOTORS; PRODUCING MECHANICAL POWER OR A REACTIVE PROPULSIVE THRUST, NOT OTHERWISE PROVIDED FOR
    • F03D WIND MOTORS
    • F03D80/00 Details, components or accessories not provided for in groups F03D1/00 - F03D17/00
    • F03D80/50 Maintenance or repair
    • F03D80/55 Cleaning
    • F MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
    • F03 MACHINES OR ENGINES FOR LIQUIDS; WIND, SPRING, OR WEIGHT MOTORS; PRODUCING MECHANICAL POWER OR A REACTIVE PROPULSIVE THRUST, NOT OTHERWISE PROVIDED FOR
    • F03D WIND MOTORS
    • F03D17/00 Monitoring or testing of wind motors, e.g. diagnostics
    • F MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
    • F03 MACHINES OR ENGINES FOR LIQUIDS; WIND, SPRING, OR WEIGHT MOTORS; PRODUCING MECHANICAL POWER OR A REACTIVE PROPULSIVE THRUST, NOT OTHERWISE PROVIDED FOR
    • F03D WIND MOTORS
    • F03D9/00 Adaptations of wind motors for special use; Combinations of wind motors with apparatus driven thereby; Wind motors specially adapted for installation in particular locations
    • F03D9/20 Wind motors characterised by the driven apparatus
    • F03D9/25 Wind motors characterised by the driven apparatus the apparatus being an electrical generator
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E10/00 Energy generation through renewable energy sources
    • Y02E10/70 Wind energy
    • Y02E10/72 Wind turbines with rotation axis in wind direction

Abstract

The present invention relates to a wind power data cleaning method for wind turbines, comprising the following steps: (1) dividing the wind power data to be cleaned into several data intervals according to wind speed; (2) cleaning each data interval with the change-point grouping and quartile method to reject abnormal data; (3) combining the cleaned data intervals to obtain the cleaned wind power data. Compared with the prior art, the method cleans data effectively and efficiently and is broadly applicable.

Description

Wind power data cleaning method for wind turbines
Technical field
The present invention relates to the technical field of big data for wind power generation, and in particular to a wind power data cleaning method for wind turbines.
Background art
Wind power is a clean, renewable energy source and is rapidly becoming an important part of sustainable development and energy strategy. However, randomly varying wind speed and direction make wind power fluctuating, intermittent and random, which adversely affects the stability and reliability of power system operation. An important way to mitigate these effects is to improve the predictability of wind power generation by mining wind turbine operating data. The wind power curve obtained from measured wind speed and power can be used to assess the performance and operating condition of a wind turbine and is valuable for diagnosing turbine faults; time-series power data are also the basis for wind power forecasting and for assessing the impact of wind power on the grid. Accurate wind speed and power data from actual turbine operation therefore provide the basic data support for the economic and secure operation of wind farms and for optimizing control strategies. During wind farm operation, however, unit shutdowns, load reduction, communication noise and equipment failures produce large amounts of abnormal data. Current methods for collecting, managing, analyzing and mining wind turbine operating data still have many shortcomings: they cannot accurately identify quality differences in the collected data, so they cannot effectively support the correct screening and reasonable optimization of raw data, and data quality cannot be guaranteed. If such data are used directly without processing, the resulting statistical characteristics of wind power generation are distorted, which affects the analysis of the operating state and operating characteristics of the wind turbines. To improve data quality, data cleaning has become an indispensable step in data mining.
Data cleaning is the basis of data mining and directly affects the reliability of subsequent analysis and application, yet traditional data cleaning methods often fail when they encounter large amounts of stacked abnormal data.
Summary of the invention
The object of the present invention is to overcome the above drawbacks of the prior art by providing a wind power data cleaning method for wind turbines.
The object of the present invention is achieved through the following technical solution:
A wind power data cleaning method for wind turbines, comprising the following steps:
(1) dividing the wind power data to be cleaned into several data intervals according to wind speed;
(2) cleaning each data interval with the change-point grouping and quartile method to reject abnormal data;
(3) combining the cleaned data intervals to obtain the cleaned wind power data.
Step (2) is specifically:
(21) using the change-point grouping method to identify the stacked abnormal data at the bottom of the wind speed-power curve, the stacked abnormal data in the middle of the curve, and part of the scattered abnormal data around the curve, and rejecting these abnormal data to obtain an intermediate data set;
(22) applying the quartile method to the intermediate data set to identify the stacked abnormal data at the top of the curve and the remaining scattered abnormal data around the curve, and deleting these abnormal data to obtain the normal data set.
Step (21) is specifically:
(211) sorting the wind speed-power data in descending order of power to obtain a power-descending data set, and computing the variance change rate at each data point of the power-descending data set;
(212) obtaining the change-point position of the variance change rate;
(213) taking the data before the change-point position in the power-descending data set as the intermediate data set, and rejecting the data after the change-point position as abnormal data.
Step (22) is specifically:
(221) using the quartile method to obtain the inner fences for outliers of the intermediate data set;
(222) taking the data of the intermediate data set that lie within the inner fences as normal data, and rejecting all other data as abnormal data.
In step (211), the variance change rate is obtained as follows:
(211a) obtain the power-descending data set W:
$$W = \{(v_1, p_1), (v_2, p_2), \dots, (v_n, p_n)\},$$
where $v_i$ is the wind speed of the i-th data point, $p_i$ is the power of the i-th data point, $i = 1, 2, \dots, n$, n is the total number of wind speed-power data points in the data interval, and $p_i > p_{i+1}$ for $i = 1, 2, \dots, n-1$;
(211b) obtain the variance $s_i$ of the i-th data point:
$$s_i = \frac{1}{i} \sum_{j=1}^{i} \left(p_j - \bar{p}_i\right)^2,$$
where $p_j$ is the power of the j-th data point and $\bar{p}_i$ is the mean power of the 1st through i-th data points;
(211c) obtain the variance change rate:
$$k(i) = |s_i - s_{i-1}|, \quad i = 2, 3, \dots, n.$$
In step (212), methods for identifying the change point of the variance change rate and obtaining its position include the Bayesian method, the least squares method, the maximum likelihood method, the local comparison method and the wavelet analysis method.
Step (221) is specifically:
(221a) sorting the wind speed-power data of the intermediate data set in ascending order of power to obtain a power-ascending intermediate data set;
(221b) using the quartile method to obtain the power values at the three cut points that divide the power-ascending intermediate data set into four equal parts, denoted $Q_1$, $Q_2$ and $Q_3$;
(221c) obtaining the interquartile range IQR:
$$IQR = Q_3 - Q_1;$$
(221d) obtaining the inner fences for outliers $[F_l, F_u]$, where $F_l$ is the lower limit and $F_u$ is the upper limit:
$$[F_l, F_u] = [Q_1 - 1.5\,IQR,\; Q_3 + 1.5\,IQR].$$
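In step (221b), $Q_1$, $Q_2$ and $Q_3$ are obtained as follows:
(b1) for the power-ascending intermediate data set $X = \{x_1, x_2, \dots, x_m\}$, calculate the median $Q_2$:
$$Q_2 = \begin{cases} x_{(m+1)/2}, & m \text{ odd} \\ \dfrac{x_{m/2} + x_{m/2+1}}{2}, & m \text{ even} \end{cases}$$
(b2) calculate $Q_1$ and $Q_3$:
if m = 2k, where k is a natural number, split the data sample X into two parts at $Q_2$, with $Q_2$ not included in either part, and apply the method of (b1) to each part to obtain the medians $Q_2'$ and $Q_2''$, with $Q_2' < Q_2''$; then $Q_1 = Q_2'$ and $Q_3 = Q_2''$;
if m = 4k+1, where k is a natural number, then:
$$Q_1 = 0.25\,x_k + 0.75\,x_{k+1}, \quad Q_3 = 0.75\,x_{3k+1} + 0.25\,x_{3k+2};$$
if m = 4k+3, where k is a natural number, then:
$$Q_1 = 0.75\,x_{k+1} + 0.25\,x_{k+2}, \quad Q_3 = 0.25\,x_{3k+2} + 0.75\,x_{3k+3}.$$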
Compared with the prior art, the present invention has the following advantages:
(1) the proposed change-point grouping and quartile method combines the change-point grouping method with the quartile method; the procedure is sound, the cleaning is effective and efficient, and the method is broadly applicable;
(2) the change-point grouping method cleans the stacked abnormal data at the bottom and in the middle of the curve and part of the scattered abnormal data around the curve, while the quartile method cleans the stacked abnormal data at the top of the curve and the remaining scattered abnormal data around the curve. Combined, the two methods clean all four kinds of abnormal data reliably and avoid the failure of traditional data cleaning methods when they encounter large amounts of stacked abnormal data.
Brief description of the drawings
Fig. 1 is a distribution map of the types of abnormal data addressed by the present invention;
Fig. 2 is a flow diagram of the wind power data cleaning method for wind turbines of the present invention;
Fig. 3 is the variance change rate curve of the power data in the 8.5 m/s to 9 m/s wind speed interval in the present embodiment;
Fig. 4 shows the abnormal data identified by the change-point grouping method in the 8.5 m/s to 9 m/s wind speed interval in the present embodiment;
Fig. 5 shows the cleaning of the abnormal data remaining in Fig. 4;
Fig. 6 shows the wind power data to be cleaned in the present embodiment;
Fig. 7 shows the result of cleaning the data of the present embodiment with the method of the present invention;
Fig. 8 shows the result of cleaning the data of the present embodiment with the traditional LOF algorithm.
Detailed description of the embodiments
The present invention is described in detail below with reference to the drawings and a specific embodiment. Note that the following description of the embodiment is merely illustrative in nature; it is not intended to limit the invention, its applications or its uses, and the invention is not limited to the following embodiment.
Embodiment
The wind power curve describes the relationship between wind speed and the output power of a unit. It is an important basis for designing the control system of a wind turbine and an important indicator for evaluating the generation performance of wind turbines and the operating condition of wind farms. Data collected from wind farms usually contain a large number of abnormal data points. Data can become abnormal for many reasons, including unplanned shutdowns, wind curtailment, wind speed sensor failure, communication equipment failure, electromagnetic interference, turbines tripping off the grid, extreme weather, and dirty or damaged blades. Abnormal data produced by different causes have different distribution characteristics on the wind power curve. According to how the data points are distributed on the wind power curve, abnormal data can be divided into four classes: stacked abnormal data at the bottom, in the middle and at the top of the curve, and scattered abnormal data around the curve. The distribution of each class is shown in Fig. 1. Specifically:
(1) Stacked abnormal data at the bottom of the curve
Stacked abnormal data at the bottom of the curve appear on the wind power curve as a dense horizontal band (the data labeled 1 in Fig. 1). Causes include turbine faults, failures of communication equipment or measuring terminals, and unplanned shutdowns for maintenance. In these cases the theoretical output power of the wind turbine is zero. If the blades are not rotating while the measurement and control system of the turbine still draws power, negative wind power values may also appear in the data. Bottom-of-curve abnormal data therefore fluctuate around zero power and appear as stacked data.
(2) Stacked abnormal data in the middle of the curve
Stacked abnormal data in the middle of the curve appear on the wind power curve as one or more dense horizontal bands lying outside the lower bound of the probabilistic power curve (the data labeled 2 in Fig. 1). The cause is wind curtailment or communication failure. Wind curtailment is a control measure that limits the output power of a wind turbine to below its normal output. In actual wind farm operation, because the peak-shaving and frequency-regulation capability and the transmission capacity of the current power system are insufficient, forced curtailment has become routine, which leaves a large amount of abnormal data in the raw records. In practice many data sets do not record curtailment measures, so curtailment becomes one source of abnormal data. In this case the wind speed record reflects the actual variation, while the turbine output stays at a low level for a long time; even when the wind speed exceeds the rated wind speed, the output remains below full output for a long period, limited to some value. Mid-curve abnormal data therefore stay at a constant level below full output over a wide range of wind speeds, and appear as stacked data.
(3) Stacked abnormal data at the top of the curve
Stacked abnormal data at the top of the curve appear on the wind power curve as one or more dense horizontal bands lying outside the upper bound of the probabilistic power curve (the data labeled 3 in Fig. 1). The cause is usually a communication error or a wind speed sensor failure. The wind speed sensor is the key instrument for monitoring wind speed; it is mounted at the rear of the nacelle, where it is difficult to clean and maintain regularly, so it often fails or jams and the measured wind speed no longer matches the actual conditions. Top-of-curve abnormal data keep the power constant over periods of low recorded wind speed, and appear as stacked data.
(4) Scattered abnormal data around the curve
Scattered abnormal data around the curve appear on the wind power curve as low-density, irregular scatter points near the power curve (the data labeled 4 in Fig. 1). They are caused by random factors such as signal propagation noise, sensor failure and extreme weather. Abnormal data caused by random factors fluctuate randomly around the normal value, and the size of the fluctuation is also random. Scattered abnormal data are therefore distributed randomly outside the boundaries of the probabilistic power curve.
As shown in Fig. 2, a wind power data cleaning method for wind turbines comprises the following steps:
(1) dividing the wind power data to be cleaned into several data intervals according to wind speed;
(2) cleaning each data interval with the change-point grouping and quartile method to reject abnormal data;
(3) combining the cleaned data intervals to obtain the cleaned wind power data.
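The three steps above can be sketched as a split / clean / combine skeleton. This is only an illustrative sketch: the function names, the fixed interval width of 0.5 m/s (chosen to match the 8.5 m/s to 9 m/s interval used later in the embodiment) and the pluggable `clean_interval` callable are assumptions, not part of the patent text.

```python
import math
from collections import defaultdict


def split_by_wind_speed(points, bin_width=0.5):
    """Step (1): group (wind_speed, power) pairs into wind-speed intervals.

    bin_width is an assumed parameter; interval idx covers
    [idx * bin_width, (idx + 1) * bin_width).
    """
    bins = defaultdict(list)
    for v, p in points:
        bins[math.floor(v / bin_width)].append((v, p))
    return dict(sorted(bins.items()))


def combine_intervals(bins):
    """Step (3): merge the per-interval data back into one list."""
    return [pt for idx in sorted(bins) for pt in bins[idx]]


def clean_by_interval(points, clean_interval, bin_width=0.5):
    """Steps (1)-(3); clean_interval is any per-interval cleaning function
    (a hypothetical stand-in for the step (2) procedure)."""
    bins = split_by_wind_speed(points, bin_width)
    cleaned = {idx: clean_interval(pts) for idx, pts in bins.items()}
    return combine_intervals(cleaned)
```

Passing the cleaning routine as a callable keeps the interval handling independent of how abnormal data are detected inside each interval.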
Step (2) is specifically:
(21) using the change-point grouping method to identify the stacked abnormal data at the bottom of the wind speed-power curve, the stacked abnormal data in the middle of the curve, and part of the scattered abnormal data around the curve, and rejecting these abnormal data to obtain an intermediate data set;
(22) applying the quartile method to the intermediate data set to identify the stacked abnormal data at the top of the curve and the remaining scattered abnormal data around the curve, and deleting these abnormal data to obtain the normal data set.
Step (21) is specifically:
(211) sorting the wind speed-power data in descending order of power to obtain a power-descending data set, and computing the variance change rate at each data point of the power-descending data set;
(212) obtaining the change-point position of the variance change rate;
(213) taking the data before the change-point position in the power-descending data set as the intermediate data set, and rejecting the data after the change-point position as abnormal data.
In step (211), the variance change rate is obtained as follows:
(211a) obtain the power-descending data set W:
$$W = \{(v_1, p_1), (v_2, p_2), \dots, (v_n, p_n)\}, \tag{1}$$
where $v_i$ is the wind speed of the i-th data point, $p_i$ is the power of the i-th data point, $i = 1, 2, \dots, n$, n is the total number of wind speed-power data points in the data interval, and $p_i > p_{i+1}$ for $i = 1, 2, \dots, n-1$;
(211b) obtain the variance $s_i$ of the i-th data point:
$$s_i = \frac{1}{i} \sum_{j=1}^{i} \left(p_j - \bar{p}_i\right)^2, \tag{2}$$
where $p_j$ is the power of the j-th data point and $\bar{p}_i$ is the mean power of the 1st through i-th data points;
(211c) obtain the variance change rate:
$$k(i) = |s_i - s_{i-1}|, \quad i = 2, 3, \dots, n. \tag{3}$$
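A minimal sketch of steps (211a) to (211c) follows. It assumes that the variance of the first i points is the population variance (divisor i), which makes the variance of the first point equal to zero; the function name and the running-update scheme (Welford's method) are illustrative choices, not taken from the patent.

```python
def variance_change_rate(points):
    """Sort (wind_speed, power) pairs by descending power, then return
    (ordered_points, s, k) where s[i-1] is the variance of the first i
    powers and k[i-2] = |s_i - s_(i-1)| for i = 2..n.
    """
    ordered = sorted(points, key=lambda vp: vp[1], reverse=True)
    s = []
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford update)
    for i, (_, p) in enumerate(ordered, start=1):
        delta = p - mean
        mean += delta / i
        m2 += delta * (p - mean)
        s.append(m2 / i)  # assumption: population variance, divisor i
    k = [abs(s[i] - s[i - 1]) for i in range(1, len(s))]
    return ordered, s, k
```

The running update avoids recomputing the mean and variance from scratch for every prefix, so the whole sequence k(i) is obtained in a single pass after sorting.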
In step (212), methods for identifying the change point of the variance change rate and obtaining its position include the Bayesian method, the least squares method, the maximum likelihood method, the local comparison method and the wavelet analysis method. The present embodiment uses the least squares method to identify the change point of the variance change rate k(i), as follows:
Let the independent variables $x_1, \dots, x_r$ and the dependent variable k be functions of the index i, written $x_q(i)$, $q = 1, \dots, r$, and $k(i)$. Each $x_q(i)$ is a fully known, non-random function of i, and $k(i)$ is a random variable. The sequence is split into a front segment and a rear segment, each following its own linear model, with the regression coefficients changing abruptly at i = j:
$$k(i) = \begin{cases} \beta_1' x(i) + \varepsilon_i, & i \le j \\ \beta_2' x(i) + \varepsilon_i, & i > j \end{cases} \tag{4}$$
where $x(i) = (x_1(i), \dots, x_r(i))'$ and the coefficient column vectors $\beta_1 = (\beta_{11}, \dots, \beta_{1r})'$ and $\beta_2 = (\beta_{21}, \dots, \beta_{2r})'$ are unequal; j is then the regression change point. Because k(i) is continuous, the two segments satisfy the constraint:
$$\beta_1' x(j) = \beta_2' x(j). \tag{5}$$
According to the principle of least squares, the weighted objective function of this model is:
$$S(j, \beta_1, \beta_2) = \sum_{i \le j} w_i \left(k(i) - \beta_1' x(i)\right)^2 + \sum_{i > j} w_i \left(k(i) - \beta_2' x(i)\right)^2, \tag{6}$$
where each weight $w_i$ is inversely proportional to the error variance of the sample k(i).
The estimate of the change point j is obtained by minimizing the objective function (6) under constraint (5):
$$\hat{j} = \arg\min_{j} \min_{\beta_1, \beta_2} S(j, \beta_1, \beta_2). \tag{7}$$
After step (213) obtains the change-point position j, the power-descending data set W is split into normal data and abnormal data:
$$W_n = \{(v_i, p_i) \mid i = 1, \dots, j\}, \quad W_o = \{(v_i, p_i) \mid i = j+1, \dots, n\}, \tag{8}$$
where $W_n$ is the normal data set and $W_o$ is the abnormal data set within the power-descending data set W.
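The least squares change-point search can be illustrated with a deliberately simplified sketch. The simplifications are assumptions made for brevity and are not the embodiment's exact procedure: all weights are equal, the regressors are just a constant and the index, the continuity constraint between the two segments is not enforced, and every candidate split is tried by brute force. The function names and the `min_seg` parameter are hypothetical.

```python
def line_sse(ys, start):
    """Residual sum of squares of an ordinary least-squares line fitted to
    ys at x-coordinates start, start + 1, ..."""
    n = len(ys)
    xs = range(start, start + n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return sum((y - mean_y - slope * (x - mean_x)) ** 2 for x, y in zip(xs, ys))


def find_change_point(k, min_seg=2):
    """Return the split index j (number of leading elements in the first
    segment) that minimizes the total squared error of two separate
    straight-line fits, one on k[:j] and one on k[j:]."""
    if len(k) < 2 * min_seg:
        raise ValueError("sequence too short for two segments")
    best_j, best_cost = None, float("inf")
    for j in range(min_seg, len(k) - min_seg + 1):
        cost = line_sse(k[:j], 0) + line_sse(k[j:], j)
        if cost < best_cost:
            best_j, best_cost = j, cost
    return best_j
```

The brute-force search costs O(n²) because each candidate split refits both lines; for intervals with a few thousand points, as in the embodiment, that remains manageable, and prefix sums could reduce it to O(n) if needed.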
The change-point grouping procedure based on the variance change rate is illustrated with the 8.5 m/s to 9 m/s wind speed interval of the measured operating data of a wind turbine. The raw data contain 1837 data points in this wind speed interval. The data are sorted in descending order of power, the variances $s_i$ of the 1837 power values are computed in turn according to formula (2), and the variance change rate $k_i$ at each point is then obtained, as shown in Fig. 3.
As Fig. 3 shows, the variance change rate changes markedly between the 1500th and the 1600th point. Least squares change-point identification gives the change point of the sequence as j = 1562; that is, the curve model changes abruptly at the 1562nd point. This abrupt change in the variance change rate is attributed to a large amount of highly dispersed abnormal data in the raw data, so the data after the 1562nd point are judged to be abnormal. Separating normal and abnormal data at the change point j gives the normal data set $W_n$ and the abnormal data set $W_o$. Fig. 4 shows the abnormal data identified by the change-point grouping method in the 8.5 m/s to 9 m/s wind speed interval. As Fig. 4 shows, the change-point grouping method accurately identifies the first and second classes of stacked abnormal data below the wind speed-power curve and part of the scattered abnormal data, but it cannot effectively identify the third class of stacked abnormal data above the upper bound of the wind power curve or part of the fourth class of scattered abnormal data. Quartiles and their upper and lower bounds measure the overall distribution of the data and reflect its center and spread, so in theory the outlier cut-off points of the quartile method can effectively identify the remaining abnormal data.
The normal data set $W_n$ is therefore taken as the intermediate data set for step (22): the quartile method identifies the stacked abnormal data at the top of the curve and the remaining scattered abnormal data around the curve, and these abnormal data are deleted to obtain the normal data set.
Step (22) is specifically:
(221) using the quartile method to obtain the inner fences for outliers of the intermediate data set;
(222) taking the data of the intermediate data set that lie within the inner fences as normal data, and rejecting all other data as abnormal data.
Step (221) is specifically:
(221a) sorting the wind speed-power data of the intermediate data set in ascending order of power to obtain a power-ascending intermediate data set;
(221b) using the quartile method to obtain the power values at the three cut points that divide the power-ascending intermediate data set into four equal parts, denoted $Q_1$, $Q_2$ and $Q_3$;
(221c) obtaining the interquartile range IQR:
$$IQR = Q_3 - Q_1;$$
(221d) obtaining the inner fences for outliers $[F_l, F_u]$, where $F_l$ is the lower limit and $F_u$ is the upper limit:
$$[F_l, F_u] = [Q_1 - 1.5\,IQR,\; Q_3 + 1.5\,IQR].$$
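Steps (221c), (221d) and (222) amount to a simple interval test on power. The sketch below assumes the quartiles Q1 and Q3 have already been computed and are passed in; the function names and the choice to treat the fence values themselves as normal are illustrative assumptions.

```python
def inner_fences(q1, q3, factor=1.5):
    """Return (F_l, F_u) = (Q1 - 1.5 * IQR, Q3 + 1.5 * IQR)."""
    iqr = q3 - q1
    return q1 - factor * iqr, q3 + factor * iqr


def split_by_fences(points, q1, q3):
    """Split (wind_speed, power) pairs into normal and abnormal lists
    according to whether the power lies inside the inner fences
    (boundary values are kept as normal, an assumed convention)."""
    lower, upper = inner_fences(q1, q3)
    normal = [(v, p) for v, p in points if lower <= p <= upper]
    abnormal = [(v, p) for v, p in points if not lower <= p <= upper]
    return normal, abnormal
```

For example, with Q1 = 1000 kW and Q3 = 1200 kW the IQR is 200 kW, so the fences are 700 kW and 1500 kW, and a point at 1600 kW in that interval is rejected.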
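In step (221b), $Q_1$, $Q_2$ and $Q_3$ are obtained as follows:
(b1) for the power-ascending intermediate data set $X = \{x_1, x_2, \dots, x_m\}$, calculate the median $Q_2$:
$$Q_2 = \begin{cases} x_{(m+1)/2}, & m \text{ odd} \\ \dfrac{x_{m/2} + x_{m/2+1}}{2}, & m \text{ even} \end{cases}$$
(b2) calculate $Q_1$ and $Q_3$:
if m = 2k, where k is a natural number, split the data sample X into two parts at $Q_2$, with $Q_2$ not included in either part, and apply the method of (b1) to each part to obtain the medians $Q_2'$ and $Q_2''$, with $Q_2' < Q_2''$; then $Q_1 = Q_2'$ and $Q_3 = Q_2''$;
if m = 4k+1, where k is a natural number, then:
$$Q_1 = 0.25\,x_k + 0.75\,x_{k+1}, \quad Q_3 = 0.75\,x_{3k+1} + 0.25\,x_{3k+2};$$
if m = 4k+3, where k is a natural number, then:
$$Q_1 = 0.75\,x_{k+1} + 0.25\,x_{k+2}, \quad Q_3 = 0.25\,x_{3k+2} + 0.75\,x_{3k+3}.$$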
Fig. 5 shows the result of applying the quartile method to the abnormal data remaining in Fig. 4. The third class of stacked abnormal data and the part of the scattered abnormal data that the change-point grouping method did not identify are effectively identified and cleaned by the quartile method. Combining the change-point grouping method with the quartile method therefore identifies all four typical classes of abnormal data well.
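The quartile computation of step (221b) can be sketched as below. The even case follows the split-at-the-median rule described in the text. For the odd cases m = 4k+1 and m = 4k+3, the sketch assumes the common textbook weighting of the two neighbouring order statistics (0.25/0.75 weights); that rule, the function names and the minimum sample size check are assumptions of this sketch.

```python
def median(sorted_values):
    """Median of an already sorted, non-empty list."""
    m = len(sorted_values)
    mid = m // 2
    if m % 2:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) for a list of power values."""
    x = sorted(values)
    m = len(x)
    if m < 2:
        raise ValueError("need at least two values")
    q2 = median(x)
    if m % 2 == 0:
        # m = 2k: Q1 and Q3 are the medians of the lower and upper halves
        half = m // 2
        return median(x[:half]), q2, median(x[half:])
    k = m // 4
    if m % 4 == 1:
        # m = 4k+1 (assumed rule); indices below are 0-based
        q1 = 0.25 * x[k - 1] + 0.75 * x[k]
        q3 = 0.75 * x[3 * k] + 0.25 * x[3 * k + 1]
    else:
        # m = 4k+3 (assumed rule)
        q1 = 0.75 * x[k] + 0.25 * x[k + 1]
        q3 = 0.25 * x[3 * k + 1] + 0.75 * x[3 * k + 2]
    return q1, q2, q3
```

Other quartile conventions (for example the interpolation schemes offered by statistics libraries) give slightly different Q1 and Q3 on small samples, so the fences derived from them can differ at the margin.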
To verify the proposed data cleaning method and procedure, operating data from the wind turbines of a domestic wind farm are used as a case study. The basic parameters of the turbines are: rated power 2000 kW, rotor diameter 95.9 m, cut-in wind speed 3 m/s, rated wind speed 11 m/s, and cut-out wind speed (10 min average) 25 m/s. The operating data of unit No. 7, whose abnormal data distribution is fairly typical, are chosen to illustrate how the combined algorithm cleans the data. The raw data of unit No. 7 for 12 consecutive months are shown in Fig. 6. As Fig. 6 shows, the data of unit No. 7 contain all types of abnormal data.
The result of cleaning with the method of the present invention is shown in Fig. 7. All types of abnormal data are effectively identified, and the part judged to be normal data is close to the ideal wind power curve. This shows that the proposed change-point grouping and quartile method is feasible for identifying and cleaning abnormal wind speed-power operating data of wind turbines, and that its cleaning performance is not affected by stacked abnormal data. Table 1 records the data deletion rate and cleaning efficiency of the change-point grouping and quartile method for unit No. 7. The method deletes about 20% of the data, a figure that depends on how much abnormal data is present. Cleaning the 12 months of operating data of the unit takes about 40 s, so the cleaning efficiency is high.
To demonstrate that the change-point grouping and quartile procedure is reasonable and effective, the raw operating data of unit No. 7 are used to compare the proposed method with the traditional local outlier factor (LOF) data cleaning method in terms of cleaning effect, cleaning efficiency and data deletion rate. Table 1 compares the two methods by data deletion rate and cleaning efficiency. The cleaning efficiency of both methods was measured in the same running environment, so the figures are comparable.
Table 1. Data cleaning results of the two algorithms
Cleaning method                              Original data volume   Remaining data volume   Data deletion rate   Cleaning time
Change-point grouping and quartile method    52528                  41475                   21.04%               39.59 s
Local outlier factor (LOF) algorithm         52528                  41475                   21.04%               15 min 27 s
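The figures in Table 1 can be cross-checked with a few lines of arithmetic. The snippet below only re-derives the deletion rate and the ratio of the two cleaning times from the tabulated values; the variable names are illustrative.

```python
original = 52528          # original data volume (Table 1)
remaining = 41475         # remaining data volume (Table 1)

removed = original - remaining
deletion_rate = removed / original

proposed_seconds = 39.59              # change-point grouping and quartile method
lof_seconds = 15 * 60 + 27            # LOF algorithm: 15 min 27 s
speedup = lof_seconds / proposed_seconds

print(f"removed {removed} points, deletion rate {deletion_rate:.2%}")
print(f"LOF took about {speedup:.1f} times as long")
```

The deletion rate works out to 11053 / 52528, which rounds to the 21.04% reported in the table, and the LOF run takes roughly 23 times as long as the proposed method.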
The result of cleaning the operating data of unit No. 7 with the LOF algorithm is shown in Fig. 8. LOF is a typical density-based outlier detection algorithm: it judges whether a point is abnormal by comparing the density of the point with the density of its neighbors. In the present embodiment the number of neighbors is set to 20. The LOF algorithm requires the proportion of abnormal data to be set in advance; to allow comparison with the change-point grouping and quartile method, the 21.04% of data points with the highest local outlier factor are judged to be abnormal, so that the amount of abnormal data matches that identified by the change-point grouping and quartile method. Comparing Fig. 7 and Fig. 8 shows that, when both methods identify the same amount of abnormal data, the LOF algorithm identifies stacked abnormal data poorly.
The above embodiment is only an example and does not limit the scope of the invention. The embodiment can also be implemented in various other ways, and various omissions, substitutions and changes can be made without departing from the technical idea of the invention.

Claims (7)

1. A wind power data cleaning method for wind turbines, characterized in that the method comprises the following steps:
(1) dividing the wind power data to be cleaned into several data intervals according to wind speed;
(2) cleaning each data interval with the change-point grouping and quartile method to reject abnormal data;
(3) combining the cleaned data intervals to obtain the cleaned wind power data;
wherein step (2) is specifically:
(21) using the change-point grouping method to identify the stacked abnormal data at the bottom of the wind speed-power curve, the stacked abnormal data in the middle of the curve, and part of the scattered abnormal data around the curve, and rejecting these abnormal data to obtain an intermediate data set;
(22) applying the quartile method to the intermediate data set to identify the stacked abnormal data at the top of the curve and the remaining scattered abnormal data around the curve, and deleting these abnormal data to obtain the normal data set.
2. The wind power data cleaning method for wind turbines according to claim 1, characterized in that step (21) is specifically:
(211) sorting the wind speed-power data in descending order of power to obtain a power-descending data set, and computing the variance change rate at each data point of the power-descending data set;
(212) obtaining the change-point position of the variance change rate;
(213) taking the data before the change-point position in the power-descending data set as the intermediate data set, and rejecting the data after the change-point position as abnormal data.
3. The wind power data cleaning method for wind turbines according to claim 1, characterized in that step (22) is specifically:
(221) using the quartile method to obtain the inner fences for outliers of the intermediate data set;
(222) taking the data of the intermediate data set that lie within the inner fences as normal data, and rejecting all other data as abnormal data.
4. The wind turbine wind power data cleaning method according to claim 2, characterized in that the variance change rate in step (211) is obtained as follows:
(211a) obtaining the power-descending data set W:
$W = \{(v_1, p_1), (v_2, p_2), \ldots, (v_n, p_n)\}$,
where $v_i$ denotes the wind speed of the i-th data point, $p_i$ denotes the power of the i-th data point, $i = 1, 2, \ldots, n$, n is the total number of wind speed-power data points in the data interval, and $p_i > p_{i+1}$ for $i = 1, 2, \ldots, n-1$;
(211b) obtaining the variance $s_i$ of the i-th data point, computed over the powers $p_1, \ldots, p_i$, where $p_j$ is the power of the j-th data point and $\bar{p}_i$ denotes the mean power of the 1st through i-th data points;
(211c) obtaining the variance change rate:
$K(i) = |s_i - s_{i-1}|, \quad i = 2, 3, \ldots, n.$
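The variance change rate defined in steps (211a) to (211c) can be sketched in Python as follows. The sketch assumes a population variance (division by i), since the normalisation of the variance is not stated in the text; the function name and the return layout are illustrative only.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (wind speed, power)


def variance_change_rate(points: List[Point]) -> Tuple[List[Point], List[float]]:
    """Return the power-descending data set W and the list K(2), ..., K(n)."""
    # (211a) sort by power, largest first
    ordered = sorted(points, key=lambda vp: vp[1], reverse=True)

    # (211b) running variance s_i of p_1..p_i using Welford's update.
    # Assumption: population variance (divide by i), so s_1 = 0.
    variances: List[float] = []
    mean, m2 = 0.0, 0.0
    for i, (_, p) in enumerate(ordered, start=1):
        delta = p - mean
        mean += delta / i
        m2 += delta * (p - mean)
        variances.append(m2 / i)

    # (211c) K(i) = |s_i - s_{i-1}| for i = 2..n
    rates = [abs(variances[i] - variances[i - 1])
             for i in range(1, len(variances))]
    return ordered, rates
```

The first element of the returned list corresponds to K(2), because K(i) is only defined from i = 2 onward.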
5. The wind turbine wind power data cleaning method according to claim 2, characterized in that in step (212) the change-point position is obtained by performing change-point identification on the variance change rate, the specific methods for which include the Bayesian method, the least squares method, the maximum likelihood method, the local comparison method, and the wavelet analysis method.
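Claim 5 names the least squares method as one option for step (212). One common least-squares formulation, shown here as an assumption rather than as the claimed procedure, picks the single split of a sequence that minimises the total squared deviation of each segment from its own mean. The function name is hypothetical.

```python
from typing import Sequence


def least_squares_change_point(values: Sequence[float]) -> int:
    """Return tau (1 <= tau < n): the first tau values form the first segment."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to place a change point")

    total = sum(values)
    total_sq = sum(v * v for v in values)

    best_tau, best_cost = 1, float("inf")
    left_sum = left_sq = 0.0
    for tau in range(1, n):
        v = values[tau - 1]
        left_sum += v
        left_sq += v * v
        right_sum = total - left_sum
        right_sq = total_sq - left_sq
        # Sum of squared deviations from the segment mean, for both segments.
        cost = (left_sq - left_sum ** 2 / tau) \
             + (right_sq - right_sum ** 2 / (n - tau))
        if cost < best_cost:
            best_tau, best_cost = tau, cost
    return best_tau
```

Because K(i) starts at i = 2, an index found on the sequence K(2), ..., K(n) has to be shifted by one before it is used to split the power-descending data set.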
6. The wind turbine wind power data cleaning method according to claim 3, characterized in that step (221) specifically comprises:
(221a) sorting the wind speed-power data of the intermediate data set in ascending order of power to obtain a power-ascending intermediate data set;
(221b) using the quartile method to find the power values at the three cut points that divide the power-ascending intermediate data set into four equal parts, denoted $Q_1$, $Q_2$, and $Q_3$ respectively;
(221c) obtaining the interquartile range IQR:
$IQR = Q_3 - Q_1$;
(221d) obtaining the inner fences for outliers $[F_l, F_u]$, where $F_l$ is the lower limit and $F_u$ is the upper limit:
$[F_l, F_u] = [Q_1 - 1.5\,IQR,\ Q_3 + 1.5\,IQR]$.
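Steps (221c) and (221d), together with the filtering of step (222), can be sketched in Python as below. The quartiles are taken as inputs here, and the function names are illustrative assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (wind speed, power)


def inner_fences(q1: float, q3: float) -> Tuple[float, float]:
    """Return (F_l, F_u) = (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def keep_within_fences(points: List[Point], q1: float, q3: float) -> List[Point]:
    """Keep points whose power lies in the closed interval [F_l, F_u]."""
    lower, upper = inner_fences(q1, q3)
    return [(v, p) for v, p in points if lower <= p <= upper]
```

For example, with Q1 = 100 and Q3 = 200 the fences are -50 and 350, so a point with power 400 would be rejected.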
7. The wind turbine wind power data cleaning method according to claim 6, characterized in that $Q_1$, $Q_2$, and $Q_3$ in step (221b) are obtained as follows:
(b1) for the power-ascending intermediate data set $X = \{x_1, x_2, \ldots, x_m\}$, computing the median $Q_2$:
$$Q_2 = \begin{cases} x_{(m+1)/2}, & m \text{ odd} \\ \tfrac{1}{2}\left(x_{m/2} + x_{m/2+1}\right), & m \text{ even} \end{cases}$$
(b2) computing $Q_1$ and $Q_3$:
if $m = 2k$, where k is a natural number, the data sample X is split into two parts at $Q_2$, with $Q_2$ itself included in neither part; the medians $Q'_2$ and $Q''_2$ of the two parts are computed by the method of (b1), with $Q'_2 < Q''_2$; then $Q_1 = Q'_2$ and $Q_3 = Q''_2$;
if $m = 4k+1$, where k is a natural number, then:
if $m = 4k+3$, where k is a natural number, then:
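The quartile computation of steps (b1) and (b2) can be sketched as follows. The median and the m = 2k case follow the text directly. For m = 4k+1 and m = 4k+3 the formulas are not reproduced in the text, so the sketch assumes one widely used convention with the same case split (weights of 0.25 and 0.75 on two neighbouring order statistics); the weights actually claimed may differ.

```python
from typing import List, Sequence, Tuple


def median(xs: Sequence[float]) -> float:
    """Median of an ascending sequence."""
    m = len(xs)
    mid = m // 2
    return xs[mid] if m % 2 else (xs[mid - 1] + xs[mid]) / 2


def quartiles(values: Sequence[float]) -> Tuple[float, float, float]:
    """Return (Q1, Q2, Q3) for a sample of at least two values."""
    xs: List[float] = sorted(values)
    m = len(xs)
    if m < 2:
        raise ValueError("need at least two values")
    q2 = median(xs)

    if m % 2 == 0:
        # m = 2k: split at Q2, which falls between the two halves.
        q1, q3 = median(xs[: m // 2]), median(xs[m // 2:])
    elif m % 4 == 1:
        # m = 4k+1. ASSUMED weights, not taken from the text.
        k = (m - 1) // 4
        q1 = 0.25 * xs[k - 1] + 0.75 * xs[k]          # x_k, x_{k+1}
        q3 = 0.75 * xs[3 * k] + 0.25 * xs[3 * k + 1]  # x_{3k+1}, x_{3k+2}
    else:
        # m = 4k+3. ASSUMED weights, not taken from the text.
        k = (m - 3) // 4
        q1 = 0.75 * xs[k] + 0.25 * xs[k + 1]                # x_{k+1}, x_{k+2}
        q3 = 0.25 * xs[3 * k + 1] + 0.75 * xs[3 * k + 2]    # x_{3k+2}, x_{3k+3}
    return q1, q2, q3
```

The comments give the 1-based indices used in the claim notation next to the 0-based list indices used by Python.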
CN201810091183.0A 2018-01-30 2018-01-30 A kind of Wind turbines wind power data cleaning method Active CN108412710B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810091183.0A CN108412710B (en) 2018-01-30 2018-01-30 A kind of Wind turbines wind power data cleaning method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810091183.0A CN108412710B (en) 2018-01-30 2018-01-30 A kind of Wind turbines wind power data cleaning method

Publications (2)

Publication Number Publication Date
CN108412710A CN108412710A (en) 2018-08-17
CN108412710B true CN108412710B (en) 2019-08-06

Family

ID=63126658

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810091183.0A Active CN108412710B (en) 2018-01-30 2018-01-30 A kind of Wind turbines wind power data cleaning method

Country Status (1)

Country Link
CN (1) CN108412710B (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109408772A (en) * 2018-10-11 2019-03-01 四川长虹电器股份有限公司 To the restoration methods of the abnormal data in continuity data
CN109783486B (en) * 2019-01-17 2020-11-24 华北电力大学 Data cleaning method and device and server
CN109492048A (en) * 2019-01-21 2019-03-19 国网河北省电力有限公司经济技术研究院 A kind of extracting method, system and the terminal device of power consumer electrical characteristics
CN110094300B (en) * 2019-01-24 2020-04-21 上海电力学院 Wind deviation correction method, device, equipment and medium for wind turbine generator
CN109919199A (en) * 2019-02-13 2019-06-21 东南大学 The detection method of Wind turbines abnormal data based on image procossing
CN109918364B (en) * 2019-02-28 2020-10-27 华北电力大学 Data cleaning method based on two-dimensional probability density estimation and quartile method
CN110134919B (en) * 2019-04-30 2020-12-15 华北电力大学 Method for cleaning abnormal data of wind turbine generator
CN112696324B (en) * 2019-10-22 2022-08-23 北京金风科创风电设备有限公司 Wind power generator group data processing method, device and system
CN110795690A (en) * 2019-10-24 2020-02-14 大唐(赤峰)新能源有限公司 Wind power plant operation abnormal data detection method
CN110955650B (en) * 2019-11-20 2023-06-23 云南电网有限责任公司电力科学研究院 Method for cleaning out-of-tolerance data of digital hygrothermograph in standard laboratory
CN111291032A (en) * 2020-01-23 2020-06-16 福州大学 Combined wind power plant data cleaning method
CN111476402A (en) * 2020-03-16 2020-07-31 云南电网有限责任公司 Wind power generation capacity prediction method coupling meteorological information and EMD technology
CN111522808B (en) * 2020-04-29 2023-07-28 贵州电网有限责任公司 Abnormal operation data processing method for wind turbine generator
CN112032003B (en) * 2020-09-01 2021-08-17 浙江运达风电股份有限公司 Method for monitoring operation performance of large wind turbine generator
CN113236508B (en) * 2021-05-31 2022-04-05 浙江运达风电股份有限公司 Method for detecting wind speed-power abnormal data of wind generating set
CN113777351B (en) * 2021-08-26 2022-09-20 同济大学 Fault diagnosis method and device for wind speed sensor of wind power plant
CN114033631B (en) * 2021-11-08 2023-08-01 浙江运达风电股份有限公司 Online identification method for wind energy utilization coefficient of wind turbine generator
CN114091354B (en) * 2022-01-07 2022-05-17 国能日新科技股份有限公司 Method and device for acquiring wind turbine generator power prediction model sample set
CN114548843B (en) * 2022-04-25 2022-07-15 北京寄云鼎城科技有限公司 Method for processing power data of wind driven generator, computer equipment and medium
CN114969017B (en) * 2022-07-28 2022-11-11 深圳量云能源网络科技有限公司 Wind power data cleaning method, cleaning device and prediction method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6735550B1 (en) * 2001-01-16 2004-05-11 University Corporation For Atmospheric Research Feature classification for time series data
CN102545211B (en) * 2011-12-21 2013-11-06 西安交通大学 Universal data preprocessing device and method for wind power prediction
CN102663513B (en) * 2012-03-13 2016-04-20 华北电力大学 Utilize the wind power combined prediction modeling method of grey relational grade analysis
CN103291544B (en) * 2013-06-21 2016-01-13 华北电力大学 Digitizing Wind turbines power curve method for drafting
CN107067100B (en) * 2017-01-25 2020-12-04 国网冀北电力有限公司 Wind power abnormal data identification method and identification device

Also Published As

Publication number Publication date
CN108412710A (en) 2018-08-17

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant