CN105069476A - Method for identifying abnormal wind power data based on two-stage integration learning - Google Patents

Method for identifying abnormal wind power data based on two-stage integration learning

Info

Publication number
CN105069476A
Authority
CN
China
Prior art keywords
wind
electricity generation
powered electricity
model
random forest
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510484365.0A
Other languages
Chinese (zh)
Other versions
CN105069476B (en)
Inventor
耿天翔
丁茂生
李峰
葛俊
胡伟
郑乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
State Grid Corp of China SGCC
State Grid Ningxia Electric Power Co Ltd
Original Assignee
Tsinghua University
State Grid Corp of China SGCC
State Grid Ningxia Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University, State Grid Corp of China SGCC, State Grid Ningxia Electric Power Co Ltd filed Critical Tsinghua University
Priority to CN201510484365.0A priority Critical patent/CN105069476B/en
Publication of CN105069476A publication Critical patent/CN105069476A/en
Application granted granted Critical
Publication of CN105069476B publication Critical patent/CN105069476B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Wind Motors (AREA)

Abstract

The present invention discloses a method for identifying abnormal wind power data based on two-stage integration learning. The method comprises the following steps of: S1: extracting abnormal wind power data parameters; S2: generating a training sample and a testing sample according to the abnormal wind power data parameters; S3: training the training sample by using a random forest to obtain a random forest model; S4: according to the random forest model, training the training sample by using a gradient iteration decision-making tree to obtain a gradient iteration decision-making tree model; and S5: according to the random forest model and the gradient iteration decision-making tree model, predicting the testing sample to obtain a prediction result. The method has the following advantage: the identification accuracy of the abnormal wind power data is improved.

Description

Method for identifying abnormal wind power data based on two-stage ensemble learning
Technical field
The invention belongs to the field of wind power and specifically relates to a method for identifying abnormal wind power data based on two-stage ensemble learning.
Background
With the large-scale development of wind power generation, wind power is shifting from a small-scale, supplementary power source to a large-scale, important one. Research on wind power, such as wind power forecasting and wind power accommodation, requires high-quality wind power data. New technical methods and means are urgently needed to analyze the characteristics of wind power data and to study the identification and causes of abnormal wind power data; improving wind power data quality lays the foundation for subsequent research. Wind power systems have accumulated large amounts of measured and simulated data, but the quality of the underlying data is generally low. Data mining methods can therefore be used to discover the patterns of abnormal wind power data and then to preprocess the raw data, improving raw data quality. The most common data mining methods are clustering and classification. How to select and combine suitable methods and models so as to improve the accuracy of abnormal wind power data identification remains a difficult problem.
Summary of the invention
The present invention aims to solve at least one of the above technical problems.
To this end, the object of the invention is to provide a method for identifying abnormal wind power data based on two-stage ensemble learning.
To achieve this object, an embodiment of a first aspect of the invention discloses a method for identifying abnormal wind power data based on two-stage ensemble learning, comprising the following steps: S1: extracting abnormal wind power data parameters; S2: generating training samples and test samples according to the abnormal wind power data parameters; S3: training on the training samples with a random forest to obtain a random forest model; S4: according to the random forest model, training on the training samples with a gradient boosting decision tree (GBDT) to obtain a GBDT model; and S5: predicting the test samples with the random forest model and the GBDT model respectively to obtain a prediction result.
The method for identifying abnormal wind power data based on two-stage ensemble learning according to the embodiment of the invention improves the accuracy of abnormal wind power data identification.
In addition, the method according to the above embodiment of the invention may also have the following additional technical features:
Further, the abnormal wind power data parameters comprise: wind speed, wind power, the rates of change of wind speed and wind power over time, the outlier coefficient of each sample point, and quantile statistics of the sample points.
Further, step S2 further comprises: dividing the training samples and the test samples by time interval in the historical record of wind power anomalies.
Further, step S3 further comprises: S301: training on the training samples using the original label values; S302: adjusting the positive-to-negative ratio of the training samples and the model parameters to obtain the random forest model.
Further, step S4 further comprises: using the output of the random forest model as the target value of the training samples, training on the training samples with a gradient boosting decision tree and adjusting the model parameters to obtain the GBDT model.
Further, step S5 further comprises: S501: predicting the test samples with the random forest model to obtain a first intermediate prediction value, and predicting the test samples with the GBDT model to obtain a second intermediate prediction value; S502: averaging the first intermediate prediction value and the second intermediate prediction value to obtain the prediction result.
Additional aspects and advantages of the invention will be given in part in the following description, and in part will become apparent from the following description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the invention will become apparent and easy to understand from the following description of the embodiments in conjunction with the drawings, in which:
Fig. 1 is a flow block diagram of training the GBDT model according to one embodiment of the invention;
Fig. 2 is a schematic flow chart of obtaining test results from the test samples according to one embodiment of the invention.
Detailed description
Embodiments of the invention are described in detail below. Examples of the embodiments are shown in the drawings, in which the same or similar reference numerals denote, throughout, the same or similar elements or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the invention and cannot be construed as limiting it.
In the description of the invention, it should be understood that orientation or position terms such as "center", "longitudinal", "transverse", "up", "down", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", and "outer" indicate orientations or positional relationships based on those shown in the drawings. They are used only to facilitate and simplify the description and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they therefore cannot be construed as limiting the invention. In addition, the terms "first" and "second" are used for descriptive purposes only and cannot be construed as indicating or implying relative importance.
In the description of the invention, it should be noted that, unless otherwise expressly specified and limited, the terms "mounted", "connected", and "coupled" are to be interpreted broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct, indirect through an intermediary, or an internal connection between two elements. A person of ordinary skill in the art can understand the specific meaning of these terms in the invention according to the specific situation.
These and other aspects of the embodiments of the invention will become clear with reference to the following description and drawings. The description and drawings specifically disclose some particular implementations of the embodiments to show some ways of implementing the principles of the embodiments, but it should be understood that the scope of the embodiments is not limited thereby. On the contrary, the embodiments include all changes, modifications, and equivalents that fall within the spirit and scope of the appended claims.
The method for identifying abnormal wind power data based on two-stage ensemble learning according to the embodiment of the invention is described below with reference to the drawings.
The output of the first-stage model serves as the input of the second-stage model, so that sample points misclassified in the first stage are corrected in the second stage, which improves the overall accuracy of the model. Specifically, the method proceeds according to the following steps:
Step (1): extract the features related to abnormal wind power data.
The feature vector typically consists of wind speed, wind power, the rates of change of wind speed and wind power over time, the local outlier factor (LOF) of each sample point, and quantile statistics of the sample points.
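The feature vector of step (1) can be sketched in plain Python as follows. This is only an illustrative sketch under stated assumptions: the function names, the sampling interval `dt`, the neighbour count `k`, the simplified O(n^2) LOF (exactly k neighbours, no tie handling), and the encoding of the quantile statistic as a quartile bin of wind power are all hypothetical choices that the text does not specify.

```python
from statistics import quantiles

def rate_of_change(values, dt=1.0):
    """First difference per unit time; the first sample gets 0.0."""
    return [0.0] + [(b - a) / dt for a, b in zip(values, values[1:])]

def local_outlier_factor(points, k=3):
    """Simplified O(n^2) LOF over small lists of numeric tuples (needs n > k)."""
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
    n = len(points)
    # k nearest neighbours of each point as (distance, index), excluding itself
    knn = [sorted((dist(points[i], points[j]), j) for j in range(n) if j != i)[:k]
           for i in range(n)]
    k_dist = [nb[-1][0] for nb in knn]
    # local reachability density: inverse of the mean reachability distance
    lrd = []
    for nb in knn:
        mean_reach = sum(max(k_dist[j], d) for d, j in nb) / len(nb)
        lrd.append(1.0 / (mean_reach + 1e-12))
    # LOF: mean density of the neighbours relative to the point's own density
    return [sum(lrd[j] for _, j in knn[i]) / (len(knn[i]) * lrd[i]) for i in range(n)]

def build_features(speed, power, dt=1.0, k=3):
    """One row per sample: speed, power, both rates of change, LOF, quartile bin."""
    d_speed = rate_of_change(speed, dt)
    d_power = rate_of_change(power, dt)
    lof = local_outlier_factor(list(zip(speed, power)), k)
    q1, q2, q3 = quantiles(power, n=4)
    rows = []
    for i in range(len(speed)):
        # number of power quartile cut points lying below the sample (0 to 3)
        q_bin = sum(power[i] > q for q in (q1, q2, q3))
        rows.append([speed[i], power[i], d_speed[i], d_power[i], lof[i], q_bin])
    return rows
```

In practice wind speed and power would probably be scaled to comparable ranges before computing distances, since otherwise the power axis dominates the LOF; that preprocessing is left out of the sketch.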
Step (2): generate the training samples and test samples.
The known historical records are divided into training samples and test samples by time interval. The input of each sample is the feature vector extracted in step (1). The output is a label indicating whether the data is abnormal: for example, 1 represents normal and 0 represents abnormal.
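A time-based split of this kind can be sketched as below. The record layout `(timestamp, features, label)` and the single cutoff time are assumptions made for illustration; the text only states that the division is made by time interval.

```python
from datetime import datetime

def split_by_time(records, cutoff):
    """Split (timestamp, features, label) records at a cutoff time.

    Records before the cutoff become training samples, the rest test samples.
    Labels follow the convention above: 1 = normal, 0 = abnormal.
    """
    train = [r for r in records if r[0] < cutoff]
    test = [r for r in records if r[0] >= cutoff]
    return train, test

def to_xy(samples):
    """Separate feature vectors and labels for model training."""
    X = [features for _, features, _ in samples]
    y = [label for _, _, label in samples]
    return X, y
```

Splitting by time rather than at random keeps later records out of the training set, so the test samples mimic data the model has not yet seen.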
Step (3): train on the sample data with RF (random forest).
Train on the sample data using the original label values y, and adjust the positive-to-negative ratio of the training samples and the model parameters to obtain the optimal RF model; the model output is y – y_RF.
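The first stage can be illustrated with a small pure-Python stand-in. Note that this is not a full random forest: each "tree" is a single-split stump trained on a bootstrap sample with a random feature subset, which keeps the sketch short. The undersampling rule in `rebalance`, the ratio parameter, the tree count, and all function names are hypothetical; the text does not say how the class ratio or the model parameters are tuned.

```python
import random

def rebalance(X, y, normal_per_abnormal, rng):
    """Undersample normal samples (label 1) to a chosen ratio per abnormal (label 0)."""
    abnormal = [i for i, label in enumerate(y) if label == 0]
    normal = [i for i, label in enumerate(y) if label == 1]
    keep = min(len(normal), int(normal_per_abnormal * len(abnormal)))
    idx = abnormal + rng.sample(normal, keep)
    rng.shuffle(idx)
    return [X[i] for i in idx], [y[i] for i in idx]

def fit_stump(X, t, features):
    """Best single split (least squared error) over the given feature indices."""
    best = None
    for f in features:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [ti for row, ti in zip(X, t) if row[f] <= thr]
            right = [ti for row, ti in zip(X, t) if row[f] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if best is None or err < best[0]:
                best = (err, f, thr, lm, rm)
    if best is None:  # no split possible: fall back to a constant leaf
        mean = sum(t) / len(t)
        return (features[0], float("inf"), mean, mean)
    return best[1:]

def stump_predict(stump, row):
    f, thr, left_value, right_value = stump
    return left_value if row[f] <= thr else right_value

def fit_forest(X, y, n_trees=25, seed=0):
    """Bagged stumps with random feature subsets (a random-forest stand-in)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    m = max(1, int(d ** 0.5))  # features tried per tree
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        feats = rng.sample(range(d), m)             # random feature subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def forest_predict(forest, row):
    """Average leaf value over trees: a score in [0, 1], higher = more likely normal."""
    return sum(stump_predict(s, row) for s in forest) / len(forest)
```

A production setup would more likely use an established random forest implementation and choose the class ratio and tree parameters by validation; the sketch only shows where those two adjustments enter.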
Step (4): train on the sample data with GBDT (gradient boosting decision tree).
Use the output y – y_RF of step (3) as the sample target value, train on the sample data again with GBDT, and adjust the model parameters to obtain the optimal GBDT model.
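The second stage can be sketched as least-squares gradient boosting over single-split regression stumps. The `target` argument stands for whatever quantity step (3) hands over as the sample target value; the sketch fits it as a regression target and takes no position on its exact definition. The stump weak learner, the number of rounds, and the learning rate are illustrative assumptions, not values from the text.

```python
def fit_stump(X, t):
    """Best single-split regression stump (least squared error) over all features."""
    best = None
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [ti for row, ti in zip(X, t) if row[f] <= thr]
            right = [ti for row, ti in zip(X, t) if row[f] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if best is None or err < best[0]:
                best = (err, f, thr, lm, rm)
    if best is None:  # all rows identical: constant leaf
        mean = sum(t) / len(t)
        return (0, float("inf"), mean, mean)
    return best[1:]

def stump_predict(stump, row):
    f, thr, left_value, right_value = stump
    return left_value if row[f] <= thr else right_value

def fit_gbdt(X, target, n_rounds=50, learning_rate=0.1):
    """Boosting for squared loss: each stump fits the current residuals."""
    base = sum(target) / len(target)  # initial constant prediction
    pred = [base] * len(X)
    stumps = []
    for _ in range(n_rounds):
        # for squared loss the negative gradient is simply the residual
        residual = [t - p for t, p in zip(target, pred)]
        stump = fit_stump(X, residual)
        stumps.append(stump)
        pred = [p + learning_rate * stump_predict(stump, row)
                for p, row in zip(pred, X)]
    return base, learning_rate, stumps

def gbdt_predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)
```

Because each round fits the residual left by the previous rounds, the training error on the target shrinks as rounds are added; the learning rate and round count trade fitting speed against overfitting.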
Step (5): predict the test samples with the RF and GBDT models respectively.
Predicting the test samples with the RF model obtained in step (3) gives the test result y_RF; predicting the test samples with the GBDT model obtained in step (4) gives the test result y_GBDT. The final prediction result is the mean of the two, i.e. y_predict = (y_RF + y_GBDT) / 2.
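The averaging in step (5) is a one-line computation. In the sketch below, the 0.5 decision threshold used to turn the averaged score into a normal/abnormal label is an assumption; the text gives only the averaging formula.

```python
def combine(y_rf, y_gbdt, threshold=0.5):
    """Average the two model outputs and derive labels (1 = normal, 0 = abnormal)."""
    y_predict = [(a + b) / 2 for a, b in zip(y_rf, y_gbdt)]
    labels = [1 if p >= threshold else 0 for p in y_predict]
    return y_predict, labels
```

For example, `combine([0.25, 1.0], [0.5, 0.75])` averages to scores of 0.375 and 0.875, so the first sample would be flagged as abnormal and the second as normal under the assumed threshold.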
In addition, other components and functions of the method for identifying abnormal wind power data based on two-stage ensemble learning according to the embodiment of the invention are known to those skilled in the art and, to reduce redundancy, are not described further.
In this specification, a description referring to the terms "an embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. Such schematic expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.
Although embodiments of the invention have been shown and described, a person of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principle and spirit of the invention; the scope of the invention is defined by the claims and their equivalents.

Claims (6)

1. A method for identifying abnormal wind power data based on two-stage ensemble learning, characterized by comprising the following steps:
S1: extracting abnormal wind power data parameters;
S2: generating training samples and test samples according to the abnormal wind power data parameters;
S3: training on the training samples with a random forest to obtain a random forest model;
S4: according to the random forest model, training on the training samples with a gradient boosting decision tree (GBDT) to obtain a GBDT model; and
S5: predicting the test samples with the random forest model and the GBDT model respectively to obtain a prediction result.
2. The method for identifying abnormal wind power data based on two-stage ensemble learning according to claim 1, characterized in that the abnormal wind power data parameters comprise: wind speed, wind power, the rates of change of wind speed and wind power over time, the outlier coefficient of each sample point, and quantile statistics of the sample points.
3. The method for identifying abnormal wind power data based on two-stage ensemble learning according to claim 2, characterized in that step S2 further comprises:
dividing the training samples and the test samples by time interval in the historical record of wind power anomalies.
4. The method for identifying abnormal wind power data based on two-stage ensemble learning according to claim 3, characterized in that step S3 further comprises:
S301: training on the training samples using the original label values;
S302: adjusting the positive-to-negative ratio of the training samples and the model parameters to obtain the random forest model.
5. The method for identifying abnormal wind power data based on two-stage ensemble learning according to claim 4, characterized in that step S4 further comprises:
using the output of the random forest model as the target value of the training samples, training on the training samples with a gradient boosting decision tree and adjusting the model parameters to obtain the GBDT model.
6. The method for identifying abnormal wind power data based on two-stage ensemble learning according to claim 5, characterized in that step S5 further comprises:
S501: predicting the test samples with the random forest model to obtain a first intermediate prediction value, and predicting the test samples with the GBDT model to obtain a second intermediate prediction value;
S502: averaging the first intermediate prediction value and the second intermediate prediction value to obtain the prediction result.
CN201510484365.0A 2015-08-10 2015-08-10 Method for identifying abnormal wind power data based on two-stage ensemble learning Expired - Fee Related CN105069476B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510484365.0A CN105069476B (en) 2015-08-10 2015-08-10 Wind-powered electricity generation disorder data recognition method based on two stages integrated study

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510484365.0A CN105069476B (en) 2015-08-10 2015-08-10 Wind-powered electricity generation disorder data recognition method based on two stages integrated study

Publications (2)

Publication Number Publication Date
CN105069476A true CN105069476A (en) 2015-11-18
CN105069476B CN105069476B (en) 2018-12-11

Family

ID=54498837

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510484365.0A Expired - Fee Related CN105069476B (en) 2015-08-10 2015-08-10 Method for identifying abnormal wind power data based on two-stage ensemble learning

Country Status (1)

Country Link
CN (1) CN105069476B (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106909933A (en) * 2017-01-18 2017-06-30 南京邮电大学 A kind of stealing classification Forecasting Methodology of three stages various visual angles Fusion Features
WO2017129030A1 (en) * 2016-01-29 2017-08-03 阿里巴巴集团控股有限公司 Disk failure prediction method and apparatus
CN107179503A (en) * 2017-04-21 2017-09-19 美林数据技术股份有限公司 The method of Wind turbines intelligent fault diagnosis early warning based on random forest
CN107392304A (en) * 2017-08-04 2017-11-24 中国电力科学研究院 A kind of Wind turbines disorder data recognition method and device
CN109386435A (en) * 2017-08-04 2019-02-26 阿里巴巴集团控股有限公司 Wind turbine failure monitoring method, device and system
CN109426902A (en) * 2017-08-25 2019-03-05 北京国双科技有限公司 Enterprise Integrated evaluating method and device
CN109840312A (en) * 2019-01-22 2019-06-04 新奥数能科技有限公司 A kind of rejecting outliers method and apparatus of boiler load factor-efficiency curve
CN110378739A (en) * 2019-07-23 2019-10-25 中国联合网络通信集团有限公司 A kind of data traffic matching process and device
CN110685857A (en) * 2019-10-16 2020-01-14 湘潭大学 Mountain wind turbine generator behavior prediction model based on ensemble learning
CN111343127A (en) * 2018-12-18 2020-06-26 北京数安鑫云信息技术有限公司 Method, device, medium and equipment for improving crawler recognition recall rate
CN114519225A (en) * 2022-01-20 2022-05-20 南水北调中线干线工程建设管理局 Robust underdrain structure stress anomaly identification method and system
CN116664096A (en) * 2022-08-23 2023-08-29 国家电投集团科学技术研究院有限公司 Wind power bolt data processing method and device based on federal learning

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070260563A1 (en) * 2006-04-17 2007-11-08 International Business Machines Corporation Method to continuously diagnose and model changes of real-valued streaming variables
CN103207893A (en) * 2013-03-13 2013-07-17 北京工业大学 Classification method of two types of texts on basis of vector group mapping

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ALVARO ALONSO等: "Random Forests and Gradient Boosting for Wind Energy Prediction", 《INTERNATIONAL CONFERENCE ON HYBRID ARTIFICIAL INTELLIGENCE SYSTEMS》 *
赵永宁等: "风电场弃风异常数据簇的特征及处理方法", 《电力系统自动化》 *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017129030A1 (en) * 2016-01-29 2017-08-03 阿里巴巴集团控股有限公司 Disk failure prediction method and apparatus
CN106909933B (en) * 2017-01-18 2018-05-18 南京邮电大学 A kind of stealing classification Forecasting Methodology of three stages various visual angles Fusion Features
CN106909933A (en) * 2017-01-18 2017-06-30 南京邮电大学 A kind of stealing classification Forecasting Methodology of three stages various visual angles Fusion Features
CN107179503B (en) * 2017-04-21 2020-07-07 美林数据技术股份有限公司 Wind turbine generator fault intelligent diagnosis and early warning method based on random forest
CN107179503A (en) * 2017-04-21 2017-09-19 美林数据技术股份有限公司 The method of Wind turbines intelligent fault diagnosis early warning based on random forest
CN107392304A (en) * 2017-08-04 2017-11-24 中国电力科学研究院 A kind of Wind turbines disorder data recognition method and device
CN109386435A (en) * 2017-08-04 2019-02-26 阿里巴巴集团控股有限公司 Wind turbine failure monitoring method, device and system
CN109426902A (en) * 2017-08-25 2019-03-05 北京国双科技有限公司 Enterprise Integrated evaluating method and device
CN111343127B (en) * 2018-12-18 2021-03-16 北京数安鑫云信息技术有限公司 Method, device, medium and equipment for improving crawler recognition recall rate
CN111343127A (en) * 2018-12-18 2020-06-26 北京数安鑫云信息技术有限公司 Method, device, medium and equipment for improving crawler recognition recall rate
CN109840312A (en) * 2019-01-22 2019-06-04 新奥数能科技有限公司 A kind of rejecting outliers method and apparatus of boiler load factor-efficiency curve
CN109840312B (en) * 2019-01-22 2022-11-29 新奥数能科技有限公司 Abnormal value detection method and device for boiler load rate-energy efficiency curve
CN110378739A (en) * 2019-07-23 2019-10-25 中国联合网络通信集团有限公司 A kind of data traffic matching process and device
CN110378739B (en) * 2019-07-23 2022-03-29 中国联合网络通信集团有限公司 Data traffic matching method and device
CN110685857A (en) * 2019-10-16 2020-01-14 湘潭大学 Mountain wind turbine generator behavior prediction model based on ensemble learning
CN114519225A (en) * 2022-01-20 2022-05-20 南水北调中线干线工程建设管理局 Robust underdrain structure stress anomaly identification method and system
CN116664096A (en) * 2022-08-23 2023-08-29 国家电投集团科学技术研究院有限公司 Wind power bolt data processing method and device based on federal learning
CN116664096B (en) * 2022-08-23 2024-02-13 国家电投集团科学技术研究院有限公司 Wind power bolt data processing method and device based on federal learning

Also Published As

Publication number Publication date
CN105069476B (en) 2018-12-11

Similar Documents

Publication Publication Date Title
CN105069476A (en) Method for identifying abnormal wind power data based on two-stage integration learning
CN108869173B (en) Power control method and equipment for wind turbine generator
CN103400052B (en) Combined method for predicting short-term wind speed in wind power plant
US9664176B2 (en) Method of automatically calculating power curve limit for power curve monitoring of wind turbine
CN103488869A (en) Wind power generation short-term load forecast method of least squares support vector machine
CN103597417B (en) state monitoring method and device
CN108876163B (en) Transient state power angle stability rapid evaluation method integrating causal analysis and machine learning
CN112508442B (en) Transient stability assessment method and system based on automatic and interpretable machine learning
CN104732296A (en) Modeling method for distributed photovoltaic output power short-term prediction model
CN104252649A (en) Regional wind power output prediction method based on correlation between multiple wind power plants
KR101660102B1 (en) Apparatus for water demand forecasting
EP2541052B1 (en) Controlling a wind turbine using a neural network function
CN116227538B (en) Clustering and deep learning-based low-current ground fault line selection method and equipment
CN108334988A (en) A kind of short-term Load Forecasting based on SVM
CN106682159A (en) Threshold configuration method
CN103699947A (en) Meta learning-based combined prediction method for time-varying nonlinear load of electrical power system
CN112598148A (en) Fan variable pitch motor temperature fault early warning method based on collaborative expression and LightGBM algorithm
CN116505556B (en) Wind farm power control system and method based on primary frequency modulation
CN105809286A (en) Incremental SVR load prediction method based on representative data reconstruction
Liu et al. Dynamic security assessment of western Danish power system based on ensemble decision trees
Khosravi et al. Wind farm power uncertainty quantification using a mean-variance estimation method
Tidriri et al. Data-driven decision-making methodology for prognostic and health management of wind turbines
CN104537437A (en) Power equipment state maintaining predicting method based on genetic algorithm
Wu et al. Multi-step wind power forecast based on similar segments extracted by mathematical morphology
CN106527138B (en) Photovoltaic inverter direct current side resistance parameter fluctuation coefficient prediction method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20181211

Termination date: 20190810
