CN114154679A - Spark-based PCFOA-KELM wind power prediction method and device - Google Patents

Info

Publication number
CN114154679A
Authority
CN
China
Prior art keywords
wind power
data
value
kelm
spark
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111232892.4A
Other languages
Chinese (zh)
Other versions
CN114154679B (en
Inventor
经正俊
齐刚
宋坤
张世磊
马驰源
吉书强
周安
倪晓锋
王文贵
管超
甘露平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Nanzi Huadun Digital Technology Co ltd
Original Assignee
Nanjing Huadun Power Information Security Evaluation Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Huadun Power Information Security Evaluation Co Ltd
Priority to CN202111232892.4A
Publication of CN114154679A
Application granted
Publication of CN114154679B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E40/00 Technologies for an efficient electrical power generation, transmission or distribution
    • Y02E40/70 Smart grids as climate change mitigation technology in the energy generation sector
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Strategic Management (AREA)
  • Data Mining & Analysis (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Primary Health Care (AREA)
  • Water Supply & Treatment (AREA)
  • Public Health (AREA)
  • Quality & Reliability (AREA)
  • Physiology (AREA)
  • Genetics & Genomics (AREA)
  • Operations Research (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Development Economics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a Spark-based PCFOA-KELM wind power prediction method and device. The method comprises the following steps: acquiring meteorological index data that influence the wind power of a single wind turbine in a wind farm; preprocessing the meteorological index data on a Spark platform; inputting the preprocessed meteorological index data into a pre-trained KELM model corresponding to that wind turbine, the KELM model outputting a predicted wind power value for the turbine; and summing the predicted wind power values of all wind turbines in the wind farm to obtain the predicted total power of the wind farm. The KELM model is trained on the Spark platform, using as training samples the meteorological indexes that influence the wind power of a single wind turbine together with the corresponding historical wind power data, and its parameters are optimized with the PCFOA algorithm. The method predicts wind power efficiently and accurately while also improving running speed.

Description

Spark-based PCFOA-KELM wind power prediction method and device
Technical Field
The invention relates to a Spark-based PCFOA-KELM wind power prediction method and device, where PCFOA denotes the parallel chaotic fruit fly optimization algorithm and KELM the kernel extreme learning machine, and belongs to the field of wind power generation.
Background
Wind power generation is influenced by meteorological factors, and the output power of a wind power system is volatile, random and intermittent, varying with season and geographic location. High-precision wind power prediction is the basis for evaluating the operating state of a wind farm, an important basis for the planning, design and dispatching of the power grid, and of great significance for guaranteeing the safe and stable operation of the grid.
Many universities, enterprises and research institutes are studying wind power prediction technology. The prediction methods proposed so far can be roughly divided into physical methods and statistical methods.
The physical method starts from data such as numerical weather prediction (NWP) and terrain information, obtains data such as wind speed through a simulation model (for example a micro-scale meteorological model or a computational fluid dynamics (CFD) model), and then combines these with the actual power curve to obtain a prediction. This method suits wind farms with complex terrain and does not need a large amount of data, but its calculation is very complex, and its prediction accuracy is hard to guarantee because the NWP resolution rarely meets the requirement. The statistical method builds a prediction model from historical data using machine learning, such as a neural network, a support vector machine or a time series method. This method can adapt to the location of the wind farm and reduce errors adaptively, but it needs a large amount of historical data and suffers from slow training and parameter optimization.
With the arrival of the Industry 4.0 era, big data technology is increasingly applied in intelligent power plants. Because wind power prediction involves large data volumes and complex data types, researchers have proposed applying big data technology to process large-scale data sets, reduce training and prediction time, and improve prediction accuracy.
Existing schemes for modeling on a big data platform include: processing and analyzing the wind power prediction input data on a Hadoop platform, then modeling and predicting with a combined model built from a BP neural network and a support vector machine (SVM); and establishing a parallel SVM model on a Hadoop platform, performing SVM optimization and power prediction through parallel computation.
The prior art has the following shortcomings. First, the traditional physical method has high algorithmic complexity and very complex calculation, and its prediction accuracy depends largely on the quality of the meteorological forecast, so its accuracy needs improvement; the statistical method must process a large amount of historical data, and its training and optimization time is so long that it is difficult to meet the timeliness required for prediction in practical applications. Second, algorithms that use big data technology for wind power prediction do not fully exploit the technical advantages of the big data platform: the big data technology is applied only to data preprocessing or only to parallel model computation. In addition, traditional wind power prediction models the whole wind farm; as wind farms expand, meteorological and geographic conditions differ between turbine locations, so a prediction based on a single model of the whole farm is not accurate enough.
Disclosure of Invention
The invention aims to provide a Spark-based PCFOA-KELM wind power prediction method and device that predict wind power quickly and accurately, improving running speed while maintaining prediction accuracy.
To achieve this purpose, the invention adopts the following technical scheme:
In one aspect, a Spark-based PCFOA-KELM wind power prediction method comprises the following steps:
acquiring meteorological index data that influence the wind power of a single wind turbine in a wind farm;
preprocessing the meteorological index data on a Spark platform;
inputting the preprocessed meteorological index data into a pre-trained KELM model corresponding to that wind turbine, the KELM model outputting a predicted wind power value for the turbine;
summing the predicted wind power values of all wind turbines in the wind farm to obtain the predicted total power of the wind farm;
wherein the KELM model is trained on the Spark platform, using as training samples the meteorological indexes that influence the wind power of a single wind turbine in a wind farm together with the corresponding historical wind power data, and its parameters are optimized with the PCFOA algorithm.
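The per-turbine prediction and summation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `predict_farm_power`, the turbine identifiers and the toy lambda models are all hypothetical stand-ins for trained KELM models.

```python
from typing import Callable, Dict, Sequence

# Hypothetical type: a trained per-turbine model maps a preprocessed
# feature vector to a predicted power value.
Model = Callable[[Sequence[float]], float]


def predict_farm_power(models: Dict[str, Model],
                       features: Dict[str, Sequence[float]]) -> float:
    """Sum the per-turbine predictions to obtain the farm-level prediction."""
    total = 0.0
    for turbine_id, model in models.items():
        total += model(features[turbine_id])
    return total


# Toy stand-ins for trained KELM models (illustrative only).
models = {"T1": lambda x: 100.0 * x[0], "T2": lambda x: 80.0 * x[0]}
features = {"T1": [0.5], "T2": [0.25]}
farm_power = predict_farm_power(models, features)
print(farm_power)
```

Keeping one model per turbine means each model sees only the meteorological conditions at its own location, which is the motivation the text gives for not modeling the farm as a whole.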
Further, the training method of the KELM model comprises the following steps:
acquiring historical meteorological index data that influence the wind power of a single wind turbine in a wind farm, together with the corresponding historical wind power data;
preprocessing the acquired historical data on a Spark platform;
establishing feature vectors from the preprocessed data and dividing them into training samples and test samples;
training the KELM model with the training samples, and optimizing the regularization coefficient and kernel parameter of the KELM model with the PCFOA algorithm to obtain an optimized KELM model;
and testing the prediction performance of the optimized KELM model with the test samples.
Further, preprocessing the historical data on a Spark platform includes:
storing the historical data in an RDD (Resilient Distributed Dataset) to generate an RDD data set, dividing the RDD data set into a plurality of sub-data sets, preprocessing each sub-data set separately, and distributing the preprocessing tasks for the sub-data sets to a corresponding plurality of executors for parallel execution.
Further, the preprocessing comprises: data outlier processing, data missing value processing, data denoising and normalization.
Further, the data missing value processing comprises:
for sporadic missing data, using the average of the data values before and after the missing point as the value of the missing point;
for a small amount of missing data, using interpolation, with the interpolation formula as follows:
$$P_{n+i} = P_n + \frac{i}{j}\left(P_{n+j} - P_n\right), \quad 0 < i < j$$
where $P_n$, $P_{n+i}$ and $P_{n+j}$ are the values of the index at times $n$, $n+i$ and $n+j$, respectively;
for a large amount of missing data, discarding the data for that period.
Further, the data denoising comprises:
performing wavelet decomposition of the acquired index values using the Haar wavelet basis function;
given that the wavelet coefficients of the normal data signal are larger than those of the noise, separating the noise from the normal data with a set threshold according to the following formula:
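$$\omega_\lambda = \begin{cases} \operatorname{sgn}(\omega)\,(|\omega| - \lambda), & |\omega| \ge \lambda \\ 0, & |\omega| < \lambda \end{cases}$$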
where $\lambda$ is the wavelet coefficient threshold, $\omega$ is the wavelet coefficient, $\omega_\lambda$ is the denoised wavelet coefficient, and sgn is the sign function;
and reconstructing the data after the noise is removed.
Further, the meteorological indexes include wind speed, wind direction, air pressure, temperature and humidity, and the normalization includes:
normalizing the wind speed, air pressure and humidity by their maximum and minimum values according to the following formula:
$$value = \frac{value_t - value_{min}}{value_{max} - value_{min}}$$
where $value$ is the normalized result, in the range 0 to 1, and $value_{max}$, $value_{min}$ and $value_t$ are the maximum, minimum and current values in the index data set, respectively;
normalizing the temperature according to the following formula:
Figure BDA0003316543620000053
where $T$ is the normalized temperature value and $T_t$ is the current temperature value;
normalizing the wind direction to between 0 and 1 using a combination of its sine and cosine values.
Further, optimizing the regularization coefficient and kernel parameter of the KELM model with the PCFOA algorithm includes:
dividing a fruit fly population of size N into N independent parallel units through the RDD of the Spark platform, where each fruit fly computes its taste concentration value from a fitness function and searches locally from its current position for an updated position, the taste concentration computation and the local position search being performed in parallel.
Further, optimizing the regularization coefficient and kernel parameter of the KELM model with the PCFOA algorithm includes:
S1, initializing the fruit fly population size N, the maximum number of iterations max, the search step length $l$, the initial position $(x_0, y_0)$ and the chaotic parameter $\mu$;
S2, dividing the fruit fly population into N independent parallel computing units;
S3, starting iterative optimization; for each parallel computing unit, in the first iteration the random distance over which the individual fruit fly searches for food by smell is given by the following formula:
Figure BDA0003316543620000061
after the first iteration, subsequent iterations use the optimal solution information and chaotize it with a two-dimensional Logistic chaotic map to obtain new chaotic step lengths $l_x$ and $l_y$, calculated as follows:
Figure BDA0003316543620000062
Figure BDA0003316543620000063
where $\mu \in (0, 2.28)$ and $x_n, y_n \in (0, 1)$;
S4, calculating the distance $d_i$ from the current position of the individual fruit fly to the origin and taking its reciprocal as the taste concentration judgment value $s_i$ of the individual, calculated as follows:
$$d_i = \sqrt{x_i^2 + y_i^2}$$
$$s_i = \frac{1}{d_i}$$
S5, substituting the taste concentration judgment value $s_i$ into the taste concentration judgment function to calculate the taste concentration value of the individual fruit fly at its current position;
S6, finding the fruit fly with the optimal taste concentration value among all individuals and recording its taste concentration value and the corresponding position;
S7, retaining the optimal taste concentration value $s$ and its coordinates $(x_{index}, y_{index})$, and flying by vision to the optimal position recorded in step S6 to update the position, forming a new fruit fly population center:
$$x_0 = x_{index}, \quad y_0 = y_{index}$$
$$s_{best} = s$$
S8, repeating steps S2 to S6 and judging whether the optimal taste concentration value is better than the historical optimal taste concentration value; if so, and if the current iteration number is less than max, executing step S7.
In another aspect, a Spark-based PCFOA-KELM wind power prediction device includes:
an acquisition module configured to acquire meteorological index data that influence the wind power of a single wind turbine in a wind farm;
a data preprocessing module configured to preprocess the meteorological index data on a Spark platform;
a prediction module configured to input the preprocessed meteorological index data into a pre-trained KELM model corresponding to that wind turbine, the KELM model outputting a predicted wind power value for the turbine;
a calculation module configured to sum the predicted wind power values of all wind turbines in the wind farm to obtain the predicted total power of the wind farm;
wherein the KELM model is trained on the Spark platform, using as training samples the meteorological indexes that influence the wind power of a single wind turbine in a wind farm together with the corresponding historical wind power data, and its parameters are optimized with the PCFOA algorithm.
Compared with the prior art, the invention has the following beneficial technical effects:
The method makes full use of the ability of the Spark big data platform to analyze and process data in parallel: it preprocesses the acquired data efficiently on the Spark platform and provides a parallel parameter optimization method, PCFOA, which speeds up data preprocessing and optimization modeling. It also exploits the properties of KELM, which generalizes better than machine learning algorithms such as ELM and SVM and computes faster at better or similar prediction accuracy; optimizing the KELM parameters with PCFOA on the Spark big data platform effectively improves wind power prediction accuracy and running speed.
Drawings
Fig. 1 is a flow chart of a PCFOA-KELM wind power prediction method based on Spark according to an embodiment of the present invention.
Detailed Description
The invention is further described with reference to specific examples. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
As mentioned above, for wind farm power prediction the traditional physical method has high algorithmic complexity and very complex calculation, and its prediction accuracy depends largely on the quality of the meteorological forecast; the statistical method must process a large amount of historical data, and its training and optimization time is so long that it is difficult to meet the timeliness required for prediction; and algorithms that use big data technology for wind power prediction do not fully exploit the technical advantages of the big data platform.
Therefore, the invention provides a wind power algorithm in which the parallel chaotic fruit fly optimization algorithm (PCFOA) optimizes a parallel kernel extreme learning machine (KELM). Each wind turbine is modeled and predicted individually, and the sum of the predicted power of all wind turbines is taken as the total output of the wind farm, which improves wind power prediction accuracy. The functional characteristics of the Spark platform are fully used: Spark handles both data processing and analysis and parallel model computation, which effectively improves prediction accuracy and running speed.
In one embodiment, a Spark-based PCFOA-KELM wind power prediction method, as shown in Fig. 1, includes:
Step S1, acquiring historical meteorological index data that influence the wind power of a single wind turbine in a wind farm, together with the corresponding historical wind power data;
Here, meteorological indexes such as wind speed V, wind direction D, air pressure P, temperature T and humidity H are taken as the factors influencing wind power.
Step S2, preprocessing the acquired historical data on a Spark platform;
Because uncertain factors in actual measurement and acquisition may cause abnormal or missing data, the index data acquired in step S1 must be preprocessed.
In this embodiment, data preprocessing is performed on the Spark platform.
Spark is a big data parallel computing framework based on in-memory computing. It is built on a unified abstraction, the RDD (Resilient Distributed Dataset), which improves the real-time performance of data processing in a big data environment while guaranteeing high fault tolerance and high scalability, and it allows users to deploy Spark on a large amount of inexpensive hardware to form a cluster. The RDD is a fault-tolerant, parallel data structure that lets users explicitly store data on disk or in memory and control how the data are partitioned, and it performs very well in the analysis and processing of big data.
The original data are stored in an RDD; following the MapReduce principle, once an action is triggered the work on the RDD is decomposed into a plurality of logically identical tasks, which are distributed to a plurality of executors for parallel execution.
In this embodiment, the historical data obtained in step S1 are stored in an RDD to generate an RDD data set, the RDD data set is divided into a plurality of sub-data sets, each sub-data set is preprocessed separately, and the preprocessing tasks for the sub-data sets are allocated to a corresponding plurality of executors for parallel execution.
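The partition-then-process pattern described above can be illustrated without a cluster. The sketch below is an assumption-laden stand-in, not Spark code: it uses Python's standard `concurrent.futures.ThreadPoolExecutor` in place of Spark executors, and the helper names `split_into_partitions`, `preprocess_in_parallel` and `clip_negative` are hypothetical. On a real Spark deployment the same idea would typically be expressed with an RDD and a per-partition transformation such as `mapPartitions`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def split_into_partitions(records: List[float],
                          num_partitions: int) -> List[List[float]]:
    """Split records into contiguous, roughly equal partitions."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    size, extra = divmod(len(records), num_partitions)
    partitions, start = [], 0
    for k in range(num_partitions):
        end = start + size + (1 if k < extra else 0)
        partitions.append(records[start:end])
        start = end
    return partitions


def preprocess_in_parallel(records: List[float],
                           preprocess: Callable[[List[float]], List[float]],
                           num_partitions: int = 4) -> List[float]:
    """Apply the same preprocessing task to every partition concurrently."""
    partitions = split_into_partitions(records, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        # pool.map preserves partition order, so the output order is stable.
        results = list(pool.map(preprocess, partitions))
    return [value for part in results for value in part]


def clip_negative(partition: List[float]) -> List[float]:
    """Toy preprocessing task: negative readings are clipped to zero."""
    return [max(0.0, v) for v in partition]


cleaned = preprocess_in_parallel([-1.0, 2.0, 3.0, -4.0, 5.0],
                                 clip_negative, num_partitions=2)
print(cleaned)
```

Steps that look at neighbouring samples, such as interpolation or wavelet denoising, would need care at partition boundaries; this sketch only covers element-wise tasks.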
The data preprocessing comprises data outlier processing, data missing value processing, data denoising and normalization, specifically as follows:
(1) Data outlier processing
Each acquired index is checked against the reasonable range of measured values given in Table 1, and data outside the range are removed.
Table 1 Reasonable parameter ranges
Figure BDA0003316543620000101
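A range check of this kind might look like the sketch below. The limits in `VALID_RANGES` are placeholders chosen for illustration only; they are not the values of Table 1, which is available only as an image in the source. The field names and the function `remove_out_of_range` are likewise hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical limits for illustration only; NOT the values of Table 1.
VALID_RANGES: Dict[str, Tuple[float, float]] = {
    "wind_speed": (0.0, 40.0),        # m/s
    "wind_direction": (0.0, 360.0),   # degrees
    "pressure": (800.0, 1100.0),      # hPa
    "temperature": (-40.0, 50.0),     # degrees Celsius
    "humidity": (0.0, 100.0),         # percent
}


def remove_out_of_range(records: List[Dict[str, float]],
                        ranges: Dict[str, Tuple[float, float]] = VALID_RANGES
                        ) -> List[Dict[str, float]]:
    """Keep only the records whose every index lies inside its valid range."""
    kept = []
    for record in records:
        if all(low <= record[name] <= high
               for name, (low, high) in ranges.items()):
            kept.append(record)
    return kept


records = [
    {"wind_speed": 8.2, "wind_direction": 135.0, "pressure": 1012.0,
     "temperature": 14.5, "humidity": 62.0},
    {"wind_speed": -3.0, "wind_direction": 90.0, "pressure": 1009.0,
     "temperature": 15.0, "humidity": 60.0},   # negative wind speed: removed
]
clean = remove_out_of_range(records)
print(len(clean))
```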
(2) Data missing value processing
For missing index data, in order to preserve the continuity and authenticity of the data, the following cases are handled separately:
a) for sporadic missing data, the average of the data values before and after the missing point is used as the value of the missing point;
b) for a small amount of missing data, interpolation is used, with the interpolation formula given in formula (1):
$$P_{n+i} = P_n + \frac{i}{j}\left(P_{n+j} - P_n\right), \quad 0 < i < j \tag{1}$$
where $P_n$, $P_{n+i}$ and $P_{n+j}$ are the values of the index at times $n$, $n+i$ and $n+j$, respectively;
c) for a large amount of missing data, for example during wind turbine maintenance or scheduled power limiting, the data for that period are discarded as invalid.
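The three cases above can be sketched in one pass over a series. The sketch makes two assumptions that the text does not state: the interpolation is linear between the nearest valid neighbours, and a gap counts as "small" when it spans at most `max_small_gap` samples (a hypothetical threshold). Under linear interpolation, case a) is simply the one-sample special case, since the midpoint of two neighbours is their average.

```python
from typing import List, Optional


def fill_missing(series: List[Optional[float]],
                 max_small_gap: int = 5) -> List[float]:
    """Fill short gaps by linear interpolation and drop long gaps.

    Missing samples are represented by None. Gaps at the very start or
    end of the series have only one neighbour and are dropped as well.
    """
    out: List[float] = []
    n = len(series)
    i = 0
    while i < n:
        if series[i] is not None:
            out.append(series[i])
            i += 1
            continue
        # Find the end of the current run of missing samples.
        j = i
        while j < n and series[j] is None:
            j += 1
        gap = j - i
        has_both_neighbours = i > 0 and j < n
        if has_both_neighbours and gap <= max_small_gap:
            left, right = series[i - 1], series[j]
            for k in range(1, gap + 1):
                out.append(left + (right - left) * k / (gap + 1))
        # Otherwise the gap is treated as invalid data and discarded.
        i = j
    return out


print(fill_missing([1.0, None, 3.0]))
print(fill_missing([0.0, None, None, None, 4.0]))
```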
(3) Data denoising
Because of environmental and other factors, the acquired index data contain considerable noise and must be denoised.
In this embodiment, a wavelet threshold method is used. After wavelet decomposition of each index, the wavelet coefficients of the original signal are larger than those of the noise, so a suitable threshold separates the signal from the noise, and the signal is then reconstructed to achieve denoising. The specific steps are as follows:
a) performing a 3-level decomposition of the acquired index values using the Haar wavelet basis function;
b) thresholding each detail coefficient according to formula (2) to separate the noise from the normal data;
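$$\omega_\lambda = \begin{cases} \operatorname{sgn}(\omega)\,(|\omega| - \lambda), & |\omega| \ge \lambda \\ 0, & |\omega| < \lambda \end{cases} \tag{2}$$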
where $\lambda$ is the wavelet coefficient threshold, $\omega$ is the wavelet coefficient, $\omega_\lambda$ is the denoised wavelet coefficient, and sgn is the sign function;
c) reconstructing the data after the noise is removed to obtain the processed data.
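Steps a) to c) can be sketched in plain Python. The sketch assumes the soft-threshold rule (shrink each detail coefficient toward zero by the threshold, keeping its sign), which is the rule usually written with a sign function; the exact thresholding formula of the source is available only as an image. It also assumes an orthonormal Haar transform and a signal length divisible by 2 to the power of the number of levels. All function names are hypothetical.

```python
import math
from typing import List, Tuple

SQRT2 = math.sqrt(2.0)


def haar_forward(x: List[float]) -> Tuple[List[float], List[float]]:
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    approx = [(x[k] + x[k + 1]) / SQRT2 for k in range(0, len(x), 2)]
    detail = [(x[k] - x[k + 1]) / SQRT2 for k in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx: List[float], detail: List[float]) -> List[float]:
    """Invert one level of the orthonormal Haar transform."""
    out: List[float] = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def soft_threshold(w: float, lam: float) -> float:
    """Assumed rule: sgn(w) * (|w| - lam) if |w| > lam, else 0."""
    if abs(w) <= lam:
        return 0.0
    return math.copysign(abs(w) - lam, w)


def denoise(signal: List[float], lam: float, levels: int = 3) -> List[float]:
    """Decompose, threshold the detail coefficients, then reconstruct."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = list(signal)
    details: List[List[float]] = []
    for _ in range(levels):
        approx, detail = haar_forward(approx)
        details.append([soft_threshold(w, lam) for w in detail])
    # Rebuild from the coarsest level back to the original resolution.
    for detail in reversed(details):
        approx = haar_inverse(approx, detail)
    return approx


noisy = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(denoise(noisy, lam=0.5))
```

With a threshold of zero the signal is reconstructed unchanged, and with a very large threshold every detail coefficient is removed, leaving the block average; both limits are a quick sanity check on the transform.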
(4) Data normalization
Indexes such as wind speed, air pressure and humidity are normalized by their maximum and minimum values according to formula (3):
$$value = \frac{value_t - value_{min}}{value_{max} - value_{min}} \tag{3}$$
where $value$ is the normalized result, in the range 0 to 1, and $value_{max}$, $value_{min}$ and $value_t$ are the maximum, minimum and current values in the index data set, respectively.
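As a small illustration of the min-max scaling just described, the sketch below maps a list of readings onto the interval from 0 to 1. The function name `min_max_normalize` is hypothetical, and returning zeros for a constant series is a choice made here to avoid division by zero; the source does not address that case.

```python
from typing import List


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale values to [0, 1] using the minimum and maximum of the data set."""
    low, high = min(values), max(values)
    if high == low:
        # Assumption: a constant series carries no information, map it to 0.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


wind_speed = [2.0, 4.0, 6.0]
print(min_max_normalize(wind_speed))
```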
The temperature is normalized according to formula (4):
Figure BDA0003316543620000122
where $T$ is the normalized temperature value and $T_t$ is the current temperature value.
The wind direction lies between 0 and 360 degrees and can be normalized to between 0 and 1 by combining its sine and cosine values.
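The text does not say how the sine and cosine values are combined. One plausible reading, shown here purely as an assumption, encodes the direction as two features, the sine and the cosine of the angle, each shifted and scaled from the range -1 to 1 into the range 0 to 1. This keeps 0 and 360 degrees identical, which a plain division by 360 would not. The function name `encode_wind_direction` is hypothetical.

```python
import math
from typing import Tuple


def encode_wind_direction(degrees: float) -> Tuple[float, float]:
    """Return (sin, cos) features of the direction, each scaled to [0, 1]."""
    radians = math.radians(degrees % 360.0)
    sin_feature = (math.sin(radians) + 1.0) / 2.0
    cos_feature = (math.cos(radians) + 1.0) / 2.0
    return sin_feature, cos_feature


print(encode_wind_direction(0.0))
print(encode_wind_direction(90.0))
```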
Step S3, establishing feature vectors from the preprocessed data and dividing them into training samples and test samples;
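A sketch of step S3 follows. The layout of the feature vector (wind speed, two wind direction features, pressure, temperature, humidity), the chronological split and the 80/20 ratio are all assumptions made for illustration; the source specifies none of them. The names `build_sample` and `split_samples` are hypothetical.

```python
from typing import List, Sequence, Tuple

# A sample pairs a feature vector with the measured power (the target).
Sample = Tuple[List[float], float]


def build_sample(wind_speed: float, dir_sin: float, dir_cos: float,
                 pressure: float, temperature: float, humidity: float,
                 power: float) -> Sample:
    """Assemble one (features, target) pair from normalized index values."""
    features = [wind_speed, dir_sin, dir_cos, pressure, temperature, humidity]
    return features, power


def split_samples(samples: Sequence[Sample],
                  train_fraction: float = 0.8
                  ) -> Tuple[List[Sample], List[Sample]]:
    """Split chronologically: earlier samples train, later samples test."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = int(len(samples) * train_fraction)
    return list(samples[:cut]), list(samples[cut:])


samples = [build_sample(0.1 * k, 0.5, 1.0, 0.6, 0.4, 0.7, 10.0 * k)
           for k in range(10)]
train, test = split_samples(samples)
print(len(train), len(test))
```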
Step S4, training the KELM model with the training samples, and optimizing the regularization coefficient and kernel parameter of the KELM model with the PCFOA algorithm to obtain an optimized KELM model;
The fruit fly optimization algorithm (FOA) is a global optimization evolutionary algorithm derived from the foraging behavior of fruit flies. It has few tunable parameters, is simple to operate and is easy to apply in practice.
On the basis of FOA, this embodiment designs a parallel chaotic fruit fly optimization algorithm (PCFOA) on the Spark platform, which parallelizes the search of each individual fruit fly for the position with the optimal taste concentration. A fruit fly population of size N is divided into N independent parallel units; each fruit fly computes its taste concentration value from a fitness function and searches locally from its current position for an updated position, and these steps are handed to the Spark framework for parallel computation.
The kernel extreme learning machine (KELM) derives from the theory of the extreme learning machine (ELM). ELM is a training method for single-hidden-layer feedforward neural networks in which the output weights are obtained analytically in a single step, so learning is fast. Its regression function and the connection weights between the hidden layer and the output layer are given in formula (14):
$$f(x) = h(x)\beta = h(x)H^{T}\left(\frac{I}{C} + HH^{T}\right)^{-1}T \tag{14}$$
where $x$ is the sample input, $f(x)$ is the network output, $h(x)$ and $H$ are the randomly mapped hidden layer feature mapping matrices, $\beta$ is the connection weight between the hidden layer and the output layer obtained from generalized inverse matrix theory, $I$ is a diagonal (identity) matrix, $C$ is the penalty coefficient, and $T$ is the vector of sample target values.
Compared with ELM, KELM is better at regression prediction problems, generalizes better, and predicts faster at equal prediction accuracy. KELM needs only the form of the kernel function to evaluate the output function, and the initial weights and biases of the hidden layer need not be set explicitly; the output can be expressed as formula (15):
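$$f(x) = \begin{bmatrix} K(x, x_1) \\ \vdots \\ K(x, x_n) \end{bmatrix}^{T}\left(\frac{I}{C} + \Omega_{ELM}\right)^{-1}T \tag{15}$$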
where $C$ is the regularization coefficient, which balances the output weights against the training error; $\Omega_{ELM} = HH^{T}$ is the kernel matrix, in which the kernel function maps all input samples from the input space to the high-dimensional hidden layer feature space; $K(x_i, x_j)$ is the kernel function, which may be a Gaussian kernel, a polynomial kernel, a linear kernel and so on; $I$ is the identity matrix; and $T$ is the desired output.
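A compact, dependency-free sketch of a kernel extreme learning machine is given below. It follows the standard KELM formulation: solve the linear system formed by the kernel matrix plus the identity divided by C against the targets, then predict with a kernel-weighted sum. Several choices here are assumptions rather than details from the source: the Gaussian kernel is written as exp(-gamma * squared distance), the kernel parameter is named `gamma`, and the linear system is solved by plain Gaussian elimination, which is adequate only for small training sets.

```python
import math
from typing import List, Sequence


def rbf_kernel(a: Sequence[float], b: Sequence[float], gamma: float) -> float:
    """Gaussian kernel exp(-gamma * ||a - b||^2); gamma is an assumed name."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def solve_linear(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


class KELM:
    """Kernel extreme learning machine for scalar regression (sketch)."""

    def __init__(self, C: float, gamma: float) -> None:
        self.C = C
        self.gamma = gamma
        self.X: List[List[float]] = []
        self.alpha: List[float] = []

    def fit(self, X: Sequence[Sequence[float]], T: Sequence[float]) -> "KELM":
        n = len(X)
        # A = Omega + I / C, with Omega[i][j] = K(x_i, x_j).
        A = [[rbf_kernel(X[i], X[j], self.gamma) + (1.0 / self.C if i == j else 0.0)
              for j in range(n)] for i in range(n)]
        self.alpha = solve_linear(A, list(T))
        self.X = [list(x) for x in X]
        return self

    def predict(self, x: Sequence[float]) -> float:
        # f(x) = [K(x, x_1), ..., K(x, x_n)] . alpha
        return sum(rbf_kernel(x, xi, self.gamma) * a
                   for xi, a in zip(self.X, self.alpha))


model = KELM(C=1e6, gamma=2.0).fit([[0.0], [0.5], [1.0]], [0.0, 0.25, 1.0])
print(model.predict([0.5]))
```

A large C weakens the regularization so the model nearly interpolates the training targets; a small C smooths the fit. These two values, C and the kernel parameter, are what the optimization step tunes.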
PCFOA is used to optimize the regularization coefficient C and the kernel parameter λ of the KELM to obtain the optimized KELM model, specifically as follows:
S301, initializing the fruit fly population size N, the maximum number of iterations max, the search step length $l$, the initial position $(x_0, y_0)$ and the chaotic parameter $\mu$;
S302, dividing the fruit fly population into N independent parallel computing units;
S303, starting iterative optimization; for each parallel computing unit, in the first iteration the random distance over which the individual fruit fly searches for food by smell is given by formula (5):
Figure BDA0003316543620000141
after the first iteration is finished, subsequent iterations use the optimal solution information and chaotization based on the two-dimensional Logistic chaotic map to obtain new chaotic step lengths l_x and l_y, calculated as shown in the following formulas (6) and (7):
Figure BDA0003316543620000142
Figure BDA0003316543620000143
where μ ∈ (0, 2.28) and x_n, y_n ∈ (0, 1);
S304, because the position of the food is not known initially, the distance d_i from the current position of the individual fruit fly to the origin is calculated, and its reciprocal is taken as the taste concentration judgment value s_i of the individual fruit fly, as shown in the following formulas (8) and (9):
Figure BDA0003316543620000144
s_i = 1/d_i (9)
S305, substituting the taste concentration judgment value s_i into the taste concentration judgment function to calculate the taste concentration value s_current of the individual fruit fly at its current position:
s_current = Function(s_i) (10)
S306, finding out the fruit fly with the optimal taste concentration value in all fruit fly individuals, namely the fruit fly with the highest or lowest concentration, and recording the taste concentration value and the corresponding position value of the fruit fly;
[s, index] = min(s_i) || max(s_i) (11)
S307, retaining the optimal taste concentration value s and its coordinates (x_index, y_index); the population flies by vision to the optimal position recorded in step S306 and updates its position, forming a new fruit fly population center:
Figure BDA0003316543620000151
s_best = s (13)
S308, continuing the iterative optimization: steps S302 to S306 are repeated, and it is then judged whether the optimal taste concentration value is better than the historical optimal taste concentration value; if so, and the current iteration number is less than max, step S307 is executed.
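The loop of steps S301 to S308 can be sketched as follows. This is a serial, single-machine illustration, not the Spark-parallel PCFOA itself. Several pieces are assumptions because the corresponding formulas are not reproduced in the text: the smell-based search uses the common fruit fly optimization form (center plus a random offset scaled by the step length), and an ordinary one-dimensional logistic map is swapped in for the two-dimensional Logistic chaotic map of formulas (6) and (7). The names `cfoa_minimize` and `logistic`, the chaotic seeds, and the toy objective are hypothetical; in the method described above, the objective would instead score a KELM trained with the candidate parameters.

```python
import math
import random


def logistic(z):
    """One step of the 1-D logistic map (stand-in for the patent's 2-D map)."""
    return 4.0 * z * (1.0 - z)


def cfoa_minimize(function, n_flies=20, max_iter=100, step=1.0, seed=0):
    """Serial fruit fly optimization with chaotic step lengths (illustrative)."""
    rng = random.Random(seed)
    # S301: initialize the population center and the chaotic state.
    x0, y0 = rng.uniform(0.1, 1.0), rng.uniform(0.1, 1.0)
    zx, zy = 0.37, 0.61  # arbitrary chaotic seeds in (0, 1), assumed
    lx = ly = step
    s_best, best_value = None, math.inf
    for it in range(max_iter):
        if it > 0:
            # Later iterations: refresh the step lengths from the chaotic map.
            zx, zy = logistic(zx), logistic(zy)
            lx, ly = step * zx, step * zy
        candidates = []
        for _ in range(n_flies):  # S302: each fly is an independent unit
            # S303: random smell-based search around the population center.
            x = x0 + lx * rng.uniform(-1.0, 1.0)
            y = y0 + ly * rng.uniform(-1.0, 1.0)
            # S304: distance to the origin and its reciprocal.
            d = math.hypot(x, y)
            if d == 0.0:
                continue
            s = 1.0 / d
            # S305: taste concentration value at the current position.
            candidates.append((function(s), s, x, y))
        if not candidates:
            continue
        # S306: best fly of this iteration (minimization is assumed here).
        value, s, x, y = min(candidates)
        # S307/S308: move the center only if the historical best improves.
        if value < best_value:
            best_value, s_best = value, s
            x0, y0 = x, y
    return s_best, best_value


if __name__ == "__main__":
    # Toy objective with its minimum at s = 2.
    print(cfoa_minimize(lambda s: (s - 2.0) ** 2, max_iter=200))
```

Because each fly's evaluation inside the inner loop depends only on the shared center and step lengths, that loop is the part that would be distributed across parallel computing units.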
Step S5, testing the optimized KELM model with the test samples to evaluate its prediction effect.
The steps for predicting the wind power of a single fan with the trained KELM model are as follows:
Step S6, acquiring meteorological index data influencing the wind power of a single fan in a wind power plant;
the meteorological index data include wind speed V, wind direction D, air pressure P, temperature T, humidity H and the like.
Step S7, preprocessing the meteorological index data by using a Spark platform;
the preprocessing step is the same as step S2.
Step S8, inputting the preprocessed meteorological index data into the pre-trained KELM model corresponding to the single fan, the KELM model outputting a wind power predicted value of the single fan;
Step S9, adding the wind power predicted values of all the fans in the wind power plant to obtain the total power predicted value of the wind power plant.
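Steps S6 to S9 can be sketched as follows. The sketch assumes min-max normalization as the only preprocessing step, runs on a single machine rather than on Spark, and uses hypothetical names (`min_max`, `predict_farm_power`, and the per-fan `models` mapping of callables) that do not appear in the original. The lambda models are placeholders standing in for trained KELM models, and the numeric ranges are made up for illustration.

```python
def min_max(value, lo, hi):
    """Scale value into [0, 1] using the index's minimum and maximum."""
    return (value - lo) / (hi - lo)


def predict_farm_power(readings, models, ranges):
    """Sum per-fan predictions (steps S6 to S9, illustrative only).

    readings: {fan_id: {index_name: raw_value}}
    models:   {fan_id: callable taking a feature list, returning power}
    ranges:   {index_name: (min, max)} used for normalization
    """
    names = sorted(ranges)  # fixed feature order shared by all fans
    total = 0.0
    per_fan = {}
    for fan_id, raw in readings.items():
        features = [min_max(raw[n], *ranges[n]) for n in names]  # S7
        per_fan[fan_id] = models[fan_id](features)               # S8
        total += per_fan[fan_id]                                 # S9
    return total, per_fan


if __name__ == "__main__":
    ranges = {"wind_speed": (0.0, 25.0), "humidity": (0.0, 100.0)}
    readings = {
        "fan_1": {"wind_speed": 10.0, "humidity": 50.0},
        "fan_2": {"wind_speed": 5.0, "humidity": 40.0},
    }
    # Placeholder models: power grows linearly with normalized wind speed,
    # which is feature index 1 after sorting the index names.
    models = {
        "fan_1": lambda f: 2000.0 * f[1],
        "fan_2": lambda f: 1500.0 * f[1],
    }
    print(predict_farm_power(readings, models, ranges))
```

Keeping one model per fan, as in step S8, means the farm total is a plain sum and each fan's model can be retrained independently.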
Through the above embodiment, the Spark-based PCFOA-KELM wind power prediction method efficiently preprocesses sample data on the Spark big data platform and designs a parallel fruit fly optimization algorithm, PCFOA, which improves the modeling and prediction speed for wind power; the method uses the kernel extreme learning machine (KELM) to model and predict the wind power and uses PCFOA to optimize the KELM model, which improves the prediction accuracy of the wind power.
The method fully utilizes the powerful capability of a Spark big data platform for analyzing and processing data in parallel, carries out high-efficiency preprocessing on the acquired data based on the Spark platform, designs a parallel parameter optimization method PCFOA, and improves the data preprocessing and optimization modeling speed; the PCFOA is used for optimizing parameters of the KELM, so that the prediction precision of the wind power is well improved; and finally, efficient and accurate prediction of wind power is realized based on a Spark big data platform.
In another embodiment, a Spark-based PCFOA-KELM wind power prediction device includes:
the acquisition module is configured to acquire meteorological index data influencing the wind power of a single fan in a wind power plant;
the data preprocessing module is configured to preprocess the meteorological index data by using a Spark platform;
the prediction module is configured to input the preprocessed meteorological index data into a pre-trained KELM model corresponding to the single fan, and the KELM model outputs a wind power prediction value of the single fan;
the calculation module is configured to add the wind power predicted values of all the fans in the wind power plant to obtain a total power predicted value of the wind power plant;
the KELM model is trained by taking meteorological indexes affecting the wind power of a single fan in a wind power plant and corresponding wind power historical data as training samples based on a Spark platform, and parameter optimization is carried out on the KELM model by utilizing a PCFOA algorithm.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several modifications and variations without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A Spark-based PCFOA-KELM wind power prediction method, characterized by comprising the following steps:
acquiring meteorological index data influencing the wind power of a single fan in a wind power plant;
preprocessing the meteorological index data by using a Spark platform;
inputting the preprocessed meteorological index data into a pre-trained KELM model corresponding to the single fan, and outputting a wind power predicted value of the single fan by the KELM model;
adding the wind power predicted values of all the fans in the wind power plant to obtain a total power predicted value of the wind power plant;
the KELM model is trained by taking meteorological indexes affecting the wind power of a single fan in a wind power plant and corresponding wind power historical data as training samples based on a Spark platform, and parameter optimization is carried out on the KELM model by utilizing a PCFOA algorithm.
2. The Spark-based PCFOA-KELM wind power prediction method according to claim 1, wherein the training method of the KELM model comprises:
acquiring meteorological index historical data and corresponding wind power historical data which affect the wind power of a single fan in a wind power plant;
preprocessing the acquired historical data by using a Spark platform;
establishing a feature vector according to the preprocessed data, and dividing the feature vector into a training sample and a test sample;
training the KELM model by adopting a training sample, and optimizing a regularization coefficient and a kernel parameter of the KELM model by utilizing a PCFOA algorithm to obtain an optimized KELM model;
and testing the optimized KELM model prediction effect by adopting a test sample.
3. The Spark-based PCFOA-KELM wind power prediction method according to claim 2, wherein the preprocessing the historical data by utilizing a Spark platform comprises:
storing the historical data into an RDD (Resilient Distributed Dataset) to generate an RDD data set, dividing the RDD data set into a plurality of sub-data sets, performing data preprocessing on each sub-data set separately, and distributing the preprocessing tasks for the plurality of sub-data sets to a plurality of corresponding executors for parallel operation.
4. The Spark-based PCFOA-KELM wind power prediction method according to claim 2, wherein the preprocessing comprises: data abnormal value processing, data missing value processing, data noise reduction and normalization processing.
5. The Spark-based PCFOA-KELM wind power prediction method according to claim 4, wherein the data outlier processing comprises:
for the condition of sporadic data loss, the average value of the data values before and after the loss point is used as the data of the loss point;
for the condition of a small amount of data missing, an interpolation method is adopted for processing, and a specific interpolation formula is shown as the following formula:
Figure FDA0003316543610000021
in the formula, P_n, P_{n+i} and P_{n+j} are the values of the index at the n-th, (n+i)-th and (n+j)-th moments, respectively;
for the case of a large amount of data missing, the data for that period of time is discarded.
6. The Spark-based PCFOA-KELM wind power prediction method according to claim 4, wherein the data denoising comprises:
performing wavelet decomposition on the acquired index value by utilizing a Haar wavelet basis function;
based on the wavelet coefficient of the normal data signal being greater than the wavelet coefficient of the noise, separating the noise from the normal data according to the following equation based on the set threshold:
Figure FDA0003316543610000031
wherein λ is the wavelet coefficient threshold, ω is the wavelet coefficient, ω_λ is the denoised wavelet coefficient, and sgn is the sign function;
and reconstructing the data after the noise is removed.
7. The Spark-based PCFOA-KELM wind power prediction method according to claim 4, wherein the meteorological indexes comprise wind speed, wind direction, air pressure, temperature and humidity, and the normalization process comprises:
according to the maximum value and the minimum value, respectively processing the wind speed, the air pressure and the humidity according to the following formula:
Figure FDA0003316543610000032
wherein value is the normalized result, with a range of 0 to 1, and value_max, value_min and value_t respectively represent the maximum value, the minimum value and the current value in the index data set;
the temperature is treated according to the following formula:
Figure FDA0003316543610000033
wherein T is the normalized temperature value and T_t is the current temperature value;
wind direction is normalized to between 0 and 1 using a combination of sine and cosine values.
8. The Spark-based PCFOA-KELM wind power prediction method according to claim 2, wherein the optimizing regularization coefficients and kernel parameters of the KELM model by using the PCFOA algorithm comprises:
dividing a fruit fly population of size N into N independent parallel units through the RDD of the Spark platform, each individual fruit fly calculating its taste concentration value according to a fitness function and searching locally from its current position for an updated position, wherein the calculation of the taste concentration value and the local search for the updated position are performed in parallel.
9. The Spark-based PCFOA-KELM wind power prediction method according to claim 2, wherein the optimizing regularization coefficients and kernel parameters of the KELM model by using the PCFOA algorithm comprises:
S1, initializing the fruit fly population size N, the maximum number of iterations max, the search step length l, the initial position (x_0, y_0) and the chaotic parameter μ;
S2, dividing the fruit fly population into N independent parallel computing units;
S3, starting iterative optimization; for each parallel computing unit, in the first iteration, the random distance over which an individual fruit fly searches for food by smell is shown in the following formula:
Figure FDA0003316543610000041
after the first iteration is finished, subsequent iterations use the optimal solution information and chaotization based on the two-dimensional Logistic chaotic map to obtain new chaotic step lengths l_x and l_y, calculated as follows:
Figure FDA0003316543610000042
Figure FDA0003316543610000043
where μ ∈ (0, 2.28) and x_n, y_n ∈ (0, 1);
S4, calculating the distance d_i from the current position of the individual fruit fly to the origin, and taking its reciprocal as the taste concentration judgment value s_i of the individual fruit fly, calculated as follows:
Figure FDA0003316543610000051
s_i = 1/d_i
S5, substituting the taste concentration judgment value s_i into the taste concentration judgment function to calculate the taste concentration value of the individual fruit fly at its current position;
S6, finding the fruit fly with the optimal taste concentration value among all individual fruit flies, and recording its taste concentration value and corresponding position;
S7, retaining the optimal taste concentration value s and its coordinates (x_index, y_index), and flying by vision to the optimal position recorded in step S6 to update the position, forming a new fruit fly population center:
Figure FDA0003316543610000052
s_best = s
S8, repeating steps S2 to S6 and judging whether the optimal taste concentration value is better than the historical optimal taste concentration value; if so, and the current iteration number is less than max, executing step S7.
10. A Spark-based PCFOA-KELM wind power prediction device is characterized by comprising:
the acquisition module is configured to acquire meteorological index data influencing the wind power of a single fan in a wind power plant;
the data preprocessing module is configured to preprocess the meteorological index data by using a Spark platform;
the prediction module is configured to input the preprocessed meteorological index data into a pre-trained KELM model corresponding to the single fan, and the KELM model outputs a wind power prediction value of the single fan;
the calculation module is configured to add the wind power predicted values of all the fans in the wind power plant to obtain a total power predicted value of the wind power plant;
the KELM model is trained by taking meteorological indexes affecting the wind power of a single fan in a wind power plant and corresponding wind power historical data as training samples based on a Spark platform, and parameter optimization is carried out on the KELM model by utilizing a PCFOA algorithm.
CN202111232892.4A 2021-10-22 2021-10-22 Spark-based PCFOA-KELM wind power prediction method and device Active CN114154679B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111232892.4A CN114154679B (en) 2021-10-22 2021-10-22 Spark-based PCFOA-KELM wind power prediction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111232892.4A CN114154679B (en) 2021-10-22 2021-10-22 Spark-based PCFOA-KELM wind power prediction method and device

Publications (2)

Publication Number Publication Date
CN114154679A (en) 2022-03-08
CN114154679B (en) 2024-01-26

Family

ID=80458567

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111232892.4A Active CN114154679B (en) 2021-10-22 2021-10-22 Spark-based PCFOA-KELM wind power prediction method and device

Country Status (1)

Country Link
CN (1) CN114154679B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106570250A (en) * 2016-11-02 2017-04-19 North China Electric Power University (Baoding) Power big data oriented microgrid short-period load prediction method
CN109934330A (en) * 2019-03-04 2019-06-25 Wenzhou University The method of prediction model is constructed based on the drosophila optimization algorithm of diversified population
CN111639695A (en) * 2020-05-26 2020-09-08 Wenzhou University Method and system for classifying data based on improved drosophila optimization algorithm
CN113466615A (en) * 2021-06-17 2021-10-01 China Three Gorges University Drosophila optimization algorithm-based post-fault wave recording data synchronization method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
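Wu Jiawen: "Design and Implementation of a Wind Farm Wind Power Prediction System Based on BP Neural Network", China Master's Theses Full-text Database, Engineering Science and Technology II, pages 13 - 15 *
Xu Guogen et al.: "Optimization Methods and Their MATLAB Implementation", Beihang University Press, pages: 398 - 399 *
Zhao Lei: "Research on Short-Term Wind Power Prediction Methods for Wind Farms Based on Cloud Computing", China Master's Theses Full-text Database, Engineering Science and Technology II, pages 26 - 45 *
Zhao Peng et al.: "Research on Wind Power Prediction Based on PSO-KELM", Electrical Measurement & Instrumentation, vol. 57, no. 11, pages 24 - 29 *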

Also Published As

Publication number Publication date
CN114154679B (en) 2024-01-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: No. 38, New Model Road, Gulou District, Nanjing City, Jiangsu Province, 210000

Patentee after: Nanjing Nanzi Huadun Digital Technology Co.,Ltd.

Country or region after: China

Address before: No.39 Shuige Road, Jiangning District, Nanjing City, Jiangsu Province, 211100

Patentee before: NANJING HUADUN POWER INFORMATION SECURITY EVALUATION CO.,LTD.

Country or region before: China