CN114154679B

CN114154679B - Spark-based PCFOA-KELM wind power prediction method and device

Info

Publication number: CN114154679B
Application number: CN202111232892.4A
Authority: CN
Inventors: 经正俊; 齐刚; 宋坤; 张世磊; 马驰源; 吉书强; 周安; 倪晓锋; 王文贵; 管超; 甘露平
Original assignee: Nanjing Huadun Power Information Security Evaluation Co Ltd
Current assignee: Nanjing Nanzi Huadun Digital Technology Co ltd
Priority date: 2021-10-22
Filing date: 2021-10-22
Publication date: 2024-01-26
Anticipated expiration: 2041-10-22
Also published as: CN114154679A

Abstract

The invention discloses a Spark-based PCFOA-KELM wind power prediction method and device. The method comprises the following steps: collecting meteorological index data affecting the wind power of a single wind turbine in a wind farm; preprocessing the meteorological index data on a Spark platform; inputting the preprocessed meteorological index data into a pre-trained KELM model corresponding to that wind turbine, which outputs a predicted wind power value for the turbine; and adding the predicted wind power values of all wind turbines in the wind farm to obtain the total predicted power of the wind farm. The KELM model is trained on the Spark platform, using as training samples the meteorological indexes affecting the wind power of a single wind turbine and the corresponding historical wind power data, with its parameters optimized by the PCFOA algorithm. The method predicts wind power efficiently and accurately while also improving running speed.

Description

Spark-based PCFOA-KELM wind power prediction method and device

Technical Field

The invention relates to a Spark-based PCFOA (Parallel Chaotic Fruit Fly Optimization Algorithm)-KELM (Kernel Extreme Learning Machine) wind power prediction method and device, and belongs to the field of wind power generation.

Background

Wind power generation is influenced by meteorological factors, and the output power of a wind power system fluctuates randomly and intermittently with season and geographic location. High-precision wind power prediction is the basis for evaluating the operating state of a wind farm and for planning, designing and dispatching the power grid, and is important for guaranteeing safe and stable grid operation.

Many universities, enterprises and research institutions have studied wind power prediction technology, and the methods proposed so far can be roughly divided into physical methods and statistical methods.

The physical method is based on data such as numerical weather prediction (NWP) and terrain information; wind speed and other data are obtained through simulation models (such as a micro-meteorological model or a CFD model), and a prediction result is then obtained by combining them with the actual power curve. This method suits wind farms with complex terrain and does not need a large amount of data, but its computation is very complex, and its prediction accuracy is difficult to guarantee because NWP resolution rarely meets the requirement. The statistical method builds a prediction model from historical data using machine learning methods such as neural networks, support vector machines and time series methods. It can adapt to the location of the wind farm and reduce errors adaptively, but it needs a large amount of historical data and suffers from slow training and parameter optimization.

With the advent of the Industry 4.0 era, big data technology is increasingly used in smart power plants. Because wind power prediction involves large data volumes and complex data types, researchers have proposed applying big data technology to process large-scale data sets, reducing training and prediction time and improving prediction accuracy.

Existing schemes for modeling on big data platforms include: processing and analyzing wind power prediction input data on a Hadoop platform, then building a combined model from a BP neural network and an SVM for prediction; and building a parallel SVM model on the Hadoop platform, performing SVM optimization and power prediction through parallel computation.

On the one hand, the traditional physical method has high algorithmic complexity and very heavy computation, and its prediction accuracy depends largely on the quality of the weather forecast data and needs improvement; the statistical method must process a large amount of historical data, and its training and optimization time is too long to meet the timeliness required for practical application. On the other hand, existing algorithms that use big data technology for wind power prediction do not fully exploit the technical advantages of the big data platform, applying it only to data preprocessing or only to parallel model computation. In addition, traditional wind power prediction treats the whole wind farm as one unit; as wind farms grow, the meteorological and geographic conditions at individual wind turbine positions differ, so a model of the whole wind farm is not accurate enough.

Disclosure of Invention

The invention aims to provide a Spark-based PCFOA-KELM wind power prediction method and device that predict wind power rapidly and accurately, improving running speed while guaranteeing prediction accuracy.

In order to achieve the above purpose, the invention adopts the following technical scheme:

In one aspect, a Spark-based PCFOA-KELM wind power prediction method comprises:

collecting meteorological index data affecting the wind power of a single wind turbine in a wind farm;

preprocessing the meteorological index data by using a Spark platform;

inputting the preprocessed meteorological index data into a pre-trained KELM model corresponding to the single wind turbine, wherein the KELM model outputs a wind power predicted value of the single wind turbine;

adding the wind power predicted values of all wind turbines in the wind farm to obtain a total power predicted value of the wind farm;

wherein the KELM model is obtained, based on the Spark platform, by training with meteorological indexes affecting the wind power of a single wind turbine in the wind farm and the corresponding wind power historical data as training samples, and by performing parameter optimization on the KELM model using the PCFOA algorithm.

Further, the training method of the KELM model comprises the following steps:

acquiring weather index historical data and corresponding wind power historical data which influence the wind power of a single wind turbine in a wind power plant;

preprocessing the acquired historical data by using a Spark platform;

establishing a feature vector according to the preprocessed data, and dividing the feature vector into a training sample and a test sample;

training the KELM model with the training samples, and optimizing the regularization coefficient and kernel parameter of the KELM model by using the PCFOA algorithm to obtain an optimized KELM model;

and testing the prediction effect of the optimized KELM model by adopting a test sample.

Further, the preprocessing the historical data by using the Spark platform includes:

storing the historical data into an RDD to generate an RDD data set, dividing the RDD data set into a plurality of sub-data sets, preprocessing each sub-data set separately, and distributing the preprocessing tasks of the sub-data sets to a plurality of corresponding Executors for parallel execution.

Further, the preprocessing includes: data outlier processing, data missing value processing, data noise reduction and normalization processing.

Further, the data outlier processing includes:

for sporadic missing data, using the average of the data values before and after the missing point as the missing point data;

for a small amount of missing data, processing by interpolation, with the specific interpolation formula as follows:

wherein $P_n$, $P_{n+i}$ and $P_{n+j}$ are the values of the index at moments n, n+i and n+j, respectively;

for a large amount of missing data, discarding the data for that period of time.

Further, the data noise reduction includes:

performing wavelet decomposition on the acquired index values by using a Haar wavelet basis function;

since the wavelet coefficients of the normal data signal are larger than those of the noise, separating the noise from the normal data with a set threshold according to the following formula:

wherein λ is the wavelet coefficient threshold, ω is the wavelet coefficient, $\omega_\lambda$ is the denoised wavelet coefficient, and sgn is the sign function;

and reconstructing the data after noise removal.

Further, the weather indicators include wind speed, wind direction, air pressure, temperature and humidity, and the normalization process includes:

wind speed, air pressure and humidity are each processed according to their maximum and minimum values by the following formula:

$$value = \frac{value_t - value_{min}}{value_{max} - value_{min}}$$

wherein value is the normalized result, in the range 0 to 1, and $value_{max}$, $value_{min}$ and $value_t$ respectively represent the maximum value, the minimum value and the current value in the index data set;

the temperature is processed according to the following formula:

wherein T is the normalized temperature value and $T_t$ is the current temperature value;

wind direction is normalized to between 0 and 1 using a combination of sine and cosine values.

Further, the optimizing regularization coefficients and kernel parameters of the KELM model by using the PCFOA algorithm comprises the following steps:

dividing a fruit fly population of size N into N independent parallel units through the RDD of the Spark platform, wherein each fruit fly individual calculates its taste concentration value according to a fitness function and searches locally from its current position for an updated position, and the calculation of the taste concentration values and the local position searches are executed in parallel, specifically:

S1, initializing the fruit fly population size N, the maximum number of iterations max, the search step length l, the initial position $(x_0, y_0)$ and the chaotic parameter μ;

S2, dividing the fruit fly population into N independent parallel computing units;

S3, starting iterative optimization, wherein for each parallel computing unit, in the first iteration, the random distance over which the fruit fly individual searches for food by smell is as follows:

after the first iteration is completed, subsequent iterations use the optimal solution information to obtain new chaotic step lengths $l_x$ and $l_y$ through two-dimensional Logistic chaotic mapping, calculated as follows:

wherein $\mu \in (0, 2.28)$ and $x_n, y_n \in (0, 1)$;

S4, calculating the distance $d_i$ from the current position $(x_i, y_i)$ of the fruit fly individual to the origin, and taking its reciprocal as the taste concentration determination value $s_i$ of the fruit fly individual, calculated as follows:

$$d_i = \sqrt{x_i^2 + y_i^2}$$

$$s_i = 1/d_i$$

S5, substituting the taste concentration determination value $s_i$ into the taste concentration determination function to calculate the taste concentration value of the fruit fly individual at its current position;

S6, finding the fruit fly with the optimal taste concentration value among all fruit fly individuals, and recording its taste concentration value and the corresponding position;

S7, retaining the optimal taste concentration value s and its coordinates $(x_{index}, y_{index})$, and flying by vision to the optimal position recorded in step S6 to update the position, forming a new fruit fly population center:

$s_{best} = s$

S8, repeating steps S2 to S6 and judging whether the optimal taste concentration value is better than the historical optimal taste concentration value; if it is better and the current number of iterations is smaller than max, executing step S7.

In another aspect, a Spark-based PCFOA-KELM wind power prediction device comprises:

an acquisition module configured to collect meteorological index data affecting the wind power of a single wind turbine in the wind farm;

a data preprocessing module configured to preprocess the meteorological index data by using a Spark platform;

a prediction module configured to input the preprocessed meteorological index data into a pre-trained KELM model corresponding to the single wind turbine, the KELM model outputting a wind power predicted value of the single wind turbine;

a calculation module configured to add the wind power predicted values of all wind turbines in the wind farm to obtain a total power predicted value of the wind farm.

Compared with the prior art, the invention has the beneficial technical effects that:

The invention makes full use of the parallel data analysis and processing capability of the Spark big data platform: the acquired data are preprocessed efficiently on the Spark platform, and a parallel parameter optimization method, PCFOA, is designed, which speeds up both data preprocessing and optimization modeling. Meanwhile, KELM generalizes better than ELM, SVM and other machine learning algorithms and computes faster at better or similar prediction accuracy; by optimizing the KELM parameters with PCFOA on the Spark big data platform, the invention effectively improves wind power prediction accuracy while also improving running speed.

Drawings

FIG. 1 is a flowchart of a Spark-based PCFOA-KELM wind power prediction method according to an embodiment of the present invention.

Detailed Description

The invention is further described below in connection with specific embodiments. The following examples are only for more clearly illustrating the technical aspects of the present invention, and are not intended to limit the scope of the present invention.

As described above, for wind farm power prediction, the traditional physical method has high algorithmic complexity and very heavy computation, and its prediction accuracy depends largely on the quality of the weather forecast data; the statistical method must process a large amount of historical data, and its training and optimization time is too long to meet the timeliness of prediction; and existing algorithms that use big data technology for wind power prediction do not fully exploit the technical advantages of the big data platform.

Therefore, the invention provides a wind power prediction algorithm that uses a kernel extreme learning machine (KELM) optimized by the parallel chaotic fruit fly optimization algorithm (PCFOA). Each wind turbine is modeled and predicted individually, and the sum of the predicted powers of all wind turbines is taken as the total output of the wind farm, which improves prediction precision. The capabilities of the Spark platform are used for both data processing and analysis and parallel model computation, so that prediction accuracy is effectively improved while running speed is also improved.

In one embodiment, a Spark-based PCFOA-KELM wind power prediction method, as shown in FIG. 1, comprises:

Step S1, acquiring historical meteorological index data affecting the wind power of a single wind turbine in the wind farm, together with the corresponding historical wind power data;

Meteorological indexes such as wind speed V, wind direction D, air pressure P, temperature T and humidity H are taken as the factors affecting wind power.

Step S2, preprocessing the acquired historical data by using a Spark platform;

Because uncertain factors may cause abnormal or missing data during actual measurement and collection, the index data acquired in step S1 need to be preprocessed.

In this embodiment, data preprocessing is performed based on the Spark platform.

Spark is a big data parallel computing framework based on in-memory computation and built on a unified abstraction, the RDD (Resilient Distributed Dataset). It improves the real-time performance of data processing in big data environments while ensuring high fault tolerance and high scalability, and allows users to deploy Spark on a large amount of inexpensive hardware to form a cluster. An RDD is a fault-tolerant, parallel data structure that lets users explicitly persist data to disk or memory and control how the data are partitioned, giving very high performance for big data analysis and processing.

The original data are stored in an RDD; following the MapReduce principle, once an action is triggered the computation on the RDD is decomposed into a plurality of tasks with the same logic, which are distributed to a plurality of Executors for parallel execution.

In this embodiment, the historical data obtained in step S1 are stored in an RDD to generate an RDD data set; the RDD data set is divided into a plurality of sub-data sets, each sub-data set is preprocessed separately, and the preprocessing tasks of the sub-data sets are distributed to a plurality of corresponding Executors for parallel execution.
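
The partition-and-distribute pattern described here can be sketched in PySpark as follows. This is a minimal illustration, not the patented implementation: the record layout, the function names `preprocess_partition` and `preprocess_on_spark`, the local master URL and the placeholder cleaning rule are all assumptions made for the example.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

# (time index, measured value); this record layout is an assumption for illustration.
Record = Tuple[int, Optional[float]]


def preprocess_partition(records: Iterable[Record]) -> Iterator[Record]:
    """Placeholder preprocessing for one sub-data set: drop records with no value."""
    for t, value in records:
        if value is not None:
            yield (t, value)


def preprocess_on_spark(records: List[Record], num_partitions: int = 4) -> List[Record]:
    """Store the records in an RDD and preprocess every partition in parallel."""
    # Imported lazily so that preprocess_partition can be used without a Spark install.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[*]").appName("wind-preprocess").getOrCreate()
    try:
        rdd = spark.sparkContext.parallelize(records, numSlices=num_partitions)
        # mapPartitions is lazy; collect() is the action that launches one task per partition.
        return sorted(rdd.mapPartitions(preprocess_partition).collect())
    finally:
        spark.stop()
```

Each partition corresponds to one sub-data set, and Spark schedules one task per partition on the available Executors once `collect()` is called.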

The data preprocessing includes data outlier processing, data missing value processing, data noise reduction and normalization, as follows:

(1) Data outlier handling

Each acquired index is checked against the reasonable ranges of measured values in Table 1, and data outside the range are eliminated.

Table 1 Reasonable ranges of parameters
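
A range check of this kind might look like the sketch below. The contents of Table 1 are not reproduced in this text, so every limit in `REASONABLE_RANGES` is a hypothetical placeholder, and marking rejected values as `None` (so that they can be handled as missing values afterwards) is a design choice of the example rather than something the description states.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical limits for illustration only; the real limits are those of Table 1.
REASONABLE_RANGES: Dict[str, Tuple[float, float]] = {
    "wind_speed": (0.0, 60.0),       # m/s
    "wind_direction": (0.0, 360.0),  # degrees
    "pressure": (800.0, 1100.0),     # hPa
    "temperature": (-50.0, 60.0),    # degrees Celsius
    "humidity": (0.0, 100.0),        # percent
}


def remove_outliers(index_name: str, values: List[Optional[float]]) -> List[Optional[float]]:
    """Replace every value outside the reasonable range of its index with None."""
    low, high = REASONABLE_RANGES[index_name]
    return [v if v is not None and low <= v <= high else None for v in values]
```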

(2) Data missing value handling

When index data are missing, the following cases are handled separately to preserve the continuity and authenticity of the data:

a) For sporadic missing data, the average of the data values before and after the missing point is used as the missing point data;

b) For a small amount of missing data, interpolation is used, with the specific interpolation formula given as formula (1):

c) For a large amount of missing data, for example during wind turbine maintenance or planned curtailment, the data for that period should be discarded as invalid.
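
The three cases above can be sketched as a single gap-filling routine. Formula (1) is not reproduced in this text, so the sketch assumes plain linear interpolation between the known neighbours, $P_{n+i} = P_n + i\,(P_{n+j} - P_n)/j$ with $0 < i < j$; the function name and the `max_small_gap` cutoff between a "small" and a "large" gap are likewise hypothetical.

```python
from typing import List, Optional


def fill_missing(series: List[Optional[float]], max_small_gap: int = 5) -> List[Optional[float]]:
    """Fill short gaps by linear interpolation; leave long gaps as None to be discarded."""
    filled = list(series)
    n = len(filled)
    k = 0
    while k < n:
        if filled[k] is not None:
            k += 1
            continue
        start = k                      # first missing index of this gap
        while k < n and filled[k] is None:
            k += 1
        end = k                        # first valid index after the gap (or n)
        gap = end - start
        if start == 0 or end == n or gap > max_small_gap:
            continue                   # no neighbour on one side, or gap too large
        left, right = filled[start - 1], filled[end]
        j = gap + 1                    # distance between the two known samples
        for i in range(1, j):
            filled[start - 1 + i] = left + i * (right - left) / j
    return filled
```

For a single missing point (j = 2, i = 1) the interpolation reduces to the average of the two neighbours, so case a) is covered by the same code path.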

(3) Data denoising process

Because of environmental and other factors, the acquired index data contain considerable noise and must be denoised.

In this embodiment, denoising uses the wavelet threshold method: after wavelet decomposition of each index, the wavelet coefficients of the original signal are larger than those of the noise, so a suitable threshold is chosen to separate signal from noise and the signal is then reconstructed. The specific steps are as follows:

a) A Haar wavelet basis function is selected to perform a 3-level decomposition of the acquired index values;

b) Each detail coefficient is thresholded according to formula (2) to separate the noise from the normal data;

c) The denoised data are reconstructed to obtain the processed data.
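
Steps a) to c) can be illustrated with a small pure-Python Haar transform. Formula (2) is not reproduced in this text; because its legend mentions a sign function, the sketch assumes the common soft-threshold rule $\omega_\lambda = \mathrm{sgn}(\omega)(|\omega| - \lambda)$ for $|\omega| \ge \lambda$ and 0 otherwise. The function names and the requirement that the signal length be a multiple of $2^{levels}$ are simplifications of the example.

```python
import math
from typing import List, Tuple


def haar_step(signal: List[float]) -> Tuple[List[float], List[float]]:
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    r = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / r for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / r for i in range(0, len(signal), 2)]
    return approx, detail


def inverse_haar_step(approx: List[float], detail: List[float]) -> List[float]:
    """Invert haar_step by interleaving the reconstructed sample pairs."""
    r = math.sqrt(2.0)
    out: List[float] = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / r, (a - d) / r])
    return out


def soft_threshold(w: float, lam: float) -> float:
    """sgn(w) * (|w| - lam) if |w| >= lam, else 0 (assumed form of formula (2))."""
    return math.copysign(abs(w) - lam, w) if abs(w) >= lam else 0.0


def denoise(signal: List[float], lam: float, levels: int = 3) -> List[float]:
    """Decompose, threshold every detail coefficient, then reconstruct."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append([soft_threshold(w, lam) for w in detail])
    for detail in reversed(details):
        approx = inverse_haar_step(approx, detail)
    return approx
```

With a threshold of zero the signal is reconstructed unchanged; with a very large threshold all detail is removed and an 8-sample signal collapses to its mean.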

(4) Data normalization

Indexes such as wind speed, air pressure and humidity are processed according to their maximum and minimum values by formula (3):

$$value = \frac{value_t - value_{min}}{value_{max} - value_{min}} \tag{3}$$

wherein value is the normalized result, in the range 0 to 1, and $value_{max}$, $value_{min}$ and $value_t$ represent the maximum, minimum and current values within the index data set, respectively.

The temperature is normalized according to formula (4):

wherein T is the normalized temperature value and $T_t$ is the current temperature value.

The wind direction ranges from 0 to 360 degrees and can be normalized to between 0 and 1 by combining its sine and cosine values.
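
A minimal sketch of the min-max and wind-direction normalizations follows. The description does not say exactly how the sine and cosine values are combined, so mapping each of them from [-1, 1] to [0, 1] and keeping both as features is an assumption of the example, as are the function names. The temperature formula (4) is not reproduced in this text and is therefore not sketched.

```python
import math
from typing import Tuple


def min_max(value_t: float, value_min: float, value_max: float) -> float:
    """Min-max normalization of one value to the range 0..1."""
    if value_max == value_min:
        raise ValueError("value_max and value_min must differ")
    return (value_t - value_min) / (value_max - value_min)


def normalize_wind_direction(degrees: float) -> Tuple[float, float]:
    """Encode a direction as (sin, cos), each rescaled from [-1, 1] to [0, 1]."""
    rad = math.radians(degrees)
    return (math.sin(rad) + 1.0) / 2.0, (math.cos(rad) + 1.0) / 2.0
```

Using both components keeps 0 degrees and 360 degrees at the same point, which a plain division by 360 would not.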

Step S3, establishing feature vectors from the preprocessed data and dividing them into training samples and test samples;
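
As a rough illustration of this step, the sketch below builds (features, target) pairs from preprocessed rows and splits them in time order. The row layout (V, D, P, T, H, power), the function names and the 80/20 split ratio are assumptions; the description does not specify them.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], float]  # (feature vector, wind power target)


def build_samples(rows: Sequence[Sequence[float]]) -> List[Sample]:
    """Each row is assumed to be (V, D, P, T, H, power), already preprocessed."""
    return [(list(row[:5]), row[5]) for row in rows]


def split_samples(samples: List[Sample], train_ratio: float = 0.8) -> Tuple[List[Sample], List[Sample]]:
    """Chronological split: the earliest samples train the model, the latest test it."""
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]
```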

Step S4, training the KELM model with the training samples, and optimizing the regularization coefficient and kernel parameter of the KELM model by using the PCFOA algorithm to obtain an optimized KELM model;

The fruit fly optimization algorithm (FOA) is a search-based global optimization algorithm derived from the foraging behavior of fruit flies. FOA has few parameters to tune, is simple to operate, and is easy to apply in practice.

Building on FOA, this embodiment designs a parallel chaotic fruit fly optimization algorithm, PCFOA, on the Spark platform, which parallelizes the search of individual fruit flies for the position of optimal taste concentration: a fruit fly population of size N is divided into N independent parallel units, each fruit fly individual calculates its taste concentration value according to a fitness function and searches locally from its current position for an updated position, and these steps are computed in parallel by the Spark framework.

The kernel extreme learning machine (KELM) is derived from extreme learning machine (ELM) theory. ELM is a training method for single-hidden-layer feedforward neural networks in which the output weights of the network are obtained analytically in a single step, so learning is fast. The regression function and the connection weights between the hidden layer and the output layer are given by formula (14):

wherein x is the sample input, f(x) is the network output, h(x) and H are the randomly mapped hidden-layer feature mapping and feature mapping matrix, β is the connection weight between the hidden layer and the output layer obtained from generalized inverse matrix theory, I is a diagonal matrix, C is the penalty coefficient, and T is the vector of sample target values.
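
Formula (14) itself is not reproduced in this text. For orientation, the regularized ELM solution commonly given in the literature, written with the symbols defined above and with h(x) denoting the hidden-layer feature mapping of the input x, is shown below; it is offered as the standard form, on the assumption that the patent's formula matches it.

```latex
f(x) = h(x)\,\beta = h(x)\,H^{T}\left(\frac{I}{C} + HH^{T}\right)^{-1} T
```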

Compared with ELM, KELM is better at regression prediction problems, generalizes better, and predicts faster at the same prediction accuracy. KELM can compute the value of the output function once the form of the kernel function is known, without specially setting the initial weights and biases of the hidden layer; the output can be expressed as formula (15):

wherein C is the regularization coefficient, which balances the output weights against the training error; $\Omega_{ELM} = HH^T$ is the kernel matrix, through which all input samples are mapped from the input space to the high-dimensional hidden-layer feature space; $K(x_i, x_j)$ is the kernel function, which may be a Gaussian kernel function, a polynomial kernel function, a linear kernel function, or the like; I is the identity matrix; and T is the desired output.
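
Formula (15) is not reproduced in this text. The sketch below assumes the standard KELM output from the literature, $f(x) = [K(x, x_1), \dots, K(x, x_N)]\,(I/C + \Omega_{ELM})^{-1} T$, with a Gaussian kernel. The class name `KELM`, the helper `solve`, and the kernel parameterization $\exp(-\lVert a - b\rVert^2 / \lambda)$ are assumptions of the example; it uses only the standard library so that it runs as is.

```python
import math
from typing import List, Sequence


def gaussian_kernel(a: Sequence[float], b: Sequence[float], lam: float) -> float:
    """K(a, b) = exp(-||a - b||^2 / lam); this parameterization is an assumption."""
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / lam)


def solve(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


class KELM:
    """Kernel ELM regressor with regularization coefficient C and kernel parameter lam."""

    def __init__(self, C: float, lam: float) -> None:
        self.C, self.lam = C, lam

    def fit(self, X: Sequence[Sequence[float]], T: Sequence[float]) -> "KELM":
        n = len(X)
        # Solve (I / C + Omega) alpha = T, where Omega[i][j] = K(x_i, x_j).
        A = [[gaussian_kernel(X[i], X[j], self.lam) + (1.0 / self.C if i == j else 0.0)
              for j in range(n)] for i in range(n)]
        self.X = [list(x) for x in X]
        self.alpha = solve(A, list(T))
        return self

    def predict(self, x: Sequence[float]) -> float:
        """f(x) = sum_i alpha_i * K(x, x_i)."""
        return sum(a * gaussian_kernel(x, xi, self.lam) for a, xi in zip(self.alpha, self.X))
```

The two values passed to the constructor, C and lam, are exactly the parameters that a search procedure such as PCFOA would tune.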

The regularization coefficient C and kernel parameter λ of the KELM are optimized by using PCFOA to obtain an optimized KELM model, with the following specific steps:

S301, initializing the fruit fly population size N, the maximum number of iterations max, the search step length l, the initial position $(x_0, y_0)$ and the chaotic parameter μ;

S302, dividing the fruit fly population into N independent parallel computing units;

S303, starting iterative optimization, wherein for each parallel computing unit, in the first iteration, the random distance over which the fruit fly individual searches for food by smell is calculated as in formula (5):

After the first iteration is completed, subsequent iterations use the optimal solution information to obtain new chaotic step lengths $l_x$ and $l_y$ through two-dimensional Logistic chaotic mapping, calculated as in formulas (6) and (7):

wherein $\mu \in (0, 2.28)$ and $x_n, y_n \in (0, 1)$;

S304, since the position of the food is not known initially, first calculating the distance $d_i$ from the current position $(x_i, y_i)$ of the fruit fly individual to the origin, and taking its reciprocal as the taste concentration determination value $s_i$ of the fruit fly individual, as in formulas (8) and (9):

$$d_i = \sqrt{x_i^2 + y_i^2} \tag{8}$$

$$s_i = 1/d_i \tag{9}$$

S305, substituting the taste concentration determination value $s_i$ into the taste concentration determination function to calculate the taste concentration value $s_{current}$ of the fruit fly individual at its current position:

$s_{current} = \mathrm{Function}(s_i)$ (10)

S306, finding the fruit fly with the optimal taste concentration value among all fruit fly individuals, namely the individual with the highest or the lowest concentration, and recording its taste concentration value and the corresponding position:

$[s, index] = \min(s_i) \;\|\; \max(s_i)$ (11)

S307, retaining the optimal taste concentration value s and its coordinates $(x_{index}, y_{index})$, and flying by vision to the optimal position recorded in step S306 to update the position, forming a new fruit fly population center:

$s_{best} = s$ (13)

S308, continuing the iterative optimization by repeating steps S302 to S306 and judging whether the optimal taste concentration value is better than the historical optimal taste concentration value; if it is better and the current number of iterations is smaller than max, executing step S307.
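
The loop structure of steps S301 to S308 can be sketched on a single machine as follows. Several parts are assumptions rather than the patented method: formulas (5) to (7) and (12) are not reproduced in this text, so the first-iteration random step is drawn uniformly from [-l, l], a one-dimensional logistic map applied to each coordinate stands in for the two-dimensional Logistic map, and the population center simply moves to the best fly. The fitness function is a placeholder for the KELM prediction error, minimization is chosen as the optimization direction, and the names `logistic_map` and `chaotic_foa` are hypothetical.

```python
import math
import random
from typing import Callable, Tuple


def logistic_map(z: float, mu: float = 4.0) -> float:
    """1-D logistic map; a stand-in for the 2-D map of formulas (6) and (7)."""
    return mu * z * (1.0 - z)


def chaotic_foa(fitness: Callable[[float], float], n_flies: int = 20, max_iter: int = 50,
                step: float = 1.0, x0: float = 1.0, y0: float = 1.0,
                seed: int = 0) -> Tuple[float, float]:
    """Minimize fitness(s) over s = 1/d; returns (best_s, best_fitness)."""
    rng = random.Random(seed)
    cx, cy = rng.uniform(0.1, 0.9), rng.uniform(0.1, 0.9)  # chaotic state
    best_s, best_val = 0.0, math.inf
    for iteration in range(max_iter):
        flies = []
        for _ in range(n_flies):
            if iteration == 0:
                # First iteration: plain random search step around the start position.
                dx, dy = rng.uniform(-step, step), rng.uniform(-step, step)
            else:
                # Later iterations: chaotic step around the best position so far.
                cx, cy = logistic_map(cx), logistic_map(cy)
                dx, dy = step * (2.0 * cx - 1.0), step * (2.0 * cy - 1.0)
            flies.append((x0 + dx, y0 + dy))
        # In PCFOA this per-fly evaluation is the part distributed over parallel units.
        scored = []
        for x, y in flies:
            d = max(math.hypot(x, y), 1e-12)   # distance to the origin, formula (8)
            s = 1.0 / d                        # determination value, formula (9)
            scored.append((fitness(s), s, x, y))
        val, s, x, y = min(scored)
        if val < best_val:                     # keep the best and move the center there
            best_val, best_s, x0, y0 = val, s, x, y
    return best_s, best_val
```

On a Spark cluster, the inner evaluation loop is the natural candidate for an RDD `map` over the N flies, with the comparison against the historical best done on the driver.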

Step S5, testing the optimized KELM model with the test samples to evaluate its prediction effect.

Next, the wind power of a single wind turbine is predicted by using the trained KELM model, specifically:

Step S6, collecting meteorological index data affecting the wind power of a single wind turbine in the wind farm;

The meteorological index data include wind speed V, wind direction D, air pressure P, temperature T and humidity H.

Step S7, preprocessing the meteorological index data by using the Spark platform;

The preprocessing is the same as in step S2.

Step S8, inputting the preprocessed meteorological index data into the pre-trained KELM model corresponding to the single wind turbine, the KELM model outputting the wind power predicted value of that wind turbine;

Step S9, adding the wind power predicted values of all wind turbines in the wind farm to obtain the total power predicted value of the wind farm.
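
The per-turbine prediction and summation of the last two steps amounts to the short sketch below. The `Model` type, the dictionary layout keyed by turbine id, and the function name are assumptions of the example; any trained per-turbine predictor with the same call shape could be plugged in.

```python
from typing import Callable, Dict, Sequence

# A trained per-turbine model: preprocessed feature vector -> predicted power.
Model = Callable[[Sequence[float]], float]


def predict_farm_power(models: Dict[str, Model], features: Dict[str, Sequence[float]]) -> float:
    """Sum the predictions of every turbine's own model to get the farm total."""
    return sum(models[turbine_id](features[turbine_id]) for turbine_id in models)
```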

Through this embodiment, the Spark-based PCFOA-KELM wind power prediction method preprocesses sample data efficiently on the Spark big data platform and uses the purpose-designed parallel fruit fly optimization algorithm PCFOA, which speeds up wind power modeling and prediction; the kernel extreme learning machine KELM is used to model and predict the wind power, and PCFOA is used to optimize the KELM parameters, which improves wind power prediction precision.

The invention makes full use of the parallel data analysis and processing capability of the Spark big data platform: the acquired data are preprocessed efficiently on the Spark platform, and a parallel parameter optimization method, PCFOA, is designed, which speeds up both data preprocessing and optimization modeling. KELM generalizes better than ELM, SVM and other machine learning algorithms and computes faster at better or similar prediction accuracy, and optimizing its parameters with PCFOA improves wind power prediction accuracy. As a result, wind power is predicted efficiently and accurately on the Spark big data platform.

In another embodiment, a Spark-based PCFOA-KELM wind power prediction apparatus includes:

It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.

Claims

1. A Spark-based PCFOA-KELM wind power prediction method is characterized by comprising the following steps:

preprocessing the meteorological index data by using a Spark platform;

the KELM model is obtained, based on a Spark platform, by training with meteorological indexes affecting the wind power of a single wind turbine in a wind farm and the corresponding wind power historical data as training samples, and by performing parameter optimization on the KELM model using a PCFOA algorithm;

the training method of the KELM model comprises the following steps:

preprocessing the acquired historical data by using a Spark platform;

testing the prediction effect of the optimized KELM model by adopting a test sample;

optimizing the regularization coefficient and kernel parameter of the KELM model by using the PCFOA algorithm comprises the following steps:

dividing a fruit fly population of size N into N independent parallel units through the RDD of the Spark platform, wherein each fruit fly individual calculates its taste concentration value according to a fitness function and searches locally from its current position for an updated position, and the calculation of the taste concentration values and the local position searches are executed in parallel;

wherein $\mu \in (0, 2.28)$ and $x_n, y_n \in (0, 1)$;

S4, calculating the distance $d_i$ from the current position of the fruit fly individual to the origin, and taking its reciprocal as the taste concentration determination value $s_i$ of the fruit fly individual, calculated as follows:

$s_i = 1/d_i$

$s_{best} = s$

2. The Spark-based PCFOA-KELM wind power prediction method according to claim 1, wherein the preprocessing the historical data by using a Spark platform comprises:

3. The Spark-based PCFOA-KELM wind power prediction method according to claim 1, wherein the preprocessing comprises: data outlier processing, data missing value processing, data noise reduction and normalization processing.

4. The Spark-based PCFOA-KELM wind power prediction method according to claim 3, characterized in that the data outlier processing comprises:

5. The Spark-based PCFOA-KELM wind power prediction method according to claim 3, characterized in that the data noise reduction comprises:

and reconstructing the data after noise removal.

6. The Spark-based PCFOA-KELM wind power prediction method according to claim 3, wherein the meteorological indexes include wind speed, wind direction, air pressure, temperature and humidity, and the normalization process comprises:

the temperature is processed according to the following formula:

7. A Spark-based PCFOA-KELM wind power prediction device, characterized by comprising:

the training method of the KELM model comprises the following steps:

preprocessing the acquired historical data by using a Spark platform;

wherein $\mu \in (0, 2.28)$ and $x_n, y_n \in (0, 1)$;

$s_i = 1/d_i$

$s_{best} = s$