CN105205563A - Short-term load predication platform based on large data - Google Patents

Short-term load predication platform based on large data


Publication number
CN105205563A
Authority
CN
China
Prior art keywords
data
load
module
prediction
platform
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510628197.8A
Other languages
Chinese (zh)
Other versions
CN105205563B (en)
Inventor
高军
侯广松
李喜同
王健
韩岩
甄颖
邓帅
荆树志
马松
吴倩红
韩蓓
李国杰
王启龙
尹中发
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Heze Power Supply Co of State Grid Shandong Electric Power Co Ltd
Original Assignee
Shanghai Jiaotong University
Heze Power Supply Co of State Grid Shandong Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University and Heze Power Supply Co of State Grid Shandong Electric Power Co Ltd
Priority to CN201510628197.8A
Publication of CN105205563A
Application granted
Publication of CN105205563B
Legal status: Active


Abstract

The invention discloses a short-term load forecasting platform based on big data. The platform is built on a Hadoop cluster, on which parallel load prediction by locally weighted linear regression is implemented with MapReduce. The platform comprises a data integration module, a load prediction module, a result visualization module and a user management module. The data integration module covers the complete data processing flow, from the acquisition and loading of load-related data, through data storage, to the final integration and fusion of multi-factor big data. The load prediction module implements the locally weighted linear regression algorithm in MapReduce to predict load in parallel, and achieves self-learning, adaptive load prediction through parameter adjustment. The result visualization module is the presentation layer for the platform's prediction results and displays real-time analysis dynamically. The user management module is the security mechanism layer of the platform and ensures safe, reliable and efficient operation of the load prediction platform.

Description

A short-term load forecasting platform based on big data
Technical field
The present invention relates to a short-term load forecasting platform based on big data that is built on a Hadoop cluster.
Background
Load forecasting is one of the important tasks of power system departments such as dispatching, power utilization, scheduling and planning. Accurate load forecasts help schedule the start-up and shutdown of generating units within the grid economically and rationally, maintain the security and stability of grid operation, and reduce unnecessary spinning reserve capacity. They also support power utilization management and the rational arrangement of system operating modes and unit maintenance schedules, thereby safeguarding normal production and daily life.
In recent years, with the rapid development of science and technology, academia and the economy, big data technology has become a global research focus, and the rapid development of sensing and communication technology and of the smart grid has produced electric power big data. Traditional short-term power load forecasting mainly relies on methods based on artificial neural networks (ANN), methods based on fuzzy logic inference, methods that apply nonlinear systems and chaos theory, and analytical methods such as combined forecasting. Their prediction speed and accuracy cannot meet the requirements of a big data environment. A complete load forecasting platform based on big data analysis is therefore needed, one that provides an integrated, parallel and adaptive load prediction flow covering data loading, data processing, load prediction, feedback control and visualization.
Summary of the invention
The object of the invention is to provide a scalable load forecasting platform that can process multiple kinds of massive electricity consumption data at high speed and that provides an integrated, parallel and adaptive load prediction flow covering data loading, data processing, load prediction, parameter control and visualization. It is intended to solve three problems of current load forecasting systems: limited data processing capacity, low prediction speed, and support for only a single data structure.
The technical solution of the invention is as follows:
A short-term load forecasting platform based on big data, built on a Hadoop cluster, characterized in that it comprises a data integration module, a load prediction module and a result visualization module.
The data integration module collects, stores and processes multi-source heterogeneous data related to load; it fuses these data and corrects, fills and normalizes the historical data.
The load prediction module, based on load periodicity analysis, similarity analysis and correlation analysis of the factors that affect load, implements the locally weighted linear regression algorithm in MapReduce to predict load. It computes the relative error and the daily load accuracy rate and uses them to adjust the algorithm parameters, thereby realizing parameter regulation. This yields the locally weighted linear regression algorithm with the final optimal parameters, which is then used to predict load and produce the prediction results.
The result visualization module displays the load prediction results.
(1) Data integration module
(1) Data loading module: loads the collected multi-source heterogeneous data into the distributed database HBase. The data are of three types: structured data (historical load data, weather data, tiered pricing, time-of-use pricing, pricing by load type, day-of-week attribute and holiday information, traffic data), semi-structured data (electricity pricing policies, economic and population data tables) and unstructured data (GIS data).
(2) Data storage module: massive numbers of small semi-structured and unstructured files, and complex, changeable structured data, are stored using the key-value storage of HBase; larger single files of any of the three data types are stored directly in the HDFS file system.
(3) Multistage combined index module: quickly finds the required data according to the user's search criteria. The first-level index uses a multidimensional R-tree as its basic structure. Following a non-traditional geographical classification rule, the cleaned data objects are divided into multiple subspaces, each corresponding to one node of the R-tree. A non-leaf node stores the minimum bounding rectangle (MBR) of all subtrees of the node, and a leaf node stores the MBR of each spatial object. The second-level and lower indexes use object clustering based on proximity density (LCF), so that the clustered objects are partitioned according to the closeness of their relative densities.
(4) Data processing module: performs data preprocessing. All multi-source heterogeneous data are fused using multiple kernel learning. Missing values in the historical load data are filled by interpolation. Abnormal values in the historical load data are corrected using the horizontal and vertical comparison method. Finally, all processed data are normalized.
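The filling and normalization steps of the data processing module can be illustrated with a minimal Python sketch. The text specifies only "interpolation" and "normalization", so linear interpolation, min-max scaling, and the function names `fill_missing` and `min_max_normalize` are assumptions made for illustration; the fusion and abnormal-value correction steps are not covered here.

```python
from typing import List, Optional


def fill_missing(series: List[Optional[float]]) -> List[float]:
    """Fill None gaps by linear interpolation (assumed interpolation scheme).

    Gaps at either end copy the nearest known value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            t = (i - left) / (right - left)
            filled[i] = series[left] + t * (series[right] - series[left])
    return filled


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale values to [0, 1] (assumed normalization scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


loads = fill_missing([10.0, None, 30.0, None])
scaled = min_max_normalize(loads)
```

Filling before normalizing keeps the scaling range from being computed on an incomplete series.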
(2) Load prediction module
Based on load periodicity analysis, similarity analysis and correlation analysis of the factors that affect load, the locally weighted linear regression algorithm is implemented in MapReduce to predict load. The relative error and the daily load accuracy rate are computed to evaluate the prediction method, and the algorithm parameters are adjusted according to the evaluation result to realize parameter regulation. The specific implementation steps are:
Step 1, select the features of the experimental data: based on load periodicity analysis, similarity analysis and correlation analysis of the factors that affect load, historical load data, historical weather data, day-of-week attribute information and data for the prediction day are chosen as experimental data. The features that affect load are the moment of the day, the daily mean temperature and the day-of-week attribute.
Step 2, construct the training set and test set of experimental samples: a multivariate locally weighted linear regression model is applied to predict load. Obtaining the relation between the load $l$ and the influencing factors $x$ requires a training set and a test set, which are constructed from the load-affecting features chosen in step 1, in the form:
[moment $x_1$, temperature $x_2$, day-of-week attribute $x_3$]
Moment $x_1$: load data are sampled every 30 minutes, giving 48 moments per day.
Temperature $x_2$: daily mean temperature.
Day-of-week attribute $x_3$: the numbers 1 to 7 represent Monday to Sunday.
Prediction mode: day-ahead load forecasting. The training samples are the load data at 30-minute intervals for every day in the two years preceding the day to be predicted. The goal is to predict the load at every 30-minute moment of the day to be predicted, so the test samples are the data for those 48 moments.
Final training set: $(x_{i1}, x_{i2}, x_{i3}, l_i)$, $i = 1, 2, \ldots, n$; the variables $x_{i1}, x_{i2}, x_{i3}, l_i$ are observed $n$ times, giving $n$ training samples.
Final test set: $(x_{j1}, x_{j2}, x_{j3})$, $j = 1, 2, \ldots, 48$; 48 load points to be predicted in total.
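The sample layout of step 2 can be sketched in Python as follows. The encoding of the moment as a slot index 1 to 48 is an assumption, since the text does not show how $x_1$ is encoded; the function names and the dictionary inputs are likewise hypothetical.

```python
from datetime import date
from typing import Dict, List, Tuple

SLOTS_PER_DAY = 48  # one sample every 30 minutes


def features_for_day(day: date, mean_temp: float) -> List[Tuple[int, float, int]]:
    """Return the 48 feature tuples (x1, x2, x3) for one day.

    x1: slot index 1..48 (assumed encoding of the moment)
    x2: daily mean temperature
    x3: day of week, 1 = Monday ... 7 = Sunday
    """
    weekday = day.isoweekday()
    return [(slot, mean_temp, weekday) for slot in range(1, SLOTS_PER_DAY + 1)]


def build_training_set(
    loads: Dict[date, List[float]], temps: Dict[date, float]
) -> List[Tuple[int, float, int, float]]:
    """Join features with the observed loads to form rows (x1, x2, x3, l)."""
    rows = []
    for day in sorted(loads):
        for (x1, x2, x3), load in zip(features_for_day(day, temps[day]), loads[day]):
            rows.append((x1, x2, x3, load))
    return rows


# Test set for the day to be predicted: features only, no load column.
test_set = features_for_day(date(2015, 9, 28), 21.5)
```

The test set uses the same feature layout as the training set but omits the load, which is the quantity to be predicted.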
Step 3, load the experimental data: the Hadoop cluster splits the training set and test set files into several independent data blocks and stores them in the distributed file system HDFS.
An input stream object is created and an HDFS instance is obtained; the training set text in HDFS is read into the stream line by line, type-converted and recorded as traindata[i] = $(x_{i1}, x_{i2}, x_{i3})$, L[i] = $l_i$, $i = 1, 2, \ldots, n$, where $n$ is the number of training samples. The test set text is input in the Map process.
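The line-by-line reading and type conversion of step 3 might look like the sketch below. The comma-separated line format `x1,x2,x3,l` is an assumption, and an in-memory `io.StringIO` stands in for the stream that would be opened on HDFS.

```python
import io
from typing import List, TextIO, Tuple


def read_training_set(stream: TextIO) -> Tuple[List[Tuple[float, float, float]], List[float]]:
    """Parse lines of the assumed form 'x1,x2,x3,l' into traindata and L."""
    traindata: List[Tuple[float, float, float]] = []
    L: List[float] = []
    for line in stream:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        x1, x2, x3, load = (float(field) for field in line.split(","))
        traindata.append((x1, x2, x3))
        L.append(load)
    return traindata, L


# Stand-in for a stream read from HDFS.
sample = io.StringIO("1,20.5,3,812.0\n\n2,20.5,3,798.5\n")
traindata, L = read_training_set(sample)
```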
Step 4, implement the Map process: compute the distances between all points to be predicted and all sample points, as in the KNN algorithm. The input is the test set text, recorded as testdata[j] = $(x_{j1}, x_{j2}, x_{j3})$, $j = 1, 2, \ldots, m$. The specific implementation steps are:
4A. Define the output values of the Map function and their types;
4B. Type-convert the test set text;
4C. Compute the distance distance[j][i] between the j-th test sample and the i-th training sample:
$$\mathrm{distance}[j][i] = \lVert \mathrm{testdata}[j] - \mathrm{traindata}[i] \rVert_2^2$$
where $i = 1, 2, \ldots, n$, $n$ being the number of training samples, and $j = 1, 2, \ldots, 48$, 48 being the number of test samples;
4D. Define the key-value pair <key, value> output by the Map function: key = testdata[j], value = distance[j][i].
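A plain-Python sketch of the Map step follows; it is not Hadoop code, and the function names are hypothetical. The text defines the value as the distance alone. The sketch additionally carries the training-sample index in the value so that a later step can look up the matching sample; that extension is an assumption.

```python
from typing import Iterator, Sequence, Tuple

Point = Tuple[float, float, float]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance, as in step 4C."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def map_distances(
    test_point: Point, traindata: Sequence[Point]
) -> Iterator[Tuple[Point, Tuple[int, float]]]:
    """Emit one <key, value> pair per training sample.

    key   = the test sample
    value = (training index, squared distance); the index is an added assumption
    """
    for i, train_point in enumerate(traindata):
        yield test_point, (i, squared_distance(test_point, train_point))


pairs = list(map_distances((1.0, 20.0, 1.0), [(1.0, 20.0, 1.0), (2.0, 22.0, 1.0)]))
```

Using the test sample as the key means that all distances for one point to be predicted arrive at the same reducer.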
Step 5, implement the Reduce process: the input is the key-value pairs <key, value> output by the Map function, and the output is the load prediction results. The specific implementation steps are:
5A. Define the value of K: K = constant, and set j = 1;
5B. Read the test sample testdata[j] = $x_j = (x_{j1}, x_{j2}, x_{j3})$ and distance[j][i], $i = 1, 2, \ldots, n$;
5C. Type-convert the testdata[j] and distance[j][i] key-value pairs that were read;
5D. Sort distance[j][i] in ascending order and select the K smallest distances, recorded as d[j][1], d[j][2], ..., d[j][K];
5E. Record the training samples corresponding to the K smallest distances as traindata_k[s] = $(x_{s1}, x_{s2}, x_{s3})$, L_k[s] = $l_s$, $s = 1, 2, \ldots, K$;
5F. Compute the weights of the K points selected in 5D:
$$\omega_s(\mathrm{testdata}[j]) = \frac{1}{d[j][s]}, \quad j = 1, 2, \ldots, 48, \quad s = 1, 2, \ldots, K$$
5G. Determine the feature matrix $X$, the dependent-variable matrix $L\_k$ and the weight matrix $W(x_j)$:
$$X = \begin{bmatrix} 1 & x_{11} & x_{12} & x_{13} \\ 1 & x_{21} & x_{22} & x_{23} \\ \vdots & \vdots & \vdots & \vdots \\ 1 & x_{K1} & x_{K2} & x_{K3} \end{bmatrix}, \quad L\_k = \begin{bmatrix} l_1 \\ l_2 \\ \vdots \\ l_K \end{bmatrix}$$
$$W(x_j) = \mathrm{diag}[\omega_1(x_j), \omega_2(x_j), \ldots, \omega_K(x_j)]$$
5H. Use weighted least squares to obtain the estimated curve parameters at the independent variable $x_j$:
$$\hat{\alpha}(x_j) = [\hat{\alpha}_0(x_j), \hat{\alpha}_1(x_j), \hat{\alpha}_2(x_j), \hat{\alpha}_3(x_j)]' = [X' W(x_j) X]^{-1} X' W(x_j) L\_k$$
5I. Construct the locally weighted linear regression equation and predict the load at the independent variable testdata[j] = $x_j = (x_{j1}, x_{j2}, x_{j3})$:
$$\hat{L}_j = x_j' \hat{\alpha}(x_j) = \hat{\alpha}_0(x_j) + \hat{\alpha}_1(x_j) x_{j1} + \hat{\alpha}_2(x_j) x_{j2} + \hat{\alpha}_3(x_j) x_{j3}$$
5J. Compute the relative error:
$$E_j = \left( \frac{\hat{L}_j - L_j}{L_j} \right) \times 100\%$$
where $\hat{L}_j$ is the predicted load and $L_j$ is the actual load;
5K. Set j = j + 1; if j ≤ 48, go to step 5B, otherwise go to step 6.
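Steps 5D to 5I for a single test point can be sketched in pure Python as below. The small `eps` added to each distance (to avoid division by zero when a training sample coincides with the test point), the Gaussian-elimination solver and all function names are assumptions for illustration; a production job would more likely use a linear algebra library.

```python
from typing import List, Sequence, Tuple

Neighbour = Tuple[float, Tuple[float, ...], float]  # (distance, features, load)


def k_nearest(x_j, traindata, L, k) -> List[Neighbour]:
    """Steps 5D/5E: keep the K training samples with the smallest squared distance."""
    scored = sorted(
        (sum((p - q) ** 2 for p, q in zip(x_j, t)), t, load)
        for t, load in zip(traindata, L)
    )
    return scored[:k]


def solve(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("X'WX is singular for the selected neighbours")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def lwlr_predict(x_j: Sequence[float], neighbours: List[Neighbour], eps: float = 1e-9) -> float:
    """Steps 5F to 5I: weights 1/d, weighted least squares, prediction at x_j."""
    rows = [[1.0, *feat] for _, feat, _ in neighbours]   # rows of X
    w = [1.0 / (d + eps) for d, _, _ in neighbours]      # diagonal of W(x_j)
    loads = [load for _, _, load in neighbours]          # L_k
    p, K = len(rows[0]), len(rows)
    XtWX = [[sum(w[s] * rows[s][a] * rows[s][b] for s in range(K)) for b in range(p)]
            for a in range(p)]
    XtWL = [sum(w[s] * rows[s][a] * loads[s] for s in range(K)) for a in range(p)]
    alpha = solve(XtWX, XtWL)
    return alpha[0] + sum(a * x for a, x in zip(alpha[1:], x_j))


# Toy data generated from an exactly linear load, l = 2 + 3*x1 + 0.5*x2 - x3.
traindata = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
             (0.0, 0.0, 1.0), (1.0, 1.0, 1.0), (2.0, 1.0, 0.0)]
L = [2 + 3 * a + 0.5 * b - c for a, b, c in traindata]
x_j = (0.5, 0.5, 0.5)
prediction = lwlr_predict(x_j, k_nearest(x_j, traindata, L, 5))
```

If the K neighbours share the same value for a feature, for example the same daily mean temperature, the matrix $X' W X$ is singular and this sketch raises `ValueError` rather than returning a prediction.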
Step 6, parameter regulation:
6A. Compute the daily load accuracy rate:
$$A = \left( 1 - \frac{1}{48} \sum_{j=1}^{48} E_j^2 \right) \times 100\%$$
6B. Update K: let K = constant + ΔK, where ΔK is a user-defined step size. If the updated K is still within the limit set by $n$, the number of training samples, perform steps 5B to 5K of step 5 again; otherwise go to step 6C;
6C. Select the K value that gives the highest daily load prediction accuracy rate A as the KNN parameter of the final locally weighted linear regression algorithm, and obtain the final locally weighted regression curve equation, which is used for load prediction.
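The parameter regulation loop of step 6 can be sketched as follows. The accuracy function follows the formula of step 6A as it is written in this text, with the relative error expressed as a fraction. The predictor passed to `select_k` is a hypothetical callable that returns one day of predictions for a given K, and the candidate range is an assumption.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def relative_errors(predicted: Sequence[float], actual: Sequence[float]) -> List[float]:
    """Step 5J, as fractions: (predicted - actual) / actual."""
    return [(p - a) / a for p, a in zip(predicted, actual)]


def daily_accuracy(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Step 6A as written above: (1 - mean of squared relative errors) * 100."""
    errors = relative_errors(predicted, actual)
    return (1.0 - sum(e * e for e in errors) / len(errors)) * 100.0


def select_k(
    candidates: Iterable[int],
    predict_day: Callable[[int], Sequence[float]],
    actual: Sequence[float],
) -> Tuple[int, float]:
    """Steps 6B/6C: try each K and keep the one with the highest accuracy."""
    best_k, best_acc = None, float("-inf")
    for k in candidates:
        acc = daily_accuracy(predict_day(k), actual)
        if acc > best_acc:
            best_k, best_acc = k, acc
    return best_k, best_acc


# Toy stand-in predictor whose error is smallest at K = 3 (hypothetical).
actual = [100.0, 200.0]
toy_predict = lambda k: [a * (1 + 0.01 * abs(k - 3)) for a in actual]
best_k, best_acc = select_k(range(1, 6), toy_predict, actual)
```

In the platform, `predict_day` would rerun steps 5B to 5K for all 48 moments with the given K.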
(3) Result visualization module
The presentation layer for the platform's prediction results; it displays the load prediction results.
(4) User management module
Provides password management, user group management and user management functions.
Compared with the prior art, the beneficial effects of the invention are:
1. Traditional load forecasting systems handle relatively small data volumes and cannot cope when the data volume increases sharply, whereas this platform can process massive electricity consumption data.
2. Most traditional load forecasting systems use relational databases and can handle only structured data, whereas this platform can store and process data of various structural types that affect load variation, providing a more complete data basis for load prediction.
3. Traditional load forecasting systems predict slowly, whereas this platform predicts load in parallel, which raises prediction speed and facilitates real-time load prediction.
4. The platform is scalable: because it is built on a Hadoop cluster, nodes can be added conveniently to increase storage and computing capacity.
Description of the drawings
Fig. 1 is a diagram of the short-term load forecasting platform based on big data, built on a Hadoop cluster.
Fig. 2 is a flowchart of short-term load forecasting by locally weighted linear regression.
Fig. 3 is a flowchart of the Map function of the MapReduce implementation of locally weighted linear regression.
Fig. 4 is a flowchart of the Reduce function of the MapReduce implementation of locally weighted linear regression.
Embodiment
The invention is further described below with reference to the drawings, which should not be taken to limit the scope of the invention.
Fig. 1 is a diagram of the short-term load forecasting platform based on big data, built on a Hadoop cluster. The platform consists of a data integration layer, a load prediction layer, a result visualization layer and a user management layer. The data integration layer comprises the data loading, data storage, multistage combined index and data processing modules. It implements the complete data processing flow, from the acquisition and loading of load-related data, through data storage, to final data processing, and lays an accurate and rich data foundation for the fine-grained load prediction that follows. The load prediction layer is the core of the platform and predicts load in parallel by implementing the locally weighted linear regression algorithm in MapReduce. The result visualization layer is the presentation layer for the platform's prediction results and displays real-time analysis dynamically. The user management layer is the security mechanism layer of the platform and ensures that the platform runs safely, reliably and efficiently.
Fig. 2 is a flowchart of short-term load forecasting by locally weighted linear regression. It shows the concrete implementation of the load prediction layer and comprises steps 1 to 6 below. Fig. 3 is a flowchart of the Map function of the MapReduce implementation of locally weighted linear regression and comprises steps 3 and 4. Fig. 4 is a flowchart of the Reduce function of the MapReduce implementation and comprises steps 5 and 6.
Step 1, select the features of the experimental data: based on load periodicity analysis, similarity analysis and correlation analysis of the factors that affect load, historical load data, historical weather data, day-of-week attribute information and data for the prediction day are chosen as experimental data. The features that affect load are the moment of the day, the daily mean temperature and the day-of-week attribute.
Historical load data can be collected through the EMS system; historical weather data can be collected from the weather bureau of the prediction area.
Step 2, construct the training set and test set of experimental samples: a multivariate locally weighted linear regression model is applied to predict load. Obtaining the relation between the load $l$ and the influencing factors $x$ requires a training set and a test set, which are constructed from the load-affecting features chosen in step 1, in the form:
[moment $x_1$, temperature $x_2$, day-of-week attribute $x_3$]
Moment $x_1$: load data are sampled every 30 minutes, giving 48 moments per day.
Temperature $x_2$: daily mean temperature.
Day-of-week attribute $x_3$: the numbers 1 to 7 represent Monday to Sunday.
Prediction mode: day-ahead load forecasting. The training samples are the load data at 30-minute intervals for every day in the two years preceding the day to be predicted. The goal is to predict the load at every 30-minute moment of the day to be predicted, so the test samples are the data for those 48 moments.
Final training set: $(x_{i1}, x_{i2}, x_{i3}, l_i)$, $i = 1, 2, \ldots, n$; the variables $x_{i1}, x_{i2}, x_{i3}, l_i$ are observed $n$ times, giving $n$ training samples.
Final test set: $(x_{j1}, x_{j2}, x_{j3})$, $j = 1, 2, \ldots, 48$; 48 load points to be predicted in total.
Step 3, load the experimental data: the Hadoop cluster splits the training set and test set files into several independent data blocks and stores them in the distributed file system HDFS.
An input stream object is created and an HDFS instance is obtained; the training set text in HDFS is read into the stream line by line, type-converted and recorded as traindata[i] = $(x_{i1}, x_{i2}, x_{i3})$, L[i] = $l_i$, $i = 1, 2, \ldots, n$, where $n$ is the number of training samples. The test set text is input in the Map process.
Step 4, implement the Map process: compute the distances between all points to be predicted and all sample points, as in the KNN algorithm. The input is the test set text, recorded as testdata[j] = $(x_{j1}, x_{j2}, x_{j3})$, $j = 1, 2, \ldots, m$. The specific implementation steps are:
4A. Define the output values of the Map function and their types;
4B. Type-convert the test set text;
4C. Compute the distance distance[j][i] between the j-th test sample and the i-th training sample:
$$\mathrm{distance}[j][i] = \lVert \mathrm{testdata}[j] - \mathrm{traindata}[i] \rVert_2^2$$
where $i = 1, 2, \ldots, n$, $n$ being the number of training samples, and $j = 1, 2, \ldots, 48$, 48 being the number of test samples;
4D. Define the key-value pair <key, value> output by the Map function: key = testdata[j], value = distance[j][i].
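Between the Map and Reduce processes, the MapReduce framework groups the emitted values by key, so each reducer call receives one test sample together with all of its distances. The grouping can be simulated locally with a few lines of Python; the function name `shuffle` and the sample pairs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple


def shuffle(pairs: Iterable[Tuple[Hashable, float]]) -> Dict[Hashable, List[float]]:
    """Group mapper output by key, as the framework does before Reduce."""
    grouped: Dict[Hashable, List[float]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


# Mapper output: key = test sample, value = distance to one training sample.
mapper_output = [
    ((1.0, 20.0, 1.0), 4.0),
    ((2.0, 20.0, 1.0), 1.0),
    ((1.0, 20.0, 1.0), 0.25),
]
grouped = shuffle(mapper_output)
```

Each entry of `grouped` corresponds to the input of one Reduce call in step 5.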
Step 5, implement the Reduce process: the input is the key-value pairs <key, value> output by the Map function, and the output is the load prediction results. The specific implementation steps are:
5A. Define the value of K: K = constant, and set j = 1;
5B. Read the test sample testdata[j] = $x_j = (x_{j1}, x_{j2}, x_{j3})$ and distance[j][i], $i = 1, 2, \ldots, n$;
5C. Type-convert the testdata[j] and distance[j][i] key-value pairs that were read;
5D. Sort distance[j][i] in ascending order and select the K smallest distances, recorded as d[j][1], d[j][2], ..., d[j][K];
5E. Record the training samples corresponding to the K smallest distances as traindata_k[s] = $(x_{s1}, x_{s2}, x_{s3})$, L_k[s] = $l_s$, $s = 1, 2, \ldots, K$;
5F. Compute the weights of the K points selected in 5D:
$$\omega_s(\mathrm{testdata}[j]) = \frac{1}{d[j][s]}, \quad j = 1, 2, \ldots, 48, \quad s = 1, 2, \ldots, K$$
5G. Determine the feature matrix $X$, the dependent-variable matrix $L\_k$ and the weight matrix $W(x_j)$:
$$X = \begin{bmatrix} 1 & x_{11} & x_{12} & x_{13} \\ 1 & x_{21} & x_{22} & x_{23} \\ \vdots & \vdots & \vdots & \vdots \\ 1 & x_{K1} & x_{K2} & x_{K3} \end{bmatrix}, \quad L\_k = \begin{bmatrix} l_1 \\ l_2 \\ \vdots \\ l_K \end{bmatrix}$$
$$W(x_j) = \mathrm{diag}[\omega_1(x_j), \omega_2(x_j), \ldots, \omega_K(x_j)]$$
5H. Use weighted least squares to obtain the estimated curve parameters at the independent variable $x_j$:
$$\hat{\alpha}(x_j) = [\hat{\alpha}_0(x_j), \hat{\alpha}_1(x_j), \hat{\alpha}_2(x_j), \hat{\alpha}_3(x_j)]' = [X' W(x_j) X]^{-1} X' W(x_j) L\_k$$
5I. Construct the locally weighted linear regression equation and predict the load at the independent variable testdata[j] = $x_j = (x_{j1}, x_{j2}, x_{j3})$:
$$\hat{L}_j = x_j' \hat{\alpha}(x_j) = \hat{\alpha}_0(x_j) + \hat{\alpha}_1(x_j) x_{j1} + \hat{\alpha}_2(x_j) x_{j2} + \hat{\alpha}_3(x_j) x_{j3}$$
5J. Compute the relative error:
$$E_j = \left( \frac{\hat{L}_j - L_j}{L_j} \right) \times 100\%$$
where $\hat{L}_j$ is the predicted load and $L_j$ is the actual load;
5K. Set j = j + 1; if j ≤ 48, go to step 5B, otherwise go to step 6.
Step 6, parameter regulation:
6A. Compute the daily load accuracy rate:
$$A = \left( 1 - \frac{1}{48} \sum_{j=1}^{48} E_j^2 \right) \times 100\%$$
6B. Update K: let K = constant + ΔK, where ΔK is a user-defined step size. If the updated K is still within the limit set by $n$, the number of training samples, perform steps 5B to 5K of step 5 again; otherwise go to step 6C;
6C. Select the K value that gives the highest daily load prediction accuracy rate A as the KNN parameter of the final locally weighted linear regression algorithm, and obtain the final locally weighted regression curve equation, which is used for load prediction.
With the above modules and implementation steps, a short-term load forecasting platform based on big data, built on a Hadoop cluster, is obtained. The prediction algorithm of the load prediction layer can also be replaced, so that other high-accuracy, parallel prediction algorithms can be implemented.

Claims (11)

1. A short-term load forecasting platform based on big data, built on a Hadoop cluster, characterized in that it comprises a data integration module, a load prediction module and a result visualization module;
the data integration module collects, stores and processes multi-source heterogeneous data related to load, fuses these data, and corrects, fills and normalizes the historical data;
the load prediction module, based on load periodicity analysis, similarity analysis and correlation analysis of the factors that affect load, implements the locally weighted linear regression algorithm in MapReduce to predict load, computes the relative error and the daily load accuracy rate to adjust the algorithm parameters and thereby realize parameter regulation, obtains the locally weighted linear regression algorithm with the final optimal parameters, and uses it to predict load and produce the prediction results;
the result visualization module displays the load prediction results.
2. The short-term load forecasting platform according to claim 1, characterized in that it further comprises a user management module, which provides password management, user group management and user management functions.
3. The short-term load forecasting platform according to claim 1, characterized in that the data integration module comprises a data loading module, a data storage module, a multistage combined index module and a data processing module;
the data loading module loads the collected multi-source heterogeneous data into the data storage module;
the data storage module stores small semi-structured and unstructured files and complex, changeable structured data using the key-value storage of HBase, and stores single files of the three data types directly in the HDFS file system;
the multistage combined index module quickly finds the required data according to the user's search criteria;
the data processing module performs data preprocessing: all multi-source heterogeneous data are fused using multiple kernel learning; missing values in the historical load data are filled by interpolation; abnormal values in the historical load data are corrected using the horizontal and vertical comparison method; finally, all processed data are normalized.
The data loading module loads the collected raw data into the data storage module; the data storage module stores the collected multi-source heterogeneous data; the multistage combined index module searches the multi-source heterogeneous data in the data storage module according to user instructions to obtain the data the user requires; and the data processing module processes those data.
4. The short-term load forecasting platform according to claim 1, characterized in that the multi-source heterogeneous data comprise structured data, semi-structured data and unstructured data.
5. The short-term load forecasting platform according to claim 4, characterized in that the structured data comprise historical load data, weather data, tiered pricing, time-of-use pricing, pricing by load type, day-of-week attribute and holiday information, and traffic data; the semi-structured data comprise electricity pricing policies and economic and population data tables; and the unstructured data comprise GIS data.
6. The short-term load forecasting platform according to claim 3, characterized in that, in the multistage combined index module, the first-level index uses a multidimensional R-tree as its basic structure; following a non-traditional geographical classification rule, the cleaned data objects are divided into multiple subspaces, each corresponding to one node of the R-tree; a non-leaf node stores the minimum bounding rectangle (MBR) of all subtrees of the node, and a leaf node stores the MBR of each spatial object; the second-level and lower indexes use object clustering based on proximity density (LCF), so that the clustered objects are partitioned according to the closeness of their relative densities.
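The MBR bookkeeping of the first-level R-tree index can be illustrated with the Python sketch below. It shows only how a leaf MBR is computed from object coordinates, how a non-leaf MBR is derived from its children, and how a query rectangle is tested against an MBR. The names and the two-dimensional sample data are hypothetical, and node splitting, the classification rule and the LCF clustering of lower levels are not covered.

```python
from typing import Iterable, Sequence, Tuple

Rect = Tuple[Tuple[float, ...], Tuple[float, ...]]  # (per-dimension mins, maxs)


def mbr_of_points(points: Iterable[Sequence[float]]) -> Rect:
    """Minimum bounding rectangle of a spatial object given as points."""
    pts = list(points)
    dims = range(len(pts[0]))
    return (tuple(min(p[d] for p in pts) for d in dims),
            tuple(max(p[d] for p in pts) for d in dims))


def mbr_union(rects: Iterable[Rect]) -> Rect:
    """MBR stored in a non-leaf node: the rectangle enclosing all child MBRs."""
    rs = list(rects)
    dims = range(len(rs[0][0]))
    return (tuple(min(r[0][d] for r in rs) for d in dims),
            tuple(max(r[1][d] for r in rs) for d in dims))


def intersects(a: Rect, b: Rect) -> bool:
    """True if two rectangles overlap; used to prune subtrees during a search."""
    return all(a[0][d] <= b[1][d] and b[0][d] <= a[1][d] for d in range(len(a[0])))


leaf1 = mbr_of_points([(0, 0), (2, 1)])
leaf2 = mbr_of_points([(5, 5), (6, 8)])
parent = mbr_union([leaf1, leaf2])
query = ((1, 1), (3, 3))
```

A search descends only into children whose MBR intersects the query rectangle, which is what makes the lookup fast.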
7. The short-term load forecasting platform according to claim 1, characterized in that the load prediction module predicts load through the following specific steps:
Step 1, select the features of the experimental data: based on load periodicity analysis, similarity analysis and correlation analysis of the factors that affect load, historical load data, historical weather data, day-of-week attribute information and data for the prediction day are chosen as experimental data; the features that affect load are the moment of the day, the daily mean temperature and the day-of-week attribute;
Step 2, construct the training set and test set of experimental samples from the load-affecting features chosen in step 1, in the form:
[moment $x_1$, temperature $x_2$, day-of-week attribute $x_3$]
Moment $x_1$: load data are sampled every 30 minutes, giving 48 moments per day;
Temperature $x_2$: daily mean temperature;
Day-of-week attribute $x_3$: the numbers 1 to 7 represent Monday to Sunday;
Prediction mode: day-ahead load forecasting; the training samples are the load data at 30-minute intervals for every day in the two years preceding the day to be predicted; the goal is to predict the load at every 30-minute moment of the day to be predicted, so the test samples are the data for those 48 moments;
Final training set: $(x_{i1}, x_{i2}, x_{i3}, l_i)$, $i = 1, 2, \ldots, n$; the variables $x_{i1}, x_{i2}, x_{i3}, l_i$ are observed $n$ times, giving $n$ training samples;
Final test set: $(x_{j1}, x_{j2}, x_{j3})$, $j = 1, 2, \ldots, 48$; 48 load points to be predicted in total;
Step 3, load the experimental data: the Hadoop cluster splits the training set and test set files into several independent data blocks and stores them in the distributed file system HDFS;
an input stream object is created and an HDFS instance is obtained; the training set text in HDFS is read into the stream line by line, type-converted and recorded as traindata[i] = $(x_{i1}, x_{i2}, x_{i3})$, L[i] = $l_i$, $i = 1, 2, \ldots, n$, where $n$ is the number of training samples; the test set text is input in the Map process;
Step 4, implement the Map process: compute the distances between all points to be predicted and all sample points, as in the KNN algorithm; the input is the test set text, recorded as testdata[j] = $(x_{j1}, x_{j2}, x_{j3})$, $j = 1, 2, \ldots, m$;
Step 5, implement the Reduce process: the input is the key-value pairs <key, value> output by the Map function, and the output is the load prediction results;
Step 6, realize parameter regulation and obtain the final locally weighted regression curve equation, which is used for load prediction.
8. The short-term load forecasting platform according to claim 7, characterized in that step 4 implements the Map process through the following specific steps:
4A. Define the output values of the Map function and their types;
4B. Type-convert the test set text;
4C. Compute the distance distance[j][i] between the j-th test sample and the i-th training sample:
$$\mathrm{distance}[j][i] = \lVert \mathrm{testdata}[j] - \mathrm{traindata}[i] \rVert_2^2$$
where $i = 1, 2, \ldots, n$, $n$ being the number of training samples, and $j = 1, 2, \ldots, 48$, 48 being the number of test samples;
4D. Define the key-value pair <key, value> output by the Map function: key = testdata[j], value = distance[j][i].
9. short-term load forecasting platform according to claim 7, is characterized in that, described step 5 realizes Reduce process, and concrete steps are:
5A. defines K value: K=constant, makes j=1;
5B. read test sample testdata [j]=x j=(x j1, x j2, x j3), distance [j] [i], i=1,2 ... n;
5C. carries out type conversion to the testdata [j] read with distance [j] [i] key-value pair;
Distance [j] [i] ascending order arranges by 5D., and before selecting, K minor increment is also designated as: d [j] [1], d [j] [2] ..., d [j] [K];
Before 5E. note, K the corresponding training sample of minor increment is traindata_k [s]=(x s1, x s2, x s3), L_k [s]=l s, s=1,2 ..., K;
5F. calculates the weight of K the point selected in 5D:
&omega; S ( t e s t d a t a &lsqb; j &rsqb; ) = 1 d &lsqb; j &rsqb; &lsqb; s &rsqb; , j = 1 , 2 , ... 48 , s = 1 , 2 , ... K
5G. Determine the feature matrix X, the dependent-variable matrix L_k, and the weight matrix W($x_j$):
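$$X = \begin{bmatrix} 1 & x_{11} & x_{12} & x_{13} \\ 1 & x_{21} & x_{22} & x_{23} \\ \vdots & \vdots & \vdots & \vdots \\ 1 & x_{K1} & x_{K2} & x_{K3} \end{bmatrix}, \quad L\_k = \begin{bmatrix} l_1 \\ l_2 \\ \vdots \\ l_K \end{bmatrix}$$
$$W(x_j) = \mathrm{diag}\left[\omega_1(x_j), \omega_2(x_j), \ldots, \omega_K(x_j)\right]$$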
5H. Use the weighted least-squares method to obtain the curve parameter estimates at the independent variable $x_j$:
$$\hat{\alpha}(x_j) = \left[\hat{\alpha}_0(x_j), \hat{\alpha}_1(x_j), \hat{\alpha}_2(x_j), \hat{\alpha}_3(x_j)\right]' = \left[X' W(x_j) X\right]^{-1} X' W(x_j) \, L\_k$$
5I. Construct the locally weighted linear regression equation and carry out load prediction at the independent variable testdata[j] = $x_j$ = ($x_{j1}$, $x_{j2}$, $x_{j3}$):
$$\hat{L}_j = x_j' \hat{\alpha}(x_j) = \hat{\alpha}_0(x_j) + \hat{\alpha}_1(x_j) x_{j1} + \hat{\alpha}_2(x_j) x_{j2} + \hat{\alpha}_3(x_j) x_{j3}$$
5J. Calculate the relative error:
$$E_j = \frac{\hat{L}_j - L_j}{L_j} \times 100\%$$
where $\hat{L}_j$ is the predicted load and $L_j$ is the actual load.
5K. Let j = j + 1; if j ≤ 48, go to step 5B; otherwise go to step 6.
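For a single test sample, steps 5D to 5J can be sketched in plain Python as below. This is a hedged illustration rather than the patent's Reduce function: the names `solve_linear_system`, `lwlr_reduce` and `relative_error`, the `(distance, features, load)` tuple layout, and the small constant `eps` added to each distance to avoid division by zero are assumptions. The sketch also assumes K ≥ 4 with non-degenerate neighbours, since four parameters are estimated and X'WX is otherwise singular.

```python
from typing import List, Sequence, Tuple

Neighbour = Tuple[float, Tuple[float, float, float], float]  # (distance, features, load)


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X'WX is singular; try a larger K")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def lwlr_reduce(test: Sequence[float], neighbours: Sequence[Neighbour],
                k: int, eps: float = 1e-9) -> float:
    """Predict the load at `test` from its K nearest training samples."""
    nearest = sorted(neighbours, key=lambda t: t[0])[:k]   # 5D, 5E
    weights = [1.0 / (d + eps) for d, _, _ in nearest]     # 5F (eps is an assumption)
    rows = [(1.0, *x) for _, x, _ in nearest]              # 5G: rows of X
    loads = [load for _, _, load in nearest]               # 5G: L_k
    p = len(rows[0])
    # 5H: normal equations (X'WX) alpha = X'W L_k
    xtwx = [[sum(w * r[i] * r[j] for w, r in zip(weights, rows))
             for j in range(p)] for i in range(p)]
    xtwl = [sum(w * r[i] * l for w, r, l in zip(weights, rows, loads))
            for i in range(p)]
    alpha = solve_linear_system(xtwx, xtwl)
    # 5I: prediction alpha_0 + alpha_1*x_j1 + alpha_2*x_j2 + alpha_3*x_j3
    return sum(a * v for a, v in zip(alpha, (1.0, *test)))


def relative_error(predicted: float, actual: float) -> float:
    """5J: relative error in percent."""
    return (predicted - actual) / actual * 100.0
```

Solving the normal equations directly keeps the sketch dependency-free; a production implementation would more likely call a numerical library's least-squares routine.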
10. The short-term load forecasting platform according to claim 7, characterized in that step 6, implementing parameter tuning control, comprises the following specific steps:
6A. Calculate the daily load prediction accuracy A:
5B to 5K; Otherwise go to step 6C;
6C. Select the K value corresponding to the highest daily load prediction accuracy A as the KNN parameter of the final locally weighted linear regression algorithm, and obtain the final locally weighted regression curve equation, which is used for load prediction.
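The selection in step 6C can be sketched as a simple search over candidate K values. The accuracy formula of step 6A is not legible in this text, so the sketch assumes a commonly used definition, A = (1 − root-mean-square of the relative errors) × 100%; that definition, the function names `daily_accuracy` and `select_k`, and the `errors_for_k` callback (which would rerun steps 5B to 5K for a given K and return the relative errors E_j in percent) are all hypothetical.

```python
import math
from typing import Callable, Iterable, List, Optional, Tuple


def daily_accuracy(relative_errors_pct: Iterable[float]) -> float:
    """ASSUMED metric: A = (1 - RMS of the relative errors) * 100%."""
    errs = [e / 100.0 for e in relative_errors_pct]
    rms = math.sqrt(sum(e * e for e in errs) / len(errs))
    return (1.0 - rms) * 100.0


def select_k(candidate_ks: Iterable[int],
             errors_for_k: Callable[[int], List[float]]) -> Tuple[int, float]:
    """Step 6C: return the K with the highest daily accuracy A, and that A."""
    best_k: Optional[int] = None
    best_acc = -math.inf
    for k in candidate_ks:
        acc = daily_accuracy(errors_for_k(k))
        if acc > best_acc:
            best_k, best_acc = k, acc
    if best_k is None:
        raise ValueError("no candidate K values were supplied")
    return best_k, best_acc
```

For example, `select_k(range(4, 21), run_day)` would evaluate K = 4 to 20, where `run_day` is a user-supplied function and the range itself is an arbitrary choice.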
CN201510628197.8A 2015-09-28 2015-09-28 Short-term load predication platform based on large data Active CN105205563B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510628197.8A CN105205563B (en) 2015-09-28 2015-09-28 Short-term load predication platform based on large data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510628197.8A CN105205563B (en) 2015-09-28 2015-09-28 Short-term load predication platform based on large data

Publications (2)

Publication Number Publication Date
CN105205563A true CN105205563A (en) 2015-12-30
CN105205563B CN105205563B (en) 2017-02-08

Family

ID=54953232

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510628197.8A Active CN105205563B (en) 2015-09-28 2015-09-28 Short-term load predication platform based on large data

Country Status (1)

Country Link
CN (1) CN105205563B (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105976069A (en) * 2016-05-30 2016-09-28 朱明增 Regionalism-based prediction system and method for short-term power load of grid region at Guigang
CN106845705A (en) * 2017-01-19 2017-06-13 国网山东省电力公司青岛供电公司 The Echo State Networks load forecasting model of subway power supply load prediction system
CN107423839A (en) * 2017-04-17 2017-12-01 湘潭大学 A kind of method of the intelligent building microgrid load prediction based on deep learning
CN107766452A (en) * 2017-09-26 2018-03-06 广西电网有限责任公司电力科学研究院 A kind of index structure and its indexing means of suitable information in power dispatching center zero access
CN107807961A (en) * 2017-10-10 2018-03-16 国网浙江省电力公司丽水供电公司 A kind of power distribution network big data multidomain treat-ment method based on Spark computing engines
CN107870376A (en) * 2016-09-28 2018-04-03 南京南瑞继保电气有限公司 A kind of multi-source meteorological data double sampling integration method
CN108492134A (en) * 2018-03-07 2018-09-04 国网四川省电力公司 The big data user power utilization behavior analysis system integrated based on multicycle regression tree
CN108549343A (en) * 2018-04-27 2018-09-18 湖南文理学院 A kind of kinetic control system and control method based on big data
CN108614071A (en) * 2018-03-21 2018-10-02 中国科学院自动化研究所 Distributed outside atmosphere quality-monitoring accuracy correction system and parameter updating method
CN108734355A (en) * 2018-05-24 2018-11-02 国网福建省电力有限公司 A kind of short-term electric load method of parallel prediction and system applied to power quality harnessed synthetically scene
CN108921232A (en) * 2018-07-31 2018-11-30 东北大学 A kind of hot-strip Cooling History data clusters and method for measuring similarity
WO2019056887A1 (en) * 2017-09-20 2019-03-28 国网上海市电力公司 Method for performing probabilistic modeling of large-scale renewable-energy data
WO2019127492A1 (en) * 2017-12-29 2019-07-04 华为技术有限公司 Node flow ratio prediction method and device
US10700523B2 (en) 2017-08-28 2020-06-30 General Electric Company System and method for distribution load forecasting in a power grid

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103345514A (en) * 2013-07-09 2013-10-09 焦点科技股份有限公司 Streamed data processing method in big data environment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
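Cai Jianbiao (蔡剑彪): "Research on a Smart Grid Load Forecasting Platform Based on Cloud Computing" (基于云计算的智能电网负荷预测平台研究), China Excellent Master's Theses Database (《中国优秀硕士学位论文数据库》) *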

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105976069A (en) * 2016-05-30 2016-09-28 朱明增 Regionalism-based prediction system and method for short-term power load of grid region at Guigang
CN107870376A (en) * 2016-09-28 2018-04-03 南京南瑞继保电气有限公司 A kind of multi-source meteorological data double sampling integration method
CN106845705A (en) * 2017-01-19 2017-06-13 国网山东省电力公司青岛供电公司 The Echo State Networks load forecasting model of subway power supply load prediction system
CN107423839A (en) * 2017-04-17 2017-12-01 湘潭大学 A kind of method of the intelligent building microgrid load prediction based on deep learning
US10700523B2 (en) 2017-08-28 2020-06-30 General Electric Company System and method for distribution load forecasting in a power grid
WO2019056887A1 (en) * 2017-09-20 2019-03-28 国网上海市电力公司 Method for performing probabilistic modeling of large-scale renewable-energy data
CN107766452A (en) * 2017-09-26 2018-03-06 广西电网有限责任公司电力科学研究院 A kind of index structure and its indexing means of suitable information in power dispatching center zero access
CN107766452B (en) * 2017-09-26 2021-07-06 广西电网有限责任公司电力科学研究院 Indexing system suitable for high-speed access of power dispatching data and indexing method thereof
CN107807961A (en) * 2017-10-10 2018-03-16 国网浙江省电力公司丽水供电公司 A kind of power distribution network big data multidomain treat-ment method based on Spark computing engines
CN111527734A (en) * 2017-12-29 2020-08-11 华为技术有限公司 Node traffic ratio prediction method and device
WO2019127492A1 (en) * 2017-12-29 2019-07-04 华为技术有限公司 Node flow ratio prediction method and device
CN111527734B (en) * 2017-12-29 2021-10-26 华为技术有限公司 Node traffic ratio prediction method and device
CN108492134A (en) * 2018-03-07 2018-09-04 国网四川省电力公司 The big data user power utilization behavior analysis system integrated based on multicycle regression tree
CN108614071B (en) * 2018-03-21 2020-02-07 中国科学院自动化研究所 Distributed outdoor air quality monitoring precision correction system and parameter updating method
CN108614071A (en) * 2018-03-21 2018-10-02 中国科学院自动化研究所 Distributed outside atmosphere quality-monitoring accuracy correction system and parameter updating method
CN108549343A (en) * 2018-04-27 2018-09-18 湖南文理学院 A kind of kinetic control system and control method based on big data
CN108549343B (en) * 2018-04-27 2020-11-27 湖南文理学院 Motion control system and control method based on big data
CN108734355A (en) * 2018-05-24 2018-11-02 国网福建省电力有限公司 A kind of short-term electric load method of parallel prediction and system applied to power quality harnessed synthetically scene
CN108921232B (en) * 2018-07-31 2021-05-04 东北大学 Hot-rolled strip steel cooling historical data clustering and similarity measuring method
CN108921232A (en) * 2018-07-31 2018-11-30 东北大学 A kind of hot-strip Cooling History data clusters and method for measuring similarity

Also Published As

Publication number Publication date
CN105205563B (en) 2017-02-08

Similar Documents

Publication Publication Date Title
CN105205563B (en) Short-term load predication platform based on large data
Wen et al. Big data driven marine environment information forecasting: a time series prediction network
CN105184424B (en) Realize that the multi-kernel function of multi-source heterogeneous data fusion learns the Mapreduceization short-term load forecasting method of SVM
CN106651188A (en) Electric transmission and transformation device multi-source state assessment data processing method and application thereof
CN107563539A (en) Short-term and long-medium term power load forecasting method based on machine learning model
CN103295075B (en) A kind of ultra-short term load forecast and method for early warning
CN105844371A (en) Electricity customer short-term load demand forecasting method and device
CN109919370B (en) Power load prediction method and prediction device
CN108022001A (en) Short term probability density Forecasting Methodology based on PCA and quantile estimate forest
CN104037776A (en) Reactive power grid capacity configuration method for random inertia factor particle swarm optimization algorithm
Liu et al. An overview of decision tree applied to power systems
CN108539738A (en) A kind of short-term load forecasting method promoting decision tree based on gradient
CN108876019A (en) A kind of electro-load forecast method and system based on big data
CN106779219A (en) A kind of electricity demand forecasting method and system
Sun et al. Biobjective emergency logistics scheduling model based on uncertain traffic conditions
Qiao et al. Predicting building energy consumption based on meteorological data
Tao et al. Recurrent Neural Networks Application to Forecasting with Two Cases: Load and Pollution
Xie et al. Short-term power load forecasting model based on fuzzy neural network using improved decision tree
CN105678415A (en) Method for predicting net load of distributed power supply power distribution network
CN107134790A (en) A kind of GA for reactive power optimization control sequence based on big data determines method
CN108009668A (en) A kind of tune load forecasting method on a large scale using machine learning
CN109636010A (en) Provincial power network short-term load forecasting method and system based on correlative factor matrix
Gu et al. An fuzzy forecasting algorithm for short term electricity loads based on partial clustering
CN103886393A (en) Power grid investment optimization method based on simulation investment benefit analysis and learning automatons
Zhang et al. The power big data-based energy analysis for intelligent community in smart grid

Legal Events

Date Code Title Description
PB01 Publication
C06 Publication
SE01 Entry into force of request for substantive examination
C10 Entry into substantive examination
GR01 Patent grant
C14 Grant of patent or utility model