CN115689001A - Short-term load prediction method based on pattern matching - Google Patents

Short-term load prediction method based on pattern matching

Info

Publication number
CN115689001A
Authority
CN
China
Prior art keywords
load
prediction
clustering
month
monthly
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211321149.0A
Other languages
Chinese (zh)
Inventor
唐志远
唐義坤
高毅
张梁
周进
陈月
迟福建
张桂婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
State Grid Corp of China SGCC
State Grid Tianjin Electric Power Co Ltd
Original Assignee
Sichuan University
State Grid Corp of China SGCC
State Grid Tianjin Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University, State Grid Corp of China SGCC, State Grid Tianjin Electric Power Co Ltd filed Critical Sichuan University
Priority to CN202211321149.0A priority Critical patent/CN115689001A/en
Publication of CN115689001A publication Critical patent/CN115689001A/en
Pending legal-status Critical Current

Classifications

    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a short-term load forecasting method based on pattern matching, belonging to the technical field of power system load forecasting. For each type of load pattern, the prediction model with the best verification performance is selected from several candidate prediction models, so as to improve the prediction accuracy for that load pattern. The method achieves adaptive matching between user loads and load patterns, and matches each load pattern with its optimal prediction model, so that the strengths of each prediction model are fully exploited and the overall prediction accuracy is effectively improved.

Description

Short-term load prediction method based on pattern matching
Technical Field
The invention belongs to the technical field of load prediction of power systems, and particularly relates to a short-term load prediction method based on pattern matching.
Background
Load prediction is one of the key steps in drawing up power dispatching plans. Accurate load prediction helps schedule unit start-up and shutdown reasonably, improves the utilization of new energy, supports automatic generation control and maintenance planning, and plays an important role in the safe, stable operation and economic dispatch of the system. With the continued construction of new-type power systems in China and the deployment of advanced metering systems on the user side, a large amount of user-side electricity consumption data has been accumulated. However, user clusters are dispersed and their electricity consumption is small, so their load fluctuates strongly and randomly, and accurate prediction of user cluster load is an important problem that needs to be solved.
In recent years, scholars at home and abroad have studied short-term load prediction extensively and proposed many prediction models and methods. These can be roughly divided into two types: traditional prediction methods represented by time series methods, and data-driven artificial intelligence methods. The autoregressive (AR) and autoregressive moving average (ARMA) models are common time series models; both use only historical load data and are fast with little computation, but they have no learning ability and place high demands on data stationarity. To address the low prediction accuracy and poor adaptability of traditional methods, data-driven artificial intelligence methods are increasingly applied to load prediction, such as artificial neural networks (ANN), support vector machines (SVM), random forests, and deep belief networks. One existing method uses a deep belief network to predict substation load and obtains the optimal network parameters with the adaptive moment estimation (Adam) algorithm, making the network self-adaptive, but the computation is relatively complex. Another constructs an SVM prediction model after K-means clustering of the cells to achieve spatial load prediction, but the SVM model is strongly affected by the kernel function parameters and the penalty coefficient c. A short-term power load prediction method combining a convolutional neural network (CNN) with a bidirectional gated recurrent unit (BiGRU), tuned by Bayesian optimization, has also been proposed. However, neural network training is prone to overfitting, which impairs the generalization ability of the model and reduces the prediction accuracy.
Another method clusters the historical samples with the fuzzy C-means clustering algorithm and then builds a random forest regression model on the homogeneous data for prediction. However, this method requires many decision trees, and training takes considerable space and time.
Moreover, as new energy power generation systems become widespread on the user side, the randomness and fluctuation of user cluster load increase, and a single prediction method of the kinds described above may generalize poorly because of this randomness.
Disclosure of Invention
The invention aims to provide a short-term load prediction method based on pattern matching that solves the above technical problems in the prior art. It achieves adaptive matching between user loads and load patterns, and matches each load pattern with its optimal prediction model, so that the strengths of each prediction model are fully exploited and the overall prediction accuracy is effectively improved.
To achieve this purpose, the technical scheme of the invention is as follows:
The short-term load prediction method based on pattern matching comprises the following steps:
S1, optimally dividing the historical load data of users into several monthly load patterns using a hierarchical K-means clustering algorithm based on the Pearson correlation coefficient;
S2, for each monthly load pattern, selecting the prediction model with the best verification performance from a back-propagation (BP) neural network, support vector regression, and linear regression;
S3, adaptively matching each user's electricity consumption data to a monthly load pattern according to the similarity between the user's load data for the most recent month and the monthly load patterns, and adding the prediction results of all patterns to obtain the total prediction result.
Further, step S1 is specifically as follows:
a clustering similarity measure function based on the Pearson correlation coefficient:
Let $x_i = (x_{i1}, x_{i2}, \ldots, x_{in})$ and $x_j = (x_{j1}, x_{j2}, \ldots, x_{jn})$ be two load curves; the Pearson correlation coefficient is as follows:
Figure BDA0003910500060000021
in the formula:
Figure BDA0003910500060000022
and
Figure BDA0003910500060000023
are the mean values of $x_i$ and $x_j$, respectively;
the principle of a hierarchical K-means clustering method is as follows:
repeatedly carrying out K-means clustering with the same K value for many times, and recording the clustering center of each time;
performing hierarchical clustering on the clustering centers under all records;
taking a clustering center obtained by hierarchical clustering as an initial clustering center, and performing K-means clustering again to obtain a final clustering result;
selecting a clustering index:
the evaluation indexes are as follows:
Figure BDA0003910500060000031
in the formula: k is the number of clusters, v c The cluster center of each cluster is obtained; n is K The number of samples contained in the current class K cluster.
Further, step S2 is specifically as follows:
BP neural network algorithm mechanism:
Figure BDA0003910500060000032
wherein: p is an input vector, b is a bias constant, w is a weight vector from an input signal to a neuron, f is an excitation function, and y is an output signal of the neuron; the output of the neuron is
y=f(wP+b) (14)
The BP neural network consists of an input layer, a hidden layer and an output layer, and neurons between the layers are in full connection;
the BP neural network algorithm consists of two parts: first, forward propagation of the signal; second, computing the error between the output values and the true values, propagating the error backward, and repeatedly correcting the parameters of each neuron with the L-M algorithm; after the parameters are corrected, training is repeated until the training result meets the error requirement or the maximum number of training iterations is reached;
Support vector regression (SVR) algorithm mechanism:
the input vector is mapped to a high-dimensional feature space through a nonlinear mapping, and a regression hyperplane is computed that minimizes the sum of the distances from all sample points in the set to the hyperplane; the nonlinear mapping is also referred to as a kernel function; let the input samples be $M = \{(x_i, y_i),\ i = 1, 2, \ldots, n\}$, $x_i \in R^d$, $y_i \in R$; the hyperplane function obtained by SVR is as follows:
f(x)=ωΦ(x)+b (15)
in the formula: phi (·) is a kernel function, b is a threshold, and omega is a weight vector;
the SVR penalty function is:
Figure BDA0003910500060000033
in the formula: ε is the tolerated distance between a sample point and the hyperplane, that is, when the distance from a sample point to the hyperplane is less than or equal to ε, the loss is 0;
the linear regression algorithm mechanism:
linear regression is an analytical approach that uses a regression equation to model the relationship between one or more independent variables x and a dependent variable y; the model is as follows:
$f(x) = w_0 + w_1 x_1 + w_2 x_2 + \ldots + w_n x_n$ (17)
$y = f(x) + \delta$ (18)
in the formula: w is the weight coefficient and δ is the residual; the main objective of linear regression is to find the coefficients w that minimize the sum of the distances from all sample points in the set $M = \{(x_i, y_i),\ i = 1, 2, \ldots, n\}$ to f(x); linear regression uses the least squares method to solve the equation;
Figure BDA0003910500060000041
$\omega_1, \ldots, \omega_n = (X^T X)^{-1} X^T Y$ (20)
Load prediction with linear regression requires little computation and directly yields the relation between the input features and the predicted quantity.
Further, step S3 is specifically as follows:
inputting characteristic parameters:
to predict the load $P_t$ at time t, characteristic parameters strongly correlated with the load at time t are input to the model; the input characteristic parameters are as follows:
$x = \{P_{t-h}, P_{t-h-1}, P_{t-h-2}, P_{t-h+1}, P_{t-h+2}, P_{t-2h}, P_{t-2h-1}, P_{t-2h+1}, P_{t-3h}, P_{t-7h}, \mathrm{weekday}, \mathrm{hour}\}$
in the formula, the P terms are historical load data, h is the number of load data points collected per day, and weekday and hour denote the day of the week and the hour of the day, respectively;
matching the power load patterns based on the Pearson correlation coefficient:
a month is counted as 28 days; when predicting the load of day q+1 (q = 0, 1, …, 27) of a month, the load data of the 28 days before the prediction day are selected for pattern matching; they comprise data1, the first q days of the current month, and data2, the last 28-q days of the previous month; to keep the time alignment of the pattern matching consistent, data1 is moved in front of data2, and the two are spliced into data3, a complete month of the user's load data running from the beginning to the end of a month; the Pearson correlation coefficient between each user's data3 and each monthly load pattern is calculated, and each user is assigned to the monthly load pattern with the most similar shape, that is, the one with the lowest Pearson correlation coefficient value;
total flow of load prediction:
(1) Preprocessing user load data;
(2) Taking the monthly loads of all users as samples, performing K-means clustering based on the Pearson coefficient multiple times, calculating the clustering index V, and obtaining the optimal number of clusters $K_b$;
(3) With the number of clusters set to $K_b$, performing hierarchical K-means clustering based on the Pearson correlation coefficient to obtain $K_b$ classes of monthly load, and taking the cluster center of each class as a monthly load pattern;
(4) Constructing BPNN, SVR and LR models for each monthly load pattern, and matching the model with the best test performance to that monthly load pattern;
(5) Matching each user's monthly load for the month before the prediction day to the monthly load pattern most similar in shape;
(6) Summing the monthly loads of the users matched to the same pattern, and calculating the matching degree of the summed monthly load, namely the Pearson correlation coefficient;
(7) Performing load prediction with the best prediction model of each monthly load pattern, and adding the prediction results of all patterns to obtain the final load prediction result;
(8) In (7), if the best prediction model of the monthly load pattern is BPNN and the matching degree between the user load data and the monthly load pattern is less than a certain threshold
Figure BDA0003910500060000051
the BPNN prediction model is used; otherwise the suboptimal prediction model of the monthly load pattern is used;
(9) Evaluating the user cluster prediction effect by using MAPE and RMSE;
Figure BDA0003910500060000052
Figure BDA0003910500060000053
in the formula: e is the actual value, o is the predicted value, and N is the number of predictions.
Compared with the prior art, the invention has the following beneficial effects:
One beneficial effect of the scheme is that the morphological characteristics of users' historical load data are fully mined and load patterns with strongly regular fluctuation are obtained by clustering; each user's load data is matched to a load pattern according to the similarity between the user's most recent load data and the load patterns, so that data with more consistent fluctuation are automatically grouped into one class. For each type of load pattern, the prediction model with the best verification performance is selected from several prediction models to improve the prediction accuracy for that pattern. The method achieves adaptive matching between user loads and load patterns, and matches each load pattern with its optimal prediction model, so that the strengths of each prediction model are fully exploited and the overall prediction accuracy is effectively improved. In short, morphological clustering based on the Pearson correlation coefficient is applied to users' power loads, users are adaptively matched to power load patterns, and the optimal prediction model is used for each load pattern, so the algorithm based on adaptive pattern matching can effectively improve prediction accuracy.
Drawings
FIG. 1 is a comparison graph of the similarity of the morphology of 3 load curves according to an embodiment of the present invention.
FIG. 2 is a diagram of a neuron model according to an embodiment of the present invention.
Fig. 3 is a structure diagram of a three-layer BP network according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of pattern matching according to an embodiment of the present invention.
FIG. 5 is a prediction flow chart of an embodiment of the present invention.
FIG. 6 is a clustering index plot in accordance with an embodiment of the present invention.
FIG. 7 is a monthly load pattern diagram in accordance with one embodiment of the present invention.
Fig. 8 is a diagram illustrating a prediction result of a user cluster according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to fig. 1 to 8 of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
Example:
A short-term load prediction method based on pattern matching is provided. Based on the residential load data of 300 users with photovoltaic power generation systems, the morphological distance between users' monthly loads is measured with the Pearson correlation coefficient, and the monthly loads are optimally divided into different monthly load patterns by hierarchical K-means clustering. Back-propagation neural network (BPNN), support vector regression (SVR) and linear regression (LR) models are trained and verified for each monthly load pattern. When load prediction is carried out, every user is matched to the load pattern with the nearest morphological distance according to the user's electricity consumption data for the month before the prediction day, and prediction is performed with the prediction model that tested best. Finally, the mean absolute percentage error (MAPE) and the root mean square error (RMSE) are used as error measures, and the effectiveness of the proposed user cluster prediction method is verified by comparison with other prediction methods.
1. Partition and matching of monthly load pattern based on hierarchical K-means clustering algorithm
Hierarchical K-means clustering:
The main objective of cluster analysis is to compare the similarity of all samples, placing samples with high similarity in one class and samples with low similarity in different classes. A common similarity measure normalizes the samples and computes the Euclidean distance between them, but normalization may compress and lose data information, and because the Euclidean distance is based on a sum of squared differences it captures the overall difference between two samples while ignoring differences in their shape, which leads to an unsatisfactory clustering effect. K-means is a classical clustering algorithm widely used in power load cluster analysis, but its initial cluster centers are random and the number of clusters K must be set manually, which may bias the clustering result. To solve these problems, a hierarchical K-means clustering algorithm based on the Pearson correlation coefficient is adopted.
A clustering similarity measure function based on the Pearson correlation coefficient:
The Pearson coefficient is often used as a measure of the similarity of curve shapes. Let $x_i = (x_{i1}, x_{i2}, \ldots, x_{in})$ and $x_j = (x_{j1}, x_{j2}, \ldots, x_{jn})$ be two load curves; the Pearson correlation coefficient is as follows:
Figure BDA0003910500060000071
in the formula:
Figure BDA0003910500060000072
and
Figure BDA0003910500060000073
are the mean values of $x_i$ and $x_j$, respectively. As the formula shows, the sample mean is subtracted from each sample during calculation, which removes the influence of the curve amplitude on the similarity. The more similar the shapes of the two curves, the lower the Pearson correlation coefficient as defined here, and it is 0 when the two curves have identical shapes. As shown in FIG. 1, the three curves are the electricity load curves $Q_1$, $Q_2$, $Q_3$ of three users on a certain day. The pairwise Euclidean distance d and Pearson correlation coefficient r are: $d(Q_1,Q_2)=133.2$, $d(Q_2,Q_3)=346.4$, $d(Q_1,Q_3)=363.7$, $r(Q_1,Q_2)=0.2524$, $r(Q_1,Q_3)=0.2524$, $r(Q_2,Q_3)=0$. By Euclidean distance, $Q_1$ and $Q_2$ are the closest pair, but in fact $Q_2$ and $Q_3$ have exactly the same shape, and their calculated Pearson coefficient is 0. Cosine similarity can also be used as a morphological similarity measure, but it is affected by the curve amplitude. Therefore, the Pearson correlation coefficient is selected as the measure of load curve shape similarity.
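The contrast between the two measures can be sketched in a few lines. The formula itself is only available as an image in the source, so this sketch assumes the distance form 1 - r (with r the usual Pearson correlation), which is consistent with the statement that identical shapes give 0. The curves q1, q2, q3 are made-up illustrations, not the data behind FIG. 1.

```python
import math

def pearson_distance(a, b):
    """Assumed distance form: 1 - r, with r the Pearson correlation of two curves."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical curves: q3 has the same shape as q2 but a larger amplitude and offset.
q1 = [1.0, 3.0, 2.0, 5.0, 4.0]
q2 = [1.0, 2.0, 4.0, 3.0, 5.0]
q3 = [10.0 * v + 50.0 for v in q2]

shape_23 = pearson_distance(q2, q3)      # essentially 0: identical shape
shape_12 = pearson_distance(q1, q2)      # larger: different shape
euclid_12 = euclidean_distance(q1, q2)   # small: similar magnitude
euclid_23 = euclidean_distance(q2, q3)   # large: different magnitude
```

With these inputs the Euclidean distance ranks q1 and q2 as the closer pair, while the shape-based distance ranks q2 and q3 as closer, which mirrors the argument made for FIG. 1.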
The principle of a hierarchical K-means clustering method is as follows:
the K-means algorithm is a classical partition algorithm, and the clustering result depends on the initial clustering center and the clustering number K. If the selection is not proper, the clustering effect is not ideal. The hierarchical clustering algorithm does not need to select an initial clustering center, the clustering effect is stable, but the computational complexity is high, and the hierarchical clustering algorithm is not suitable for large-scale data clustering. Therefore, the two are combined to establish a hierarchical K-means clustering algorithm, so that the calculation complexity can be reduced, and the clustering stability can be improved. The hierarchical K-means clustering algorithm comprises the following steps:
(1) Repeatedly carrying out K-means clustering with the same K value for many times, and recording the clustering center of each time;
(2) Performing hierarchical clustering on the clustering centers under all records;
(3) And taking the clustering center obtained by hierarchical clustering as an initial clustering center, and performing K-means clustering once again to obtain a final clustering result.
The algorithm can solve the problem of randomness of the initial clustering center of the K-means clusters, but the clustering number K still needs to be determined manually. The selection of the clustering index can determine the optimal clustering number.
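The three steps above can be sketched as follows. This is a minimal illustration under several assumptions that the source does not specify: the distance is taken as 1 - r, cluster centers are plain point-wise means, the hierarchical step is a simple centroid-merging agglomeration, and the function names, the number of repeated runs, and the sample curves are hypothetical.

```python
import math
import random

def pearson_distance(a, b):
    """Assumed distance 1 - r; returns 1.0 if either curve is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 1.0
    return 1.0 - cov / math.sqrt(va * vb)

def mean_curve(curves):
    return [sum(col) / len(curves) for col in zip(*curves)]

def assign(samples, centers):
    """Label each sample with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: pearson_distance(s, centers[c]))
            for s in samples]

def kmeans(samples, centers, max_iter=50):
    """K-means started from the given centers; returns (centers, labels)."""
    for _ in range(max_iter):
        labels = assign(samples, centers)
        updated = []
        for c in range(len(centers)):
            members = [s for s, lab in zip(samples, labels) if lab == c]
            updated.append(mean_curve(members) if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return centers, assign(samples, centers)

def agglomerate(centers, k):
    """Merge the two closest groups of recorded centers until k groups remain."""
    groups = [[c] for c in centers]
    while len(groups) > k:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                d = pearson_distance(mean_curve(groups[i]), mean_curve(groups[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        groups[i].extend(groups.pop(j))
    return [mean_curve(g) for g in groups]

def hierarchical_kmeans(samples, k, runs=10, seed=0):
    rng = random.Random(seed)
    recorded = []
    for _ in range(runs):                       # step (1): repeated K-means, same K
        centers, _ = kmeans(samples, rng.sample(samples, k))
        recorded.extend(centers)
    initial = agglomerate(recorded, k)          # step (2): cluster the recorded centers
    return kmeans(samples, initial)             # step (3): final K-means run

# Hypothetical load curves: three rising shapes and three falling shapes.
samples = [
    [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 9.0], [1.0, 3.0, 4.0, 6.0],
    [4.0, 3.0, 2.5, 1.0], [9.0, 7.0, 4.0, 2.0], [6.0, 5.0, 3.0, 1.0],
]
centers, labels = hierarchical_kmeans(samples, k=2)
```

The final K-means run no longer depends on a single random initialization, because its starting centers summarize the centers found across all repeated runs.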
Selecting a clustering index:
A suitable clustering quality evaluation index is selected to check the validity of the clustering and determine the optimal number of clusters K. An existing graph-based validity index was constructed as a clustering evaluation index for cosine similarity; it is also suitable for cluster analysis based on the Pearson correlation coefficient. The evaluation index is as follows:
Figure BDA0003910500060000081
in the formula: k is the number of clusters, $v_c$ is the cluster center of each cluster, and $n_K$ is the number of samples contained in the current cluster K. The index represents the sum of the distances from the samples of each class to the corresponding cluster center. V decreases as the number of clusters increases, and the number of clusters at which the downward trend of V begins to level off is the optimal number of clusters.
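The "levelling off" rule is described only qualitatively, so the following is one possible way to make it concrete, not the rule used in the source. It assumes that the index V has already been computed for each candidate K, and it returns the first K after which a further increase in K reduces V by less than a chosen fraction of the total drop. The function name, the fraction, and the V values are all hypothetical.

```python
def pick_cluster_number(v_by_k, flat_fraction=0.1):
    """Return the first K whose next step reduces V by less than
    flat_fraction of the total drop in V over all candidate K values."""
    ks = sorted(v_by_k)
    total_drop = v_by_k[ks[0]] - v_by_k[ks[-1]]
    for k, k_next in zip(ks, ks[1:]):
        step_drop = v_by_k[k] - v_by_k[k_next]
        if step_drop < flat_fraction * total_drop:
            return k
    return ks[-1]

# Made-up index values for illustration only.
v_by_k = {2: 100.0, 3: 70.0, 4: 50.0, 5: 47.0, 6: 45.0}
best_k = pick_cluster_number(v_by_k)
```

In practice the curve of V against K would normally also be inspected visually, as the source does with FIG. 6.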
2. Load prediction method and process
The load prediction method comprises the following steps:
different prediction methods have different prediction effects on different load modes, and in order to improve the prediction accuracy, an optimal prediction model suitable for the load mode is selected from the BPNN prediction method, the SVR prediction method and the LR prediction method.
BP neural network algorithm mechanism:
the artificial neural network has self-learning, self-organizing, self-adapting and strong nonlinear function approximation capability and strong fault tolerance. FIG. 2 is a neuron model structure.
In the figure: p is the input vector, b is the bias constant, w is the weight vector of the input signal to the neuron, f is the excitation function, and y is the output signal of the neuron. The output of the neuron is
y=f(wP+b) (25)
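As a small illustration of this neuron output, the sketch below evaluates y = f(wP + b) for one neuron. The sigmoid activation and the numeric weights and inputs are assumptions chosen for the example; the source does not state which excitation function is used.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(weights, inputs, bias, activation):
    """Compute y = f(w . P + b) for a single neuron."""
    z = sum(w * p for w, p in zip(weights, inputs)) + bias
    return activation(z)

# Hypothetical values: w . P + b = 0.5 * 2.0 - 0.25 * 4.0 + 0.0 = 0.0
y = neuron_output([0.5, -0.25], [2.0, 4.0], 0.0, sigmoid)
```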
The BP neural network is a commonly used artificial neural network, and is composed of an input layer, a hidden layer and an output layer, wherein neurons between the layers are fully connected, and the structure of the BP neural network is shown in fig. 3.
The BP neural network algorithm mainly comprises two parts: the first is forward propagation of the signal; the second is computing the error between the output values and the true values, propagating the error backward, and repeatedly correcting the parameters of each neuron with the L-M (Levenberg-Marquardt) algorithm. After the parameters are corrected, training is repeated until the training result meets the error requirement or the maximum number of training iterations is reached. The BP neural network model has no explicit mathematical expression and is characterized entirely by its network structure and parameters.
The regression algorithm mechanism of the support vector machine is as follows:
Support vector regression is an important application branch of support vector machines (SVM). The input vector is mapped to a high-dimensional feature space through a nonlinear mapping, and a regression hyperplane is computed that minimizes the sum of the distances from all sample points in the set to the hyperplane. The nonlinear mapping is also called a kernel function; commonly used kernels include linear, polynomial, Gaussian, and radial basis functions. Let the input samples be $M = \{(x_i, y_i),\ i = 1, 2, \ldots, n\}$, $x_i \in R^d$, $y_i \in R$; the hyperplane function obtained by SVR is as follows:
f(x)=ωΦ(x)+b (26)
in the formula: phi (·) is a kernel function, b is a threshold, and omega is a weight vector.
The SVR penalty function is:
Figure BDA0003910500060000091
in the formula: ε is the tolerated distance from a sample point to the hyperplane, that is, when the distance from a sample point to the hyperplane is less than or equal to ε, the loss is 0. Methods for solving the model are well documented in the prior art and are not described in detail here.
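The behaviour described for ε can be illustrated with the standard ε-insensitive loss. The source's own formula is only available as an image, so the exact form below is an assumption: errors inside the ε band cost nothing, and errors outside it are charged linearly.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Standard epsilon-insensitive loss: zero inside the band, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

inside_band = epsilon_insensitive_loss(10.0, 10.05, epsilon=0.1)   # error 0.05 <= 0.1
outside_band = epsilon_insensitive_loss(10.0, 10.5, epsilon=0.1)   # error 0.5 > 0.1
```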
The linear regression algorithm mechanism is as follows:
Linear regression (LR) is an analytical approach that uses a regression equation (function) to model the relationship between one or more independent variables x and a dependent variable y. The model is as follows:
$f(x) = w_0 + w_1 x_1 + w_2 x_2 + \ldots + w_n x_n$ (28)
$y = f(x) + \delta$ (29)
in the formula: w is the weight coefficient and δ is the residual. The main objective of linear regression is to find the coefficients w that minimize the sum of the distances from all sample points in the set $M = \{(x_i, y_i),\ i = 1, 2, \ldots, n\}$ to f(x). Linear regression generally uses the least squares method to solve the equation.
Figure BDA0003910500060000092
$\omega_1, \ldots, \omega_n = (X^T X)^{-1} X^T Y$ (31)
Load prediction with linear regression requires little computation, directly yields the relation between the input features and the predicted quantity, and has good interpretability.
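A least squares fit can be sketched with the normal equations (X^T X) w = X^T y, solved here by Gaussian elimination so that the example needs no external library. The helper names and the small data set are hypothetical; a column of ones is prepended to X so that the intercept w_0 is estimated together with the other coefficients.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_linear_regression(features, targets):
    """Return [w0, w1, ..., wn] from the normal equations (X^T X) w = X^T y."""
    x = [[1.0] + list(row) for row in features]      # intercept column
    cols = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(cols)] for i in range(cols)]
    xty = [sum(r[i] * t for r, t in zip(x, targets)) for i in range(cols)]
    return solve_linear_system(xtx, xty)

# Hypothetical noise-free data generated from y = 1 + 2*x1 - 0.5*x2.
features = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]]
targets = [1.0 + 2.0 * a - 0.5 * b for a, b in features]
w = fit_linear_regression(features, targets)
```

Because the data are noise-free, the fitted coefficients recover the generating weights up to rounding error, which illustrates the interpretability mentioned above: each coefficient is the direct contribution of one input feature.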
3. Load prediction process based on pattern matching
Input characteristic parameter section:
To predict the load $P_t$ at time t, characteristic parameters strongly correlated with the load at time t need to be input to the model. Historical load, whether the day is a working day or a weekend, and the time of day all strongly influence the electric load, so the selected input characteristic parameters are:
$x = \{P_{t-h}, P_{t-h-1}, P_{t-h-2}, P_{t-h+1}, P_{t-h+2}, P_{t-2h}, P_{t-2h-1}, P_{t-2h+1}, P_{t-3h}, P_{t-7h}, \mathrm{weekday}, \mathrm{hour}\}$
in the formula, the P terms are historical load data, h is the number of load data points collected per day, and weekday and hour denote the day of the week and the hour of the day, respectively.
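Building this input vector from a load series can be sketched as below. The sketch assumes that h is the number of samples per day (48 at a 30-minute resolution), so that t-h is the same time on the previous day and t-7h the same time one week earlier. The function name, the numeric encoding of weekday and hour, and the synthetic series are hypothetical.

```python
def build_features(load, t, h, weekday, hour):
    """Return the 12-element input vector for predicting load[t]."""
    if t - 7 * h < 0:
        raise ValueError("need at least 7 days of history before time t")
    # Lags in the order listed in the text: t-h, t-h-1, t-h-2, t-h+1, t-h+2,
    # t-2h, t-2h-1, t-2h+1, t-3h, t-7h.
    lags = [h, h + 1, h + 2, h - 1, h - 2, 2 * h, 2 * h + 1, 2 * h - 1, 3 * h, 7 * h]
    return [load[t - lag] for lag in lags] + [weekday, hour]

# Synthetic series where load[i] == i, so each feature shows which index was read.
load = [float(i) for i in range(400)]
x = build_features(load, t=350, h=48, weekday=3, hour=7)
```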
Power load pattern matching section based on the Pearson correlation coefficient:
A month is counted as 28 days (the same applies hereinafter). When predicting the load of day q+1 (q = 0, 1, …, 27) of a month, the load data of the 28 days before the prediction day need to be selected for pattern matching; they comprise data1, the first q days of the current month, and data2, the last 28-q days of the previous month. To keep the time alignment of the pattern matching consistent, data1 is moved in front of data2, and the two are spliced into data3, a complete month of the user's load data running from the beginning to the end of a month. The Pearson correlation coefficient between each user's data3 and each monthly load pattern is calculated, and each user is assigned to the monthly load pattern with the most similar shape, that is, the one with the lowest value. Pattern matching is shown in FIG. 4.
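The splice-and-match step can be sketched as follows. The sketch assumes the distance form 1 - r for the Pearson measure (lower means more similar), flat lists of load values, and a tiny resolution of two points per day to keep the example short; the function names and all data are hypothetical.

```python
import math

def pearson_distance(a, b):
    """Assumed distance form 1 - r between two equal-length curves."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(va * vb)

def splice_month(prev_month, current_month_so_far, q, points_per_day):
    """data3 = first q days of this month followed by the last 28-q days of last month."""
    data1 = current_month_so_far[: q * points_per_day]
    data2 = prev_month[q * points_per_day:]
    return data1 + data2

def match_pattern(data3, patterns):
    """Index of the monthly load pattern with the lowest distance to data3."""
    return min(range(len(patterns)), key=lambda k: pearson_distance(data3, patterns[k]))

points_per_day = 2
prev_month = [2.0, 5.0] * 28                 # 28 days, low then high each day
current_month_so_far = [2.5, 6.0] * 3        # q = 3 days already observed
data3 = splice_month(prev_month, current_month_so_far, q=3, points_per_day=points_per_day)

patterns = [[0.0, 1.0] * 28, [1.0, 0.0] * 28]   # two made-up monthly patterns
matched = match_pattern(data3, patterns)
```

Placing data1 before data2 keeps day d of data3 aligned with day d of every monthly pattern, which is the time consistency the text asks for.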
Load prediction total flow section:
the general flow of the proposed user cluster prediction method based on pattern adaptive matching is shown in fig. 5, and the specific steps are as follows:
(1) Preprocess the user load data.
(2) Taking the monthly loads of all users (a month being counted as 28 days) as samples, perform K-means clustering based on the Pearson coefficient multiple times, calculate the clustering index V, and obtain the optimal number of clusters $K_b$.
(3) With the number of clusters set to $K_b$, perform hierarchical K-means clustering based on the Pearson correlation coefficient to obtain $K_b$ classes of monthly load, and take the cluster center of each class as a monthly load pattern.
(4) Construct BPNN, SVR and LR models for each monthly load pattern, and match the model with the best test performance to that monthly load pattern.
(5) Match each user's monthly load for the month before the prediction day to the monthly load pattern most similar in shape.
(6) Sum the monthly loads of the users matched to the same pattern, and calculate the matching degree of the summed monthly load, namely the Pearson correlation coefficient.
(7) Perform load prediction with the best prediction model of each monthly load pattern, and add the prediction results of all patterns to obtain the final load prediction result.
(8) In step (7), if the best prediction model of the monthly load pattern is BPNN and the matching degree between the user load data and the monthly load pattern is less than a certain threshold
Figure BDA0003910500060000113
(set to
Figure BDA0003910500060000114
), the BPNN prediction model is used; otherwise the suboptimal prediction model of the monthly load pattern is used, which avoids the poor generalization and poor prediction results caused by overfitting during BPNN training.
(9) The user cluster prediction effect was evaluated using MAPE and RMSE.
Figure BDA0003910500060000111
Figure BDA0003910500060000112
In the formula: e is the actual value, o is the predicted value, and N is the number of predictions.
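The two evaluation indices can be sketched as below. The formula images are not reproduced in this text, so the sketch assumes the standard definitions of MAPE and RMSE with the symbols defined above (e actual, o predicted, N the number of predictions). MAPE is returned as a fraction (multiply by 100 for a percentage), and all actual values are assumed to be non-zero.

```python
from math import sqrt

def mape(actual, predicted):
    """Mean absolute percentage error, as a fraction: (1/N) * sum(|e - o| / |e|)."""
    return sum(abs((e - o) / e) for e, o in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error: sqrt((1/N) * sum((e - o)^2))."""
    return sqrt(sum((e - o) ** 2 for e, o in zip(actual, predicted)) / len(actual))
```

MAPE is scale-free and therefore comparable across clusters of different size, whereas RMSE keeps the unit of the load and penalizes large individual errors more heavily.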
Example analysis:
Grid-side electricity consumption data of 300 Australian residential users equipped with photovoltaic power generation systems are selected for analysis. The data span 1 July 2010 to 30 June 2011, with a temporal resolution of 30 minutes.
Monthly load division results:
The data of the 300 residents for the first 10 months starting from 1 July 2010 are selected, giving 3000 monthly load samples in total. The clustering index obtained according to step (2) above is shown in Fig. 6.
As can be seen from the figure, for K between 9 and 12 the evaluation index V decreases gradually, and K_b = 9 is selected as the optimal number of clusters. Hierarchical K-means clustering is then carried out to obtain 9 classes of monthly load patterns, as shown in Fig. 7.
As can be seen from the figure, the 9 classes of monthly load patterns fluctuate differently. Classes 1, 2, 3, 5, 8 and 9 show stable daily load fluctuation and strong regularity. Class 4 shows a load magnitude that tends to increase from day to day. The load fluctuations of classes 6 and 7 are relatively random.
Each monthly load pattern has 1344 points in total (28 days × 48 half-hour points). If the data of a monthly load pattern are joined end to end, the load at the head of the month can be predicted from the data at its tail, so 1344 training samples can be constructed. From these, 1344 samples are drawn uniformly with replacement to train the prediction model; about 63% of the distinct samples are thus used for training, and the samples that are never drawn are left for testing. The resulting test performance is shown in Table 1 below.
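The sample construction and the with-replacement split can be illustrated with a short sketch. The helper names and the `n_lags` parameter are hypothetical (plain consecutive lags stand in for the actual input features), and only the standard library is used. Drawing n samples with replacement from n leaves an expected fraction 1 − (1 − 1/n)^n ≈ 1 − 1/e ≈ 0.632 of distinct samples in the training set, which is where the figure of about 63% comes from.

```python
import random

def circular_samples(pattern, n_lags):
    """One sample per point of the pattern: the n_lags preceding points
    (wrapping around the end of the month) are the inputs, the point is the target."""
    n = len(pattern)
    samples = []
    for t in range(n):
        inputs = [pattern[(t - k) % n] for k in range(n_lags, 0, -1)]
        samples.append((inputs, pattern[t]))
    return samples

def bootstrap_split(samples, seed=0):
    """Draw len(samples) indices uniformly with replacement for training;
    samples that are never drawn form the test set."""
    rng = random.Random(seed)
    n = len(samples)
    drawn = [rng.randrange(n) for _ in range(n)]
    in_bag = set(drawn)
    train = [samples[i] for i in drawn]
    test = [samples[i] for i in range(n) if i not in in_bag]
    return train, test, len(in_bag) / n
```

The third return value is the fraction of distinct samples that ended up in the training set, which can be compared against the expected value of roughly 0.632.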
Table 1. Test error (MAPE) of different models for each monthly load pattern
Figure BDA0003910500060000121
The bold entries in the table are the test errors of the best model for each monthly load pattern. For example, the class 1 monthly load pattern is predicted best by the SVR model, with a test error of 0.087.
Scheme setup and result analysis:
To verify the effectiveness of the proposed method, the following four schemes are set up for load prediction.
Scheme 1: no pattern division is carried out; BPNN is used for direct prediction.
Scheme 2: no pattern division is carried out; SVR is used for direct prediction.
Scheme 3: no pattern division is carried out; LR is used for direct prediction.
Scheme 4: pattern division and matching are carried out, and each pattern is predicted with its best prediction method.
Load prediction was performed for the first 7 days of November; the prediction results obtained are shown in Fig. 8 and Table 2:
Table 2. User cluster prediction results of different experimental schemes
Figure BDA0003910500060000122
Comparing the above results shows that, compared with the methods using a single model, the adaptive model matching prediction method achieves the best prediction effect and improves the prediction accuracy by about 0.6%.
The pattern matching results for 7 November are shown in Table 3 below:
Table 3. Pattern matching results on 7 November
Figure BDA0003910500060000123
As can be seen from Table 3, the fewer users a pattern is matched with, the lower the matching degree, that is, the lower the similarity between the user load data and the monthly load pattern, which may degrade the prediction effect to some extent.
The above are preferred embodiments of the present invention. All changes made according to the technical scheme of the present invention whose functional effects do not go beyond the scope of that technical scheme fall within the protection scope of the present invention.

Claims (4)

1. A short-term load prediction method based on pattern matching, characterized by comprising the following steps:
S1, optimally dividing the historical load data of users into a plurality of monthly load patterns by a hierarchical K-means clustering algorithm based on the Pearson correlation coefficient;
S2, for each monthly load pattern, selecting, from a back-propagation neural network, support vector regression and linear regression, the algorithm with the best validation effect to match the pattern;
S3, adaptively matching the electricity consumption data of each user to a monthly load pattern according to the similarity between the user's load data of the most recent month and the monthly load patterns, and adding the prediction results of all patterns to obtain the total prediction result.
2. The method for short-term load prediction based on pattern matching as claimed in claim 1, wherein step S1 is specifically as follows:
a clustering similarity measurement function based on the Pearson correlation coefficient:
let x_i = (x_i1, x_i2, …, x_in) and x_j = (x_j1, x_j2, …, x_jn) denote two load curves; the Pearson correlation coefficient is as follows:
Figure FDA0003910500050000011
in the formula: $\bar{x}_i$ and $\bar{x}_j$ are the mean values of x_i and x_j, respectively;
the principle of a hierarchical K-means clustering method is as follows:
repeatedly carrying out K-means clustering with the same K value for multiple times, and recording the clustering center of each time;
performing hierarchical clustering on the clustering centers under all records;
taking a clustering center obtained by hierarchical clustering as an initial clustering center, and performing K-means clustering again to obtain a final clustering result;
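The three steps above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: curves are standardized so that squared Euclidean distance stands in for the Pearson-based measure, average linkage is assumed for the hierarchical step, and all function names and the `runs` and `seed` parameters are hypothetical.

```python
import random
from math import sqrt

def standardize(x):
    """Zero mean, unit variance; squared distance between two standardized
    curves of length n equals 2n(1 - r), r being their Pearson coefficient."""
    n = len(x)
    m = sum(x) / n
    s = sqrt(sum((v - m) ** 2 for v in x) / n)
    return [(v - m) / s for v in x]

def dist2(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def assign(data, centers):
    return [min(range(len(centers)), key=lambda c: dist2(x, centers[c])) for x in data]

def kmeans(data, centers, iters=50):
    """Plain Lloyd iterations starting from the given initial centers."""
    for _ in range(iters):
        labels = assign(data, centers)
        new = []
        for c in range(len(centers)):
            members = [x for x, lab in zip(data, labels) if lab == c]
            new.append(mean(members) if members else centers[c])
        if new == centers:
            break
        centers = new
    return centers, assign(data, centers)

def agglomerate(points, k):
    """Average-linkage agglomerative clustering down to k groups;
    returns the mean of each group."""
    groups = [[p] for p in points]
    while len(groups) > k:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                d = sum(dist2(a, b) for a in groups[i] for b in groups[j])
                d /= len(groups[i]) * len(groups[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = groups.pop(j)      # j > i, so index i stays valid
        groups[i].extend(merged)
    return [mean(g) for g in groups]

def hierarchical_kmeans(curves, k, runs=5, seed=0):
    rng = random.Random(seed)
    data = [standardize(c) for c in curves]
    recorded = []
    for _ in range(runs):                       # repeated K-means with the same K
        centers, _labels = kmeans(data, rng.sample(data, k))
        recorded.extend(centers)                # record the centers of every run
    init = agglomerate(recorded, k)             # hierarchical clustering of recorded centers
    return kmeans(data, init)                   # final K-means from those initial centers
```

The point of the intermediate hierarchical step is to reduce the dependence of K-means on random initialization: the final run starts from centers that summarize several independent runs.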
selecting a clustering index:
the evaluation indexes are as follows:
Figure FDA0003910500050000014
in the formula: k is the number of clusters, v c A clustering center of each clustering cluster; n is a radical of an alkyl radical K The number of samples contained in the current class K cluster.
3. The method for predicting short-term load based on pattern matching as claimed in claim 2, wherein the step S2 is as follows:
the BP neural network algorithm mechanism formula is as follows:
Figure FDA0003910500050000021
wherein: p is an input vector, b is a bias constant, w is a weight vector from an input signal to a neuron, f is an excitation function, and y is an output signal of the neuron; the output of the neuron is
y=f(wP+b) (3)
The BP neural network consists of an input layer, a hidden layer and an output layer, and neurons between all the layers are in full connection;
the BP neural network algorithm consists of two parts: firstly, forward propagation of signals, secondly, calculating errors of output values and true values, reversely propagating the errors, and continuously correcting parameters of each neuron through an L-M algorithm; training again after the parameters are corrected until the training result meets the error requirement or the maximum training times;
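The neuron output of formula (3) and the forward-propagation half of the algorithm can be sketched as below. This is a minimal illustration only: the tanh excitation function, the linear output layer and the function names are assumptions, and the error back-propagation and L-M parameter correction are deliberately not shown.

```python
from math import tanh

def neuron_output(w, P, b, f=tanh):
    """y = f(wP + b) for weight vector w, input vector P and bias constant b."""
    return f(sum(wi * pi for wi, pi in zip(w, P)) + b)

def forward(P, hidden, output, f=tanh):
    """Forward pass through one fully connected hidden layer.
    hidden and output are lists of (w, b) pairs, one pair per neuron;
    the output layer is taken to be linear (an assumption)."""
    h = [neuron_output(w, P, b, f) for w, b in hidden]
    return [neuron_output(w, h, b, lambda z: z) for w, b in output]
```

Every hidden neuron receives the whole input vector and every output neuron receives the whole hidden vector, which is what full connection between layers means in practice.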
the regression algorithm mechanism of the support vector machine is as follows:
the input vector is mapped to a high-dimensional feature space through a nonlinear mapping, and a regression hyperplane is computed so as to minimize the sum of the distances from all sample points in the set to the hyperplane; the nonlinear mapping is also referred to as the kernel function; let the input samples be M = {(x_i, y_i), i = 1, 2, …, n}, x_i ∈ R^d, y_i ∈ R; the hyperplane function obtained by SVR is expressed as follows:
f(x)=ωΦ(x)+b (4)
in the formula: phi (·) is a kernel function, b is a threshold, and omega is a weight vector;
the SVR penalty function is:
Figure FDA0003910500050000022
in the formula: ε is the tolerated distance between a sample point and the hyperplane, that is, the loss is 0 when the distance between the sample point and the hyperplane is less than or equal to ε;
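The behaviour described for ε matches the standard ε-insensitive loss, which is assumed in the sketch below since the formula image is not reproduced in this text. The function name is hypothetical.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the tube |y - f(x)| <= epsilon, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)
```

Points inside the tube contribute nothing to the objective, so only the samples lying on or outside the tube (the support vectors) determine the regression function.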
the linear regression algorithm mechanism is as follows:
linear regression is an analytical approach that uses a regression equation to model the relationship between one or more independent variables x and a dependent variable y; the model is as follows:
$f(x) = w_0 + w_1 x_1 + w_2 x_2 + \dots + w_n x_n$ (6)
$y = f(x) + \delta$ (7)
in the formula: w is a weight coefficient and δ is the residual; the main objective of linear regression is to find the coefficients w such that the sum of the distances from all sample points in the set M = {(x_i, y_i), i = 1, 2, …, n} to f(x) is minimized; linear regression solves the equation by the least squares method;
Figure FDA0003910500050000031
$(\omega_1, \dots, \omega_n) = (X^{T} X)^{-1} X^{T} Y$ (9)
load prediction by linear regression has a small computational cost and directly yields the relationship between the input features and the predicted quantity.
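The least squares solution can be sketched without external libraries by solving the normal equations directly. This is an illustration only: the function names are hypothetical, and the intercept w_0 is handled by prepending a column of ones to the inputs, which is an assumption about how the design matrix is built.

```python
def fit_linear_regression(X, y):
    """Least squares weights [w0, w1, ..., wn] for f(x) = w0 + w1*x1 + ... + wn*xn,
    obtained by solving the normal equations (A^T A) w = A^T y with A = [1 | X]."""
    A = [[1.0] + list(row) for row in X]
    m = len(A[0])
    # Augmented matrix [A^T A | A^T y].
    M = [[sum(r[i] * r[j] for r in A) for j in range(m)]
         + [sum(r[i] * t for r, t in zip(A, y))] for i in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(m):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][m] / M[i][i] for i in range(m)]

def predict(w, x):
    """Evaluate f(x) = w0 + w1*x1 + ... + wn*xn."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
```

Solving the linear system is numerically preferable to forming the inverse of X^T X explicitly, although both give the same weights when the matrix is well conditioned.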
4. The method of claim 3, wherein the step S3 is as follows:
inputting characteristic parameters:
to predict the load P_t at time t, characteristic parameters strongly correlated with the load at time t are input to the model; the input characteristic parameters are as follows:
$x = \{P_{t-h}, P_{t-h-1}, P_{t-h-2}, P_{t-h+1}, P_{t-h+2}, P_{t-2h}, P_{t-2h-1}, P_{t-2h+1}, P_{t-3h}, P_{t-7h}, \mathrm{weekday}, \mathrm{hour}\}$
in the formula, the first 7 are historical load data, h is the number of load data points collected in one day, and weekday and hour denote the day of the week and the hour of the day, respectively;
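Assembling this input vector from a load series can be sketched as follows. The function name and the choice to pass `weekday` and `hour` in as arguments are assumptions made for illustration; h = 48 corresponds to 30-minute data.

```python
def build_features(load, t, weekday, hour, h=48):
    """Input vector for predicting load[t]; h is the number of load samples per day.
    The lags follow the order of the feature list: t-h, t-h-1, t-h-2, t-h+1, t-h+2,
    t-2h, t-2h-1, t-2h+1, t-3h, t-7h. Requires t >= 7 * h."""
    lags = [h, h + 1, h + 2, h - 1, h - 2, 2 * h, 2 * h + 1, 2 * h - 1, 3 * h, 7 * h]
    return [load[t - lag] for lag in lags] + [weekday, hour]
```

The lags cluster around the same time of day one, two, three and seven days earlier, which captures both the daily and the weekly periodicity of the load.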
matching the power load mode based on the Pearson correlation coefficient:
a month is taken as 28 days; when predicting the load on day q+1 (q = 0, 1, …, 27) of a month, the load data of the 28 days before the predicted day are selected for pattern matching, comprising data1, the first q days of the current month, and data2, the last 28−q days of the previous month; to keep the pattern matching time-aligned, data1 is placed before data2 and the two are spliced into data3, a complete monthly load of the user from the beginning of the month to the end of the month; the Pearson correlation coefficient between data3 of each user and each monthly load pattern is calculated, and each user is classified into the monthly load pattern with the most similar form according to the lowest value of the Pearson correlation coefficient;
total flow of load prediction:
(1) Preprocessing user load data;
(2) Taking the one-month load of every user as samples, performing K-means clustering based on the Pearson coefficient on the samples multiple times, calculating the clustering index V, and obtaining the optimal number of clusters K_b;
(3) With the number of clusters set to K_b, performing hierarchical K-means clustering based on the Pearson correlation coefficient to obtain K_b classes of monthly load, and taking the cluster center of each class of monthly load as a monthly load pattern;
(4) Constructing BPNN, SVR and LR models for each monthly load pattern; selecting the model with the best test effect to match the monthly load pattern;
(5) Matching the monthly load of each user over the month before the predicted day with the monthly load pattern of most similar form;
(6) Summing the monthly loads of the users matched with the same mode, and calculating the matching degree of the monthly loads, namely a Pearson correlation coefficient;
(7) Load prediction is carried out by the prediction model with the best monthly load mode, and the prediction results of each mode are added to obtain a final load prediction result;
(8) In (7), if the best prediction model of the monthly load pattern is BPNN, and the matching degree of the obtained user load data and the monthly load pattern is less than a certain threshold value
Figure FDA0003910500050000043
the BPNN prediction model is used; otherwise, the suboptimal prediction model of the monthly load pattern is used;
(9) Evaluating the user cluster prediction effect by using MAPE and RMSE;
Figure FDA0003910500050000041
Figure FDA0003910500050000042
in the formula: e is the actual value, o is the predicted value, and N is the number of predictions.
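Steps (5) to (7) of the flow, summing the loads of users matched to the same pattern, predicting each summed load with that pattern's model and adding the results, can be sketched as below. All names are hypothetical: `models` maps a pattern index to any callable prediction model, and `features_for` stands in for whatever feature construction is used. Any rescaling between a pattern and the summed load is not addressed here.

```python
def predict_total_load(user_to_pattern, user_loads, models, features_for):
    """Sum the loads of users matched to the same pattern, predict each summed
    load with that pattern's model, and add the per-pattern predictions."""
    groups = {}
    for user, pattern in user_to_pattern.items():
        series = user_loads[user]
        if pattern not in groups:
            groups[pattern] = list(series)
        else:
            groups[pattern] = [a + b for a, b in zip(groups[pattern], series)]
    total = 0.0
    for pattern, summed in groups.items():
        total += models[pattern](features_for(summed))
    return total, groups
```

Predicting once per pattern rather than once per user keeps the number of model evaluations equal to the number of patterns, and the summed load of a group is typically smoother than any individual user's load.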
CN202211321149.0A 2022-10-26 2022-10-26 Short-term load prediction method based on pattern matching Pending CN115689001A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211321149.0A CN115689001A (en) 2022-10-26 2022-10-26 Short-term load prediction method based on pattern matching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211321149.0A CN115689001A (en) 2022-10-26 2022-10-26 Short-term load prediction method based on pattern matching

Publications (1)

Publication Number Publication Date
CN115689001A true CN115689001A (en) 2023-02-03

Family

ID=85100194

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211321149.0A Pending CN115689001A (en) 2022-10-26 2022-10-26 Short-term load prediction method based on pattern matching

Country Status (1)

Country Link
CN (1) CN115689001A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114781685A (en) * 2022-03-17 2022-07-22 广西电网有限责任公司 Big user power load prediction method and system based on big data mining technology
CN114781685B (en) * 2022-03-17 2024-01-09 广西电网有限责任公司 Large user electricity load prediction method and system based on big data mining technology
CN117744747A (en) * 2024-01-24 2024-03-22 广州豪特节能环保科技股份有限公司 Building cold source operation load prediction method by utilizing artificial neural network algorithm
CN118312736A (en) * 2024-04-07 2024-07-09 广东机电职业技术学院 Power load prediction method and device based on self-adaptive sparse attention network

Similar Documents

Publication Publication Date Title
CN108846517B (en) Integration method for predicating quantile probabilistic short-term power load
CN111353656B (en) Steel enterprise oxygen load prediction method based on production plan
CN115689001A (en) Short-term load prediction method based on pattern matching
CN106600059A (en) Intelligent power grid short-term load predication method based on improved RBF neural network
CN109146121A (en) The power predicating method stopped in the case of limited production based on PSO-BP model
CN111260211A (en) Intelligent energy system evaluation method and device based on AHP-improved entropy weight method-TOPSIS
CN112329990A (en) User power load prediction method based on LSTM-BP neural network
CN111460001B (en) Power distribution network theoretical line loss rate evaluation method and system
CN110837915B (en) Low-voltage load point prediction and probability prediction method for power system based on hybrid integrated deep learning
CN110717610A (en) Wind power prediction method based on data mining
CN112529683A (en) Method and system for evaluating credit risk of customer based on CS-PNN
CN112163689A (en) Short-term load quantile probability prediction method based on depth Attention-LSTM
CN111832839B (en) Energy consumption prediction method based on sufficient incremental learning
Li et al. Short term prediction of photovoltaic power based on FCM and CG-DBN combination
CN115660855A (en) Stock closing price prediction method fusing news data
CN115879602A (en) Ultra-short-term photovoltaic output prediction method based on transient weather
CN112288157A (en) Wind power plant power prediction method based on fuzzy clustering and deep reinforcement learning
CN115358437A (en) Power supply load prediction method based on convolutional neural network
CN117151770A (en) Attention mechanism-based LSTM carbon price prediction method and system
CN113762591B (en) Short-term electric quantity prediction method and system based on GRU and multi-core SVM countermeasure learning
CN109214610A (en) A kind of saturation Methods of electric load forecasting based on shot and long term Memory Neural Networks
CN117034762A (en) Composite model lithium battery life prediction method based on multi-algorithm weighted sum
CN116632834A (en) Short-term power load prediction method based on SSA-BiGRU-Attention
Wang et al. Short term load forecasting: A dynamic neural network based genetic algorithm optimization
CN112785022B (en) Method and system for excavating electric energy substitution potential

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination