CN114358185A

CN114358185A - Multi-dimensional short-term power load prediction method based on improved K-means clustering and CCA-BiLSTM

Info

Publication number: CN114358185A
Application number: CN202210003822.XA
Authority: CN
Inventors: 李鑫; 李�昊; 杨桢; 李洪珠; 左辉; 马煜翔; 徐彤
Original assignee: Liaoning Technical University
Current assignee: Liaoning Technical University
Priority date: 2022-01-04
Filing date: 2022-01-04
Publication date: 2022-04-15

Abstract

The invention discloses a multi-dimensional short-term power load prediction method based on improved K-means clustering and CCA-BiLSTM, and belongs to the technical field of power load prediction. The method first preprocesses the historical load and multi-dimensional data, removing abnormal values and filling in missing values month by month. It then provisionally sets k daily load characteristic labels, clusters the historical load data with a K-means algorithm improved by Pearson correlation coefficient (PCCs) distance, analyzes the Davies-Bouldin (DBi) index, and determines the number W of daily load labels and the characteristics of each label by combining the analysis result with engineering experience. Next, a vector set is constructed from the preprocessed historical multi-dimensional data, canonical correlation analysis (CCA) is used to evaluate the contribution degree of each vector to the historical load data, and 10 characteristic variables are selected to reconstruct a characteristic data set. Finally, the BiLSTM network is trained with the historical load data, the load labels and the reconstructed data set, and is used to predict future short-term power load data. The method reduces the time redundancy of the prediction process, reduces the dimension of the required external variables, and effectively improves the accuracy and generality of the load prediction result.

Description

Multi-dimensional short-term power load prediction method based on improved K-means clustering and CCA-BiLSTM

Technical Field

The invention relates to the technical field of power load prediction, and in particular to a multi-dimensional short-term power load prediction method based on improved K-means clustering and CCA-BiLSTM.

Background

Short-term power load prediction is an important part of guaranteeing the stable operation of a power system. Accurate prediction helps improve the supply-side structure, helps the power grid track changes in user load demand in time, guides the power supplier toward more efficient and safer power scheduling strategies, and provides an important reference for smart grid construction and for energy conservation and emission reduction.

According to the basic prediction approach adopted, current power load prediction methods fall into 3 types: those based on a traditional mathematical model, those based on a single intelligent algorithm, and those based on a combined structure. Methods based on a traditional mathematical model mainly rely on pure mathematical derivation and analyze the load trend through the characteristics of the data; representative methods include Kalman filtering, exponential smoothing, wavelet analysis and linear regression analysis. These methods were widely used early on because of their small computational cost and accurate prediction of simple linear loads, but their prediction accuracy and adaptability have dropped sharply as nonlinear loads have increased. Methods based on a single intelligent algorithm are built mainly on artificial intelligence algorithms such as shallow neural networks and support vector machines. Compared with traditional mathematical models they handle nonlinear data better and can analyze multi-dimensional information to improve prediction accuracy, but insufficient model depth and weak generalization ability tend to make the constructed network unstable and the results non-convergent. Combined-structure methods generally combine several algorithms with different strengths, either directly or by weighting, so as to improve overall performance and meet the practical demands of short-term power load prediction.

Generally, a combined-structure method is more accurate than a single model. In practice, however, the short-term power load is affected by multi-dimensional parameters, which in turn affects the combined-structure prediction: data mining remains insufficient and the accuracy of the final result suffers. It is therefore necessary to find a multi-dimensional short-term power load prediction method with high accuracy.

Disclosure of Invention

In view of the defects of the prior art, the invention provides a multi-dimensional short-term power load prediction method based on improved K-means clustering and CCA-BiLSTM.

The technical scheme adopted by the invention is a multi-dimensional short-term power load prediction method based on improved K-means clustering and CCA-BiLSTM. The general flow is shown in FIG. 1, and the method comprises the following 9 steps.

Step 1: Preprocess the historical daily load data and the historical multi-dimensional information. According to their characteristics, the data are divided into three types: ordinal type data, daily average type data and nominal type data. Because of recording equipment, recording means and similar causes, all three types of raw data may contain recording errors or gaps, so the existing historical data are preprocessed and corrected before data analysis. The flow is shown in FIG. 2, and the specific steps are as follows.

Step 1.1: Distinguish the data types. Raw data expressed as continuous functions of time, such as historical power load, historical temperature and historical wind power, are ordinal type data. Raw data obtained by daily averaging of ordinal type data, such as daily average temperature and daily average wind power, are daily average type data. Discrete time or characteristic data, such as season, month, week, holiday and working day, are nominal type data.

Step 1.2: Preliminarily divide the ordinal type historical data and fill in placeholders. If the ordinal type historical data cover N days, they are divided by day into vectors x₁, x₂, …, x_N, each daily vector containing n time-granularity elements. If an element x_ij of some daily vector x_i is missing at a certain time granularity, that day is marked as a missing day and the element is filled with 0; the missing data are processed further in later steps.

Step 1.3: Detect abnormal values in the ordinal type historical data. The element at the j-th time granularity of the i-th day is denoted x_ij (i = 1 to N, j = 1 to n). The ordinal type historical data x₁ to x_N are grouped by month, and for each of the n time granularities the mean μ_j and standard deviation σ_j of the elements at the same time of day are calculated as follows:

where Ne is the number of days in the month. Using the Pauta criterion (3σ rule), each element x_ij of the Ne daily vectors x_i in each month is checked against the condition

x_ij∈[μ_j-3σ_j,μ_j+3σ_j] (2)

In theory, for normally distributed data, the probability that an element value falls in this interval is 0.9973. If an element x_ij does not fall in the interval, it is judged to be an abnormal value; its value is set to 0 so that it is treated as a missing value, and the corresponding daily vector is treated as a missing day.
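The monthly 3σ check of step 1.3 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `flag_outliers` is hypothetical, and the population form of the standard deviation (division by Ne) is an assumption, since the formula for σ_j is not reproduced in this text.

```python
import math

def flag_outliers(month_days):
    """month_days: list of Ne daily vectors (each a list of n readings) for one month.

    Returns a copy in which readings outside [mu_j - 3*sigma_j, mu_j + 3*sigma_j]
    are set to 0.0, plus the set of day indices marked as missing days.
    """
    ne = len(month_days)
    n = len(month_days[0])
    cleaned = [list(day) for day in month_days]
    missing_days = set()
    for j in range(n):
        column = [day[j] for day in month_days]
        mu = sum(column) / ne
        # Population standard deviation (1/Ne); this normalization is an assumption.
        sigma = math.sqrt(sum((v - mu) ** 2 for v in column) / ne)
        for i, v in enumerate(column):
            if not (mu - 3 * sigma <= v <= mu + 3 * sigma):
                cleaned[i][j] = 0.0      # treat the abnormal value as missing
                missing_days.add(i)      # and mark the whole day as a missing day
    return cleaned, missing_days
```

Note that with only about 30 samples per time granularity, a single extreme reading inflates σ_j noticeably, so the check only catches gross recording errors.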

Step 1.4: To preserve the continuity of the data as a whole, the missing values in each kind of historical data are filled in after the abnormal values have been removed. For ordinal type variables, each missing element x_ij of a missing day is first filled by modified Akima cubic Hermite interpolation; the filled element x_ij satisfies:

where x_(i-A)j is the value of the nearest non-missing element at the same time granularity before the missing day, A days earlier; x_(i+B)j is the value of the nearest non-missing element at the same time granularity after the missing day, B days later; x'_(i-A)j is the derivative of the raw time series at x_(i-A)j; and x'_(i+B)j is the derivative of the raw time series at x_(i+B)j.

The missing element x_ij of the missing day is then filled a second time by linear interpolation; the filled element x_ij satisfies:

The mean of the two interpolation results is taken as the final estimate of x_ij:

Missing daily average type data are filled by recalculating the average manually, and missing nominal type data are filled with the mode of the same month.
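The two-estimate fill of step 1.4 can be illustrated with a short sketch. It assumes the standard cubic Hermite basis on the interval between the two nearest non-missing days and takes the derivative values as given inputs; the modified Akima method derives those derivative values from neighbouring slopes by its own rule, which is not reproduced here. All function names are hypothetical.

```python
def hermite_fill(p0, p1, m0, m1, a, b):
    """Cubic Hermite estimate at a point `a` days after p0 and `b` days before p1.

    m0 and m1 are the derivative values (per day) at the two known points.
    """
    h = a + b
    t = a / h
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    return h00 * p0 + h10 * h * m0 + h01 * p1 + h11 * h * m1

def linear_fill(p0, p1, a, b):
    """Linear estimate between the same two known points."""
    return p0 + (p1 - p0) * a / (a + b)

def fill_missing(p0, p1, m0, m1, a, b):
    """Final estimate: mean of the Hermite and the linear interpolation results."""
    return 0.5 * (hermite_fill(p0, p1, m0, m1, a, b) + linear_fill(p0, p1, a, b))
```

When the supplied derivatives equal the slope of the straight line through the two known points, both estimates coincide; they differ only where the series curves.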

Step 2: If the historical daily load contains N days of load data, the number of daily load labels is set to k, where k takes integer values from 1 to M and M is set manually from experience.

Step 3: Cluster the N days of historical daily load data M times with the K-means algorithm improved by PCCs, each time into k classes according to the set number of daily load labels. The specific steps are as follows.

Step 3.1: Divide the N days of historical daily load data into N sample sets L₁, L₂, …, L_N, where L_i is the vector formed by the historical daily load on day i.

Step 3.2: Generate k daily cluster center vectors according to the number of daily load labels, denoted c₁, c₂, …, c_k.

Step 3.3: For each sample set L_i (i = 1 to N), calculate the PCCs distances d_i1 to d_ik to the k cluster centers:

where d_ij is the PCCs distance between sample L_i and the j-th cluster center c_j; ρ_ij is the correlation coefficient between the sample vector L_i and the cluster center vector c_j, with range [-1,1]; cov(L_i,c_j) is the covariance between the two vectors; σ_Li and σ_cj are the standard deviations of the vectors L_i and c_j, respectively; n is the number of time granularities in L_i; L_iz is the z-th element of L_i and c_jz is the z-th element of c_j, with z = 1 to n; and L̄_i and c̄_j are the means of the vectors L_i and c_j, respectively.

Step 3.4: Compare the k PCCs distances of sample set L_i to obtain its shortest PCCs distance d_i, namely:

d_i = min_j d_ij (7)

The cluster center c_j that attains this minimum is the one nearest to sample L_i, so the daily load data L_i are assigned to the j-th load characteristic label. Proceeding in the same way, the N days of daily load data are each classified under one of the k load characteristic labels.
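The assignment part of steps 3.3 and 3.4 can be sketched as below. The sketch reads PCCs as the Pearson correlation coefficient and assumes the distance d = 1 − ρ; the exact distance formula is not reproduced in this text, so that form is an assumption, and the function names are hypothetical. Center initialization and the center update loop of K-means are omitted.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(u)
    mu_u, mu_v = sum(u) / n, sum(v) / n
    cov = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu_u) ** 2 for a in u))
    sv = math.sqrt(sum((b - mu_v) ** 2 for b in v))
    return cov / (su * sv)  # undefined (division by zero) for a constant vector

def pcc_distance(u, v):
    # Assumed form: d = 1 - rho, so perfectly correlated curves have distance 0.
    return 1.0 - pearson(u, v)

def assign_labels(samples, centers):
    """Return, for each daily load vector, the 0-based index of the nearest center."""
    labels = []
    for s in samples:
        dists = [pcc_distance(s, c) for c in centers]
        labels.append(dists.index(min(dists)))
    return labels
```

A correlation-based distance groups days by the shape of the load curve rather than by its absolute level, which is the usual motivation for replacing the Euclidean metric in this setting.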

Step 4: Evaluate the clustering effect of the classification results obtained with the different numbers k of daily load characteristic labels using the Davies-Bouldin (DBi) index, which is calculated as follows:

where k is the number of clusters; S_i is the mean distance from the elements of class i to its cluster center; R_ij is the distance between the centers of class i and class j, measured as Euclidean distance; |C_i| is the number of elements in class i; and L is the set of historical load elements. The DBi results for the different numbers of daily load labels are then analyzed.

A smaller DBi index indicates that the objects of each class lie closer to their class center and that the classes are better separated, that is, the clustering effect is better. In some cases, however, the DBi index cannot be judged on its magnitude alone; an appropriate DBi value should be chosen as the standard, in combination with historical experience and actual conditions, to judge whether the clustering result meets the expected requirement.

Through the DBi cluster analysis, the number of labels of the historical daily load data is finally determined to be W.
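For orientation, the textbook Davies-Bouldin index can be computed as in the sketch below. It uses the standard definition (mean over classes of the worst ratio of summed scatters to center distance) with Euclidean distances throughout, which is an assumption about the formula not reproduced in this text; the function names are hypothetical. The index is only defined for at least 2 clusters.

```python
import math

def euclid(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def centroid(points):
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def davies_bouldin(clusters):
    """clusters: list of k >= 2 clusters, each a non-empty list of equal-length vectors."""
    k = len(clusters)
    centers = [centroid(c) for c in clusters]
    # Scatter of each class: mean distance of its elements to the class center.
    scatter = [sum(euclid(p, ctr) for p in c) / len(c)
               for c, ctr in zip(clusters, centers)]
    total = 0.0
    for i in range(k):
        total += max((scatter[i] + scatter[j]) / euclid(centers[i], centers[j])
                     for j in range(k) if j != i)
    return total / k
```

Running this for each candidate k and comparing the values alongside domain experience mirrors the selection procedure described above.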

Step 5: To facilitate correlation analysis of the multi-dimensional data, construct a matrix [X₁, X₂, …] from the preprocessed multi-dimensional historical data, in which each column vector represents one set of external influence variables.

Step 6: Perform CCA on each external influence variable column vector X of the constructed multi-dimensional data matrix against the preprocessed historical load data L, finally obtaining the contribution degree R of every external influence column vector to the historical load data.

Step 6.1: Normalize all column vectors X and the historical load data L as follows:

X_c = (x - X_min)/(X_max - X_min) (8)

where X_c is the vector obtained by normalizing the vector X, x is an element of X, X_min is the minimum element of X, and X_max is the maximum element of X.

Step 6.2: Let X' and L' be the one-dimensional vectors obtained by projecting the normalized column vector X_c and the normalized historical load data L_c, respectively:

X' = α^T X_c, L' = β^T L_c (9)

where α and β are the linear coefficient vectors of the two projections; only their direction matters, not their magnitude.

To make the solution well defined, α and β are subject to the following constraint:

where S_XX and S_LL are the variances of X and L, respectively.

Step 6.3: Define the Lagrangian function:

where S_XL is the covariance between the vectors X and L, and λ and θ are the Lagrange multipliers; the maximum of the function is sought.

Taking the partial derivatives of J(α, β) with respect to α and β and setting each to 0 gives:

Step 6.4: Left-multiplying the two equations of formula (12) by α^T and β^T, respectively, and using the constraints of formula (10) gives:

λ = θ = α^T S_XL β (13)

Further rearrangement gives:

Let

H = S_XX^-1 S_XL S_LL^-1 S_LX (15)

Singular value decomposition of H yields the maximum singular value and the corresponding left and right singular vectors u and v.
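For reference, the standard CCA derivation that is consistent with formulas (9), (13) and (15) can be written as follows. This is a sketch under assumptions: the factor 1/2 on the multipliers is a conventional choice, and the labels (10), (11), (12) and (14) are matched to the numbering of this document by inference rather than taken from it.

```latex
\begin{aligned}
&\text{Constraints (10):} && \alpha^{T} S_{XX}\,\alpha = 1,\qquad \beta^{T} S_{LL}\,\beta = 1 \\
&\text{Lagrangian (11):} && J(\alpha,\beta) = \alpha^{T} S_{XL}\,\beta
   - \tfrac{\lambda}{2}\left(\alpha^{T} S_{XX}\,\alpha - 1\right)
   - \tfrac{\theta}{2}\left(\beta^{T} S_{LL}\,\beta - 1\right) \\
&\text{Stationarity (12):} && S_{XL}\,\beta - \lambda S_{XX}\,\alpha = 0,\qquad
   S_{LX}\,\alpha - \theta S_{LL}\,\beta = 0 \\
&\text{Eliminating } \beta \text{ (14):} && S_{XX}^{-1} S_{XL} S_{LL}^{-1} S_{LX}\,\alpha = \lambda^{2}\,\alpha
\end{aligned}
```

Left-multiplying the first stationarity equation by α^T and the second by β^T reproduces λ = θ = α^T S_XL β, and the last line shows why the dominant direction of H carries the largest canonical correlation.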

Step 6.5: Calculate the linear coefficient vectors α and β:

X' and L' are then obtained from formula (9).

Step 6.6: Calculate the contribution degree R of the influence factor vector X' to the historical load data vector L':

Step 7: Sort the contribution degrees R of all influence factors X to the historical load data from high to low, select 10 influence variables with relatively high contribution degrees while keeping the selection comprehensive, and reconstruct the characteristic data set O.
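The ranking in step 7 can be sketched as follows. Because the formula for the contribution degree R is not reproduced in this text, the sketch uses the absolute Pearson correlation between each variable and the load as a stand-in score; for a single variable against a single load series, the canonical correlation reduces to exactly this quantity. The function names and the `features` dictionary layout are hypothetical.

```python
import math

def abs_pearson(u, v):
    """Absolute Pearson correlation of two equal-length, non-constant vectors."""
    n = len(u)
    mu_u, mu_v = sum(u) / n, sum(v) / n
    cov = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu_u) ** 2 for a in u))
    sv = math.sqrt(sum((b - mu_v) ** 2 for b in v))
    return abs(cov / (su * sv))

def rank_by_correlation(features, load, top_n=10):
    """features: dict mapping a variable name to its column (list of floats).

    Returns the top_n names sorted by score (high to low) and the full score table.
    """
    scores = {name: abs_pearson(col, load) for name, col in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_n], scores
```

The ordering is only a starting point: the method also asks that the final 10 variables be chosen comprehensively, so the selection need not be the strict top 10.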

Step 8: Feed the data set formed by the reconstructed characteristic data set O, the determined label number W of the historical daily load data, and the historical load data L into the BiLSTM network for training. The BiLSTM algorithm comprises one forward group and one reverse group of LSTM subunit networks; the input data are processed by the LSTM networks in both directions at the same time, and the two results are superposed to give the final output. The BiLSTM training process is shown in FIG. 3, where (a) is the internal process of each LSTM subunit and (b) is the process of the whole BiLSTM network. The training proceeds as follows.

Step 8.1: The input data x_t enter the forget gate, which combines them with the output h_t-1 of the LSTM subunit at the previous time step to screen and retain the result of the previous memory cell and adjust the state parameters of the LSTM unit. The result f_t output by the forget gate is:

f_t = σ(W_f x_t + U_f h_t-1 + b_f) (18)

where σ is the activation function, usually the sigmoid function, and W_f, U_f and b_f are network training parameters.

Step 8.2: The input data x_t enter the input gate, which combines them with the output h_t-1 of the LSTM subunit at the previous time step to update the state C_t of the LSTM unit; the activation function used in this stage is tanh. The result i_t output by the input gate is:

where σ is the activation function, usually the sigmoid function; tanh is an activation function; and W_i, U_i, b_i, W_c, U_c and b_c are network training parameters.

Step 8.3: The input gate result i_t controls the update of the cell state and acts together with the forget gate result f_t; the updated state C_t of the LSTM subunit is:

Step 8.4: The input data x_t are fed into the output gate, producing the output gate result o_t:

o_t = σ(W_o x_t + U_o h_t-1 + b_o) (21)

where σ is the activation function, usually the sigmoid function, and W_o, U_o and b_o are network training parameters; b_o is usually 1.

Step 8.5: The result o_t is controlled by the updated state C_t of the LSTM unit, and the final result h_t output by the LSTM unit for the current data is:

Step 8.6: The results h_t and h'_t obtained by the forward and reverse LSTM subunit layers of the BiLSTM are superposed, and the final output y_t of the BiLSTM network is:

where a_t is the output weight of the forward LSTM subunit layer, b_t is the output weight of the reverse LSTM subunit layer, and c_t is the overall bias optimization parameter at the current time.
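Steps 8.1 to 8.6 can be illustrated with a scalar forward pass. The sketch follows formulas (18) and (21) for the forget and output gates and fills the remaining updates with the standard LSTM equations, which is an assumption because those formulas are not reproduced in this text. The output combination y_t = a·h_t + b·h'_t + c uses constant weights for simplicity, whereas the description allows a_t, b_t and c_t to vary with time. Function names and the parameter dictionary layout are hypothetical, and training is not shown.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One scalar LSTM step; p maps names such as 'W_f', 'U_f', 'b_f' to floats."""
    f_t = sigmoid(p["W_f"] * x_t + p["U_f"] * h_prev + p["b_f"])      # forget gate, (18)
    i_t = sigmoid(p["W_i"] * x_t + p["U_i"] * h_prev + p["b_i"])      # input gate
    c_hat = math.tanh(p["W_c"] * x_t + p["U_c"] * h_prev + p["b_c"])  # candidate state
    c_t = f_t * c_prev + i_t * c_hat                                  # updated cell state
    o_t = sigmoid(p["W_o"] * x_t + p["U_o"] * h_prev + p["b_o"])      # output gate, (21)
    h_t = o_t * math.tanh(c_t)                                        # unit output
    return h_t, c_t

def bilstm_outputs(xs, p_fwd, p_bwd, a=0.5, b=0.5, c=0.0):
    """Run a forward and a reverse LSTM over xs and superpose their outputs."""
    fwd, h, cell = [], 0.0, 0.0
    for x in xs:
        h, cell = lstm_step(x, h, cell, p_fwd)
        fwd.append(h)
    bwd, h, cell = [], 0.0, 0.0
    for x in reversed(xs):
        h, cell = lstm_step(x, h, cell, p_bwd)
        bwd.append(h)
    bwd.reverse()  # align reverse-pass outputs with the original time order
    return [a * hf + b * hb + c for hf, hb in zip(fwd, bwd)]
```

In practice a framework layer with vector-valued states would replace this scalar version; the sketch only makes the data flow through the gates explicit.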

Step 9: Feed the data label information of the time to be predicted, together with the 10 items of external factor information retained after dimensionality reduction, into the trained BiLSTM network as input variables to predict the future power load data.

The beneficial effects of the above technical scheme are as follows. The invention provides a multi-dimensional short-term power load prediction method based on improved K-means clustering and CCA-BiLSTM. The clustering metric of the conventional K-means algorithm is replaced by the PCCs (Pearson correlation coefficients) distance for load cluster analysis, the DBi index is used for cluster evaluation, and the daily short-term power load data are labeled to expose their time-varying characteristics, which effectively reduces the time redundancy of the prediction. The CCA algorithm is used to evaluate the contribution degree of each multi-dimensional external influence factor, and the high-contribution factors are selected to reconstruct the feature set, reducing the dimension of the data set and improving computational efficiency. The BiLSTM network is trained bidirectionally on the dimension-reduced data set, and its memory units further mine the load information of past times, improving the accuracy of the final load prediction result.

Drawings

FIG. 1 is a flow chart of the multi-dimensional short-term power load prediction method based on improved K-means clustering and CCA-BiLSTM.

FIG. 2 is a flow chart of the preprocessing of the present invention.

FIG. 3 is a structure diagram of the BiLSTM network of the present invention, wherein (a) is the LSTM algorithm structure diagram and (b) is the BiLSTM algorithm structure diagram.

FIG. 4 is a diagram of the DBi index cluster analysis result according to an embodiment of the present invention.

FIG. 5 shows the 20 top-ranked variables by CCA contribution degree according to an embodiment of the present invention.

FIG. 6 is a comparison graph of the predicted results and the actual results according to an embodiment of the present invention.

Detailed Description

The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.

This embodiment uses the power load of a city in Jiangsu Province, China, from November to August of the following year, together with multi-dimensional information data related to the power load. From each month, 2 days are randomly selected as the test set (20 days in total), and the remaining days form the training set. All training set data are processed according to the first 8 steps of the overall flow shown in FIG. 1 and used to train the model. The test set data are preprocessed according to step 1 of FIG. 1 and, after the training set has completed the first 8 steps, enter step 9 to verify the method.
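The per-month hold-out can be sketched as below. The month lengths are taken from the calendar for November through August with a 28-day February, which matches the 304 days of the embodiment; the function name `split_days`, the 0-based day indexing and the fixed random seed are illustrative assumptions.

```python
import random

# (month, number of days) from November to August of the following year.
MONTH_LENGTHS = [("Nov", 30), ("Dec", 31), ("Jan", 31), ("Feb", 28), ("Mar", 31),
                 ("Apr", 30), ("May", 31), ("Jun", 30), ("Jul", 31), ("Aug", 31)]

def split_days(month_lengths, per_month=2, seed=0):
    """Hold out `per_month` randomly chosen days of every month as the test set.

    Returns (train_days, test_days) as lists of 0-based day indices.
    """
    rng = random.Random(seed)
    train, test = [], []
    start = 0
    for _, length in month_lengths:
        days = list(range(start, start + length))
        held_out = set(rng.sample(days, per_month))
        test.extend(sorted(held_out))
        train.extend(d for d in days if d not in held_out)
        start += length
    return train, test
```

With these month lengths the split yields 20 test days and 284 training days.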

Step 1: Preprocess the historical daily load data and the historical multi-dimensional information. According to their characteristics, the data are divided into three types: ordinal type data, daily average type data and nominal type data. Because of recording equipment, recording means and similar causes, all three types of raw data may contain recording errors or gaps, so the existing historical data are preprocessed and corrected before data analysis. The flow is shown in FIG. 2, and the specific steps are as follows:

Step 1.1: Distinguish the data types. Ordinal type data are mostly expressed as continuous functions of time, daily average type data are obtained by daily averaging of ordinal type data, and discrete time or characteristic data are nominal type data. The correspondence between the raw data of this embodiment and the ordinal, daily average and nominal types is shown in table 1.

TABLE 1

Type	Raw data
Ordinal type	Historical power load, historical temperature, historical wind power
Daily average type	Daily average temperature, daily average wind power, daily average load
Nominal type	Season, working day, weather, wind direction, month, week, holiday

Step 1.2: The ordinal type historical data of the city in Jiangsu Province cover 304 days and are divided by day into vectors x₁, x₂, …, x₃₀₄, each daily vector containing 96 time-granularity elements. If an element x_ij of some daily vector x_i is missing at a certain time granularity, that day is marked as a missing day and the element is filled with 0, to be processed further in later steps.

Step 1.3: Detect abnormal values in the ordinal type historical data. The element at the j-th time granularity of the i-th day is denoted x_ij (i = 1 to 304, j = 1 to 96). The ordinal type historical data x₁ to x₃₀₄ are grouped by month, and for each of the 96 time granularities the mean μ_j and standard deviation σ_j of the elements at the same time of day are calculated as follows:

where Ne is the number of days in each month. The historical data cover 10 months: Ne is 31 for January, March, May, July, August and December, 30 for April, June and November, and 28 for February.

Using the Pauta criterion (3σ rule), each element x_ij of the Ne daily vectors x_i in each month is checked against the condition:

x_ij∈[μ_j-3σ_j,μ_j+3σ_j] (2)

Step 2: The historical daily load training set contains 284 days of load data. The number of daily load labels is set to k, where k takes integer values from 1 to M; in this embodiment M is set manually to 8 on the basis of a large body of existing literature and historical experience.

Step 3: Cluster the 284 days of historical daily load data 8 times with the K-means algorithm improved by PCCs, each time into k classes according to the set number of daily load labels. The specific steps are as follows:

Step 3.1: Divide the 284 days of historical daily load data into 284 sample sets L₁, L₂, …, L₂₈₄, where each sample set is the vector formed by the historical daily load of the day given by its subscript.

Step 3.2: Generate k daily cluster center vectors according to the number of daily load labels, denoted c₁, c₂, …, c_k.

Step 3.3: For each sample set L_i (i = 1 to 284), calculate the PCCs distances d_i1 to d_ik to the k cluster centers:

where d_ij is the PCCs distance between sample L_i and the j-th cluster center c_j; ρ_ij is the correlation coefficient between the sample vector L_i and the cluster center vector c_j, with range [-1,1]; cov(L_i,c_j) is the covariance between the two vectors; σ_Li and σ_cj are the standard deviations of the vectors L_i and c_j, respectively; L_iz is the z-th element of L_i and c_jz is the z-th element of c_j, with z = 1 to 96; and L̄_i and c̄_j are the means of the vectors L_i and c_j, respectively.

d_i = min_j d_ij (7)

The cluster center c_j that attains this minimum is the one nearest to sample L_i, so the daily load data L_i are assigned to the j-th load characteristic label. Proceeding in the same way, the 284 days of daily load data are each classified under one of the 1 to 8 load characteristic labels.

In the DBi formula, k is the number of clusters; S_i is the mean distance from the elements of class i to its cluster center; R_ij is the distance between the centers of class i and class j, measured as Euclidean distance; |C_i| is the number of elements in class i; and L is the set of historical load elements. The DBi results for the different numbers of daily load labels are then analyzed.

The DBi cluster analysis results for 1 to 8 labels are shown in FIG. 4.

As the DBi cluster analysis result in FIG. 4 shows, the DBi index value increases continuously as the number of daily load labels increases, and changes relatively smoothly when the number of labels is 3 to 4. In combination with existing research, the number W of labels of the historical daily load data is finally determined to be 4; the load characteristics corresponding to labels 1 to 4 are shown in table 2.

TABLE 2

Label number	Daily load characteristic	Description
1	High-temperature day	Late July to early August
2	Spring and autumn working day	Non-weekend days in March to May and September to November
3	Summer and winter working day	Non-weekend, non-high-temperature days in December to February and June to August
4	Non-working day	Weekends and legal holidays

Step 5: To facilitate correlation analysis of the multi-dimensional data, construct the matrix [Th₁, Th₂, …, Th₂₃, Td₁, Td₂, …, Td₇, Wh₁, Wh₂, …, Wh₂₃, Wd₁, Wd₂, …, Wd₇, Dd₁, Dd₂, …, Dd₇, Hd₁, Hd₂, …, Hd₇, Lh₁, Lh₂, …, Lh₂₃, Ld₁, Ld₂, …, Ld₇] from the preprocessed multi-dimensional historical data, in which each column vector represents one set of external influence variables. Here Th_i is the historical air temperature i hours earlier, Td_i is the historical average air temperature i days earlier, Wh_i is the historical wind power i hours earlier, Wd_i is the historical average wind power i days earlier, Dd_i is the historical wind direction i days earlier, Hd_i is the historical weather i days earlier, Lh_i is the historical load i hours earlier, and Ld_i is the historical average load i days earlier.
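Building the lagged columns of this matrix can be sketched as follows. The helper `lag_features` is hypothetical, and the sketch assumes a series sampled once per lag unit (hourly values for the hour-lag columns, daily values for the day-lag columns); with the 96-point daily granularity of the embodiment, one hour would correspond to 4 samples.

```python
def lag_features(series, lags, prefix):
    """Return {f"{prefix}{k}": column}, where column[t] = series[t - k].

    Rows with t < max(lags) are dropped so that all columns have equal length.
    """
    start = max(lags)
    return {f"{prefix}{k}": [series[t - k] for t in range(start, len(series))]
            for k in lags}
```

For example, `lag_features(hourly_load, range(1, 24), "Lh")` would produce the 23 columns Lh1 to Lh23, and the same call with daily averages and `range(1, 8)` would produce Ld1 to Ld7.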

X' = α^T X_c, L' = β^T L_c (9)

To make the solution well defined, α and β are subject to the following constraint:

where S_XX and S_LL are the variances of X and L, respectively.

Step 6.3: Define the Lagrangian function:

λ = θ = α^T S_XL β (13)

Further rearrangement gives:

Let

H = S_XX^-1 S_XL S_LL^-1 S_LX (15)

Step 6.5: Calculate the linear coefficient vectors α and β:

X' and L' are then obtained from formula (9).

The 20 top-ranked variables are shown in FIG. 5. To keep the information comprehensive while taking the contribution degree R of the variables into account, the following 10 variables are finally selected to construct the characteristic data set O: Lh₁ to Lh₄, Lh₂₂ to Lh₂₃, Th₁, Ld₁, Td₁ and Wd₁.

Step 8: Feed the data set formed by the reconstructed characteristic data set O, the determined label number W of the historical daily load data, and the historical load data L into the BiLSTM network for training. The BiLSTM algorithm comprises one forward group and one reverse group of LSTM subunit networks; the input data are processed by the LSTM networks in both directions at the same time, and the two results are further processed to give the final output. The BiLSTM training process is shown in FIG. 3, where (a) is the internal process of each LSTM subunit and (b) is the process of the whole BiLSTM network. The training proceeds as follows.

f_t = σ(W_f x_t + U_f h_t-1 + b_f) (18)

where σ is the activation function, usually the sigmoid function, and W_f, U_f and b_f are network training parameters.

Step 8.2: The input data x_t enter the input gate, which combines them with the output h_t-1 of the LSTM subunit at the previous time step to update the state C_t of the LSTM unit; the activation function used in this stage is tanh. The result i_t output by the input gate is:

o_t = σ(W_o x_t + U_o h_t-1 + b_o) (21)

Step 9: Feed the data label information of the 20 test set days in the historical data, together with the corresponding 10 items of external factor information after dimensionality reduction, into the trained BiLSTM network as input variables to predict the power load data of each of those days.

The prediction for June is shown as an example; a comparison of the predicted and actual results is given in FIG. 6.

The prediction results for the 10 months obtained by the method are evaluated using the mean absolute percentage error (MAPE) and the root mean square error (RMSE) as evaluation indices; the results are shown in table 3.

TABLE 3

Month	MAPE	RMSE	Month	MAPE	RMSE
November	1.24%	0.044	April	1.34%	0.051
December	1.32%	0.053	May	1.23%	0.045
January	1.12%	0.045	June	1.05%	0.041
February	1.62%	0.058	July	1.47%	0.054
March	1.51%	0.049	August	1.39%	0.042

As can be seen from FIG. 6 and table 3, with the multi-dimensional short-term power load prediction method based on improved K-means clustering and CCA-BiLSTM, the MAPE of the load prediction for every month is below 1.7% and the RMSE is below 0.06.
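The two evaluation indices follow their usual definitions, sketched below with hypothetical function names. Whether the RMSE values reported here were computed on normalized or raw load values is not stated in this text, so the sketch makes no assumption about the scale of its inputs.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error, in the units of the inputs."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))
```

MAPE is scale-free, which makes it comparable across months, while RMSE penalizes large individual errors more heavily.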

On the same data, the method is compared with several existing power load prediction algorithms: single BP, single BiLSTM, PSO-BiLSTM and SSA-BiLSTM. The comparison results are shown in table 4.

TABLE 4

Method	MAPE	RMSE
BP	89.25%	2.587
BiLSTM	48.24%	1.258
PSO-BiLSTM	16.21%	0.477
SSA-BiLSTM	9.27%	0.352
Improved K-means clustering CCA-BiLSTM	1.05%	0.041

As the comparison in Table 4 shows, the improved K-means clustering CCA-BiLSTM short-term power load prediction method provided by the invention is markedly more accurate than the existing power load prediction methods compared: its prediction accuracy is clearly superior to that of the single-structure BP and BiLSTM methods, and also superior to that of the PSO-BiLSTM and SSA-BiLSTM load prediction methods, in which part of the parameters are optimized.

Although the present invention has been described with reference to the preferred embodiments, it should be understood that various changes and modifications can be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A multi-dimensional short-term power load prediction method based on improved K-means clustering CCA-BiLSTM, characterized by comprising the following 9 steps:

Step 1: Preprocess the historical daily load data and the historical multi-dimensional information. According to their characteristics, the data can be divided into three types: ordinal data, daily-average data and nominal data. Owing to the recording equipment, recording means and the like, the raw data of all three types may contain recording errors or gaps, so they are preprocessed and corrected before data analysis.

Step 3: Using the K-means algorithm improved on the basis of PCCs, cluster the historical daily load data of N days M times, each time into K classes according to the set number of daily load labels.

Step 4: Use the Davies-Bouldin (DBi) index to evaluate the clustering effect of the historical daily load data classification results obtained with different numbers k of daily load characteristic labels. The smaller the DBi index, the more tightly each object is grouped around its class center and the greater the separation between classes, that is, the better the clustering effect. In some cases, however, the DBi index cannot be judged by its magnitude alone; an appropriate DBi value is selected as the standard, in combination with historical experience and actual conditions, to judge whether the clustering result meets the expected requirement. The label number W of the historical daily load data is finally determined through the DBi cluster analysis.

Step 6: Perform CCA analysis between each external influence variable column vector X in the constructed multidimensional data matrix and the preprocessed historical load data L, finally obtaining the contribution degree R of every external influence column vector to the historical load data.

Step 8: The reconstructed feature data set O, the label number W determined for the historical daily load data, and the historical load data L are assembled into a data set and fed into a BiLSTM network for training. The BiLSTM algorithm consists of a forward LSTM subunit network and a reverse LSTM subunit network; the input data are processed by the LSTM networks in both directions simultaneously, and the two results are further processed and then output.

2. The multi-dimensional short-term power load prediction method based on improved K-means clustering CCA-BiLSTM as claimed in claim 1, wherein step 1 proceeds as follows:

Step 1.2: Preliminarily divide and pad the ordinal historical data. If the ordinal historical data cover N days, divide them by day into vectors x_1, x_2, …, x_N, each daily vector containing n time-granularity elements. If the element x_ij of some daily vector x_i at a certain time granularity is missing, that day is identified as a missing day, and the missing data are filled with 0 for subsequent processing.

x_ij∈[μ_j-3σ_j,μ_j+3σ_j] (2)

Theoretically, the probability that an element value satisfying the condition falls in this interval reaches 0.9973. If the element x_ij does not fall in the interval, the point is identified as an abnormal value; its value is set to 0 and treated as a missing value, and the daily vector is likewise treated as a missing day.
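The missing-value padding and the 3σ test of formula (2) can be sketched as follows. The sketch assumes that μ_j and σ_j are the mean and standard deviation of time slot j taken across all days and computed over the readings that are actually present, and that missing readings arrive as NaN; neither detail is confirmed by the text above. The function name `mark_missing_and_outliers` is hypothetical.

```python
import numpy as np

def mark_missing_and_outliers(days):
    """days: array of shape (N, n), one row per day; NaN marks a missing reading.

    Returns the array with missing and abnormal elements set to 0, plus a
    boolean mask of the days identified as missing days.
    """
    x = np.asarray(days, dtype=float).copy()
    missing = np.isnan(x)
    mu = np.nanmean(x, axis=0)       # per-slot mean across days (assumed definition of mu_j)
    sigma = np.nanstd(x, axis=0)     # per-slot standard deviation (assumed definition of sigma_j)
    with np.errstate(invalid="ignore"):
        outside = np.abs(x - mu) > 3.0 * sigma   # outside [mu_j - 3 sigma_j, mu_j + 3 sigma_j]
    outlier = outside & ~missing
    bad = missing | outlier
    x[bad] = 0.0                     # abnormal values are treated as missing and set to 0
    missing_day = bad.any(axis=1)    # a day with any such element is a missing day
    return x, missing_day
```

With only a handful of days, a single spike cannot exceed three standard deviations of its own sample, so the test is meaningful only when N is reasonably large.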

3. The multi-dimensional short-term power load prediction method based on improved K-means clustering CCA-BiLSTM as claimed in claim 1, wherein step 3 proceeds as follows:

are the mean values of the vector L_i and the vector c_j, respectively.

d_i＝min d_ij (7)
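The assignment rule of formula (7) can be illustrated with a Pearson-correlation-based distance. The exact definition of d_ij is not reproduced in this text, so the sketch assumes d_ij = 1 − r_ij, where r_ij is the Pearson correlation coefficient between the daily load vector L_i and the cluster center c_j; this form and the function names are assumptions for illustration only.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient between two 1-D vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

def assign_to_centers(loads, centers):
    """Assign each daily load vector to the center with the smallest distance.

    d_ij = 1 - r_ij is an assumed distance; d_i = min_j d_ij follows formula (7).
    """
    d = np.array([[1.0 - pearson(L_i, c_j) for c_j in centers] for L_i in loads])
    labels = d.argmin(axis=1)
    d_min = d.min(axis=1)
    return labels, d_min
```

A correlation-based distance groups days by the shape of their load curve rather than by absolute level, which is the usual motivation for replacing Euclidean distance in this setting.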

4. The multi-dimensional short-term power load prediction method based on improved K-means clustering CCA-BiLSTM as claimed in claim 1, wherein step 4 proceeds as follows:

the DBi index calculation method is as follows:
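For orientation, the Davies-Bouldin index in its standard textbook form can be sketched as below. The sketch assumes Euclidean distance for both the within-cluster scatter and the separation between cluster centers; the method itself may use a different distance, so this is an illustration of the index rather than a reproduction of the claimed formula. The function name `davies_bouldin` is hypothetical.

```python
import numpy as np

def davies_bouldin(data, labels):
    """Standard Davies-Bouldin index (Euclidean distance assumed); lower is better."""
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    ids = np.unique(labels)
    centers = np.array([data[labels == c].mean(axis=0) for c in ids])
    # S_i: mean distance of the members of cluster i to its own center
    scatter = np.array([
        np.linalg.norm(data[labels == c] - centers[i], axis=1).mean()
        for i, c in enumerate(ids)
    ])
    k = len(ids)
    worst = []
    for i in range(k):
        # (S_i + S_j) / M_ij against every other cluster j; keep the worst case
        ratios = [
            (scatter[i] + scatter[j]) / np.linalg.norm(centers[i] - centers[j])
            for j in range(k) if j != i
        ]
        worst.append(max(ratios))
    return float(np.mean(worst))
```

Evaluating this value for each candidate number of labels k gives the curve from which a suitable k is then chosen together with engineering experience.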

5. The multi-dimensional short-term power load prediction method based on improved K-means clustering CCA-BiLSTM as claimed in claim 1, wherein step 6 proceeds as follows:

X' = α^T X_c, L' = β^T L_c (9)

To ensure a well-defined solution, a general constraint is imposed on α and β:

where S_XX and S_LL are the variances of X and L, respectively.

Step 6.3: defining a Lagrangian function:

λ = θ = α^T S_XL β (13)

Further rearranging gives:

Let

H = S_XX^{-1} S_XL S_LL^{-1} S_LX (15)

Step 6.5: calculating linear coefficient vectors α and β:

X' and L' are then obtained from formula (9).
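The CCA computation built around the matrix H of formula (15) can be sketched numerically as follows. The sketch relies on the standard result that the largest eigenvalue of H equals the squared first canonical correlation and that the corresponding eigenvector gives α; it scales α and β to unit variance of the projections, which is the usual constraint and is assumed here because the constraint formula is not reproduced in this text. The function name and the use of sample covariances with an (n − 1) divisor are also assumptions.

```python
import numpy as np

def first_canonical_correlation(X, L):
    """First canonical correlation between X (n, p) and L (n, q), plus alpha and beta."""
    X = np.asarray(X, dtype=float)
    L = np.asarray(L, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    if L.ndim == 1:
        L = L[:, None]
    n = X.shape[0]
    Xc = X - X.mean(axis=0)                  # centered data X_c
    Lc = L - L.mean(axis=0)                  # centered data L_c
    S_XX = Xc.T @ Xc / (n - 1)
    S_LL = Lc.T @ Lc / (n - 1)
    S_XL = Xc.T @ Lc / (n - 1)
    S_LX = S_XL.T
    # H = S_XX^{-1} S_XL S_LL^{-1} S_LX, formula (15)
    H = np.linalg.solve(S_XX, S_XL) @ np.linalg.solve(S_LL, S_LX)
    eigvals, eigvecs = np.linalg.eig(H)
    idx = int(np.argmax(eigvals.real))
    rho = float(np.sqrt(max(eigvals.real[idx], 0.0)))   # largest eigenvalue is rho squared
    alpha = eigvecs[:, idx].real
    alpha = alpha / np.sqrt(alpha @ S_XX @ alpha)       # assumed scaling: alpha^T S_XX alpha = 1
    beta = np.linalg.solve(S_LL, S_LX @ alpha)          # beta is proportional to S_LL^{-1} S_LX alpha
    beta = beta / np.sqrt(beta @ S_LL @ beta)           # assumed scaling: beta^T S_LL beta = 1
    return rho, alpha, beta
```

Running this for each external variable against the load data yields one correlation per variable, which can then be ranked. The sketch does not handle singular covariance matrices or a zero correlation, where the scaling of β would divide by zero.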

6. The multi-dimensional short-term power load prediction method based on improved K-means clustering CCA-BiLSTM as claimed in claim 1, wherein step 8 proceeds as follows:

f_t = σ(W_f x_t + U_f h_{t-1} + b_f) (18)

Step 8.2: The input data x_t enter through the input gate and are combined with the output h_{t-1} of the LSTM subunit at the previous time step to update the state C_t of the LSTM cell. The activation function used in this stage is tanh, and the final output i_t of the input gate is:

where σ is an activation function, for which the sigmoid function is usually used, tanh is also an activation function, and W_i, U_i, b_i, W_c, U_c and b_c are network training parameters.

f_o = σ(W_o x_t + U_o h_{t-1} + b_o) (21)