CN111080000A

CN111080000A - Ultra-short term bus load prediction method based on PSR-DBN

Info

Publication number: CN111080000A
Application number: CN201911239921.2A
Authority: CN
Inventors: 梅飞; 陆继翔; 陆进军; 石天; 周程; 潘益; 郑建勇
Original assignee: Southeast University; Hohai University HHU; NARI Group Corp
Current assignee: Southeast University; Hohai University HHU; NARI Group Corp
Priority date: 2019-12-06
Filing date: 2019-12-06
Publication date: 2020-04-28

Abstract

The invention discloses a PSR-DBN-based ultra-short-term bus load prediction method comprising the following steps: (1) collecting historical bus load data and applying range normalization to the load time series; (2) performing phase space reconstruction on the load time series, with the optimal embedding dimension and the optimal delay obtained by the C-C method; (3) constructing a deep belief network, training it with the reconstructed phase space matrix of the load time series as the training set, and optimizing its hyper-parameters by cross validation; (4) predicting the load value at a future time with the trained deep belief network; and (5) inverse-normalizing the predicted load value returned by the deep belief network, using the maximum and minimum values of the load time series, to obtain the actual predicted load. The method maintains relatively high prediction accuracy even when the penetration of distributed generation is high and the bus load fluctuates strongly.

Description

Ultra-short term bus load prediction method based on PSR-DBN

Technical Field

The invention relates to a power system operation scheduling technology, in particular to a PSR-DBN-based ultra-short-term bus load prediction method.

Background

At present, electric energy cannot be stored in large quantities, and large-scale energy storage is costly and has a long investment recovery period. To guarantee safe operation of the power system and power quality on the user side, generation and consumption in the power system must therefore be balanced in real time. Load prediction is an important means of maintaining this dynamic balance, and it is also significant for power system planning and maintenance scheduling.

Power system load prediction is generally divided into long-term, medium-term, short-term, and ultra-short-term prediction. Short-term and ultra-short-term load prediction are important for economic dispatch, optimal power flow, electricity market trading, and the like. Higher load prediction accuracy improves the utilization of generation equipment and the effectiveness of economic dispatch, and reduces the operating cost of the power grid.

To date, experts and scholars at home and abroad have carried out systematic and fruitful research on traditional short-term and ultra-short-term load prediction. Prediction methods fall mainly into two categories. The first uses conventional regression models and time series methods, such as linear extrapolation, curve extrapolation, and the autoregressive integrated moving average (ARIMA) model. The second uses intelligent prediction models, such as support vector regression (SVR), artificial neural networks (ANN), deep belief networks (DBN), and long short-term memory (LSTM) networks. These methods achieve small prediction errors and good robustness in day-ahead load prediction and similar tasks; however, most studies address system-level load prediction, and relatively few address bus load prediction.

As power grid optimal dispatch becomes more refined and intelligent, and as advanced grid applications that account for both the security and the economy of the power system are widely adopted, the required accuracy of bus load prediction keeps rising. Because the base load of a bus is much smaller than that of the system, the uncertainty and multidimensional nonlinearity of the bus load are more pronounced, and the traditional approach of allocating the system-level prediction to each bus by its load ratio cannot achieve satisfactory results.

To address this, Jinshanhong et al. use an ARIMA model to fit historical bus data and an SVR model to correct the nonlinear residuals. However, because the load of each bus follows its own pattern of variation, a time series model with fixed parameters adapts poorly and is applicable only to individual scenarios. Panapakidis et al. propose a bus load prediction model based on clustering and an ANN, perform day-ahead and hour-level load prediction, and report good prediction accuracy in their validation.

The uncertainty and nonlinearity of the bus load are further increased by the large-scale connection of distributed energy resources. For example, the data acquisition interval of a distributed photovoltaic power station is generally only 5 min. Hour-level short-term and day-ahead bus load prediction cannot make full use of this historical information, so their prediction accuracy is low, and some complex models cannot meet real-time requirements. Finer-grained ultra-short-term prediction is therefore needed for real-time security analysis and reliable economic dispatch of the power system.

Wangyfei et al. propose a chaos-radial basis function (RBF) photovoltaic power prediction model and verify its performance under different weather conditions, but the authors verify the prediction accuracy only for single-step prediction and do not address the prediction range of the model. Fan et al. propose a phase space reconstruction (PSR) algorithm combined with a bi-square kernel (BSK) regression model and obtain high prediction accuracy on different data sets; however, after phase space reconstruction, separate BSK models perform independent regression on the data of each dimension, which ignores the time series characteristics of the load.

Therefore, an ultra-short-term bus load prediction method that is based on bus data and uses an advanced deep learning algorithm is urgently needed.

Disclosure of Invention

Purpose of the invention: to solve the above problems, the invention provides a PSR-DBN-based ultra-short-term bus load prediction method that maintains high prediction accuracy even when the penetration of distributed generation is high and the bus load fluctuates strongly.

Technical scheme: to achieve this purpose, the invention adopts a PSR-DBN-based ultra-short-term bus load prediction method comprising the following steps:

(1) collecting historical bus load data, and performing range normalization on the load time series $x = \{x_1, x_2, \dots, x_N\}$;

(2) performing phase space reconstruction on the load time series, and obtaining the optimal embedding dimension $m_{opt}$ and the optimal delay $t_{opt}$ of the load time series by the C-C method;

(3) Constructing a deep belief network, training the deep belief network by adopting a reconstructed load time sequence phase space matrix as a training set, and optimizing hyper-parameters of the deep belief network by adopting cross validation;

(4) predicting the load value at the future moment by using the trained deep belief network;

(5) performing inverse normalization on the predicted load value returned by the deep belief network in step (4), using the maximum and minimum values of the load time series from step (1), to obtain the actual predicted load value.

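Further, the load time series phase space matrix reconstructed in step (2) is:

$$X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_M \end{bmatrix} = \begin{bmatrix} x_1 & x_{1+t} & \cdots & x_{1+(m-1)t} \\ x_2 & x_{2+t} & \cdots & x_{2+(m-1)t} \\ \vdots & \vdots & & \vdots \\ x_M & x_{M+t} & \cdots & x_{M+(m-1)t} \end{bmatrix} \tag{1}$$

where $m$ is the embedding dimension, $t$ is the delay, and $M = N-(m-1)t$.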

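Further, obtaining the optimal embedding dimension $m_{opt}$ and the optimal delay $t_{opt}$ of the load time series by the C-C method in step (2) specifically comprises:

defining the correlation integral:

$$C(m,N,r,t) = \frac{2}{M(M-1)} \sum_{1 \le i < j \le M} \theta\left(r - \lVert X_i - X_j \rVert_\infty\right) \tag{2}$$

where $X_i$ is the $i$-th row of the phase space matrix in equation (1), $r$ is the neighborhood radius, and $\theta(\cdot)$ is the Heaviside step function. When $N > 3000$, $m \in \{2,3,4,5\}$ and $r_k = k \times 0.5\sigma$, where $\sigma$ is the standard deviation of the time series and $k \in \{1,2,3,4\}$.

Defining the test statistics $S$ and $\Delta S$ with a block-averaging strategy:

$$S(m,N,r,t) = \frac{1}{t}\sum_{s=1}^{t}\left[C_s\left(m,\tfrac{N}{t},r,t\right) - C_s^{m}\left(1,\tfrac{N}{t},r,t\right)\right], \qquad \Delta S(m,t) = \max_{k} S(m,N,r_k,t) - \min_{k} S(m,N,r_k,t) \tag{3}$$

where $C_s$ is the correlation integral of the $s$-th of the $t$ disjoint subseries.

Calculating the averages of $S$ and $\Delta S$:

$$\bar{S}(t) = \frac{1}{16}\sum_{m=2}^{5}\sum_{k=1}^{4} S(m,N,r_k,t), \qquad \Delta\bar{S}(t) = \frac{1}{4}\sum_{m=2}^{5} \Delta S(m,t) \tag{4}$$

where the $t$ value at the first zero of $\bar{S}(t)$ or at the first local minimum of $\Delta\bar{S}(t)$ is rounded to give the optimal delay $t_{opt}$.

Defining the test statistic:

$$S_{cor}(t) = \Delta\bar{S}(t) + \left|\bar{S}(t)\right| \tag{5}$$

The global minimum of $S_{cor}(t)$ gives the optimal embedding window $t_\omega$. Because

$$t_\omega = (m_{opt}-1)\,t_{opt} \tag{6}$$

the optimal delay $t_{opt}$ determined from equation (4) and the optimal embedding window $t_\omega$ determined from equation (5) are substituted into equation (6), and the result is rounded to obtain the optimal embedding dimension $m_{opt}$.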

Further, the step 3 of constructing the deep belief network comprises the steps of:

(1) constructing a DBN input layer, and inputting a row of elements of a load time sequence phase space matrix each time;

(2) constructing an output layer of the DBN, and outputting a predicted value of the load at the next moment;

(3) constructing the DBN hidden layers, and optimizing the number of hidden layers and the number of neurons in each hidden layer by cross validation.

Further, optimizing the number of hidden layers and the number of neurons in each hidden layer of the deep belief network by cross validation in step (3) comprises the steps of:

(1) adopting the hold-out method of cross validation: the last part of the training set is removed to serve as the validation set, and the remaining data are kept as the training set;

(2) determining the optimal number of hidden layers by enumeration: the number of neurons in each layer is fixed at 2m, hidden layers are added one at a time until obvious overfitting occurs, and the number of hidden layers with the minimum prediction error is selected;

(3) coarsely searching for a good number of neurons with a fixed step: after the number of hidden layers is determined, the combination of neuron counts that minimizes the prediction error is searched in each layer with m as the step and a search range of m to 5m.

Advantageous effects: the invention provides a PSR-DBN-based ultra-short-term bus load prediction method that addresses the adaptability of bus load prediction models to different prediction ranges, the time series characteristics of the load, and the volatility caused by the large-scale connection of distributed energy resources. Phase space reconstruction maps the time series to the trajectory of a moving point in phase space, and the strong nonlinear fitting capability of the DBN is used to fit this trajectory, thereby realizing load prediction.

Compared with the prior art, the prediction method maintains relatively high prediction accuracy even when the penetration of distributed generation is high and the bus load fluctuates strongly. In addition, the proposed load prediction model keeps a small prediction error over different prediction ranges (5 minutes to 1 hour).

Drawings

FIG. 1 is a flow chart of the PSR-DBN-based ultra-short term bus load prediction method of the present invention;

FIG. 2 is a view of a RBM structure;

FIG. 3 is a diagram of a conventional DBN structure;

FIG. 4 is a diagram of a DBN structure of the present invention;

FIG. 5 shows the $\Delta\bar{S}(t)$ and $S_{cor}(t)$ curves;

FIG. 6 shows the one-hour-ahead load prediction curves for days 15 to 18;

FIG. 7 is a graph showing the variation of the prediction performance with the prediction range.

Detailed Description

The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.

As shown in FIG. 1, the PSR-DBN-based ultra-short-term bus load prediction method of the present invention includes the steps of:

(1) performing range normalization on the load time series $x = \{x_1, x_2, \dots, x_N\}$ formed from the historical bus load data;

Range normalization of the load time series facilitates training of the deep belief network. The maximum and minimum values of the data are stored so that the predicted load can later be inverse-normalized to recover the actual value.
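The normalization in step (1) and its inverse in step (5) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the target interval [0, 1], the function names `range_normalize` and `inverse_normalize`, and the sample values are all assumptions made for the example.

```python
def range_normalize(series):
    """Scale a load series to [0, 1] (assumed target range) and keep (min, max)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series is constant; range normalization is undefined")
    scaled = [(x - lo) / (hi - lo) for x in series]
    return scaled, (lo, hi)


def inverse_normalize(scaled, bounds):
    """Map normalized values back to physical load values using the stored bounds."""
    lo, hi = bounds
    return [y * (hi - lo) + lo for y in scaled]


# Made-up load samples in MW, for illustration only.
load_mw = [120.0, 135.5, 128.0, 150.0, 142.5]
scaled, bounds = range_normalize(load_mw)
restored = inverse_normalize(scaled, bounds)
```

Storing `bounds` alongside the trained model is what allows a prediction made in the normalized domain to be converted back to an actual load value.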

(2) performing phase space reconstruction on the load time series, and processing the load time series by the C-C method to obtain its optimal embedding dimension $m_{opt}$ and optimal delay $t_{opt}$;

Phase space reconstruction is an effective method for analyzing nonlinear time series. Its basic idea is to regard the time series as one component generated by a deterministic nonlinear dynamic system, reconstruct an equivalent high-dimensional phase space of that system from the evolution of the component, and map the time series to the trajectory of a moving point that exhibits a certain regularity in the high-dimensional phase space.

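The reconstructed load time series phase space matrix is:

$$X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_M \end{bmatrix} = \begin{bmatrix} x_1 & x_{1+t} & \cdots & x_{1+(m-1)t} \\ x_2 & x_{2+t} & \cdots & x_{2+(m-1)t} \\ \vdots & \vdots & & \vdots \\ x_M & x_{M+t} & \cdots & x_{M+(m-1)t} \end{bmatrix} \tag{1}$$

where $m$ is the embedding dimension, $t$ is the delay, and $M = N-(m-1)t$.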
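As an illustration of how the phase space matrix is assembled from a series, the sketch below builds the M rows of delayed samples with M = N - (m-1)t. The function name `phase_space_matrix` and the toy series are assumptions for the example; indices are 0-based in the code and 1-based in the text.

```python
def phase_space_matrix(x, m, t):
    """Return the M x m delay-embedding matrix, with M = N - (m - 1) * t rows."""
    n_rows = len(x) - (m - 1) * t
    if n_rows < 1:
        raise ValueError("series too short for the requested m and t")
    # Row i holds x[i], x[i + t], ..., x[i + (m - 1) * t].
    return [[x[i + j * t] for j in range(m)] for i in range(n_rows)]


# Toy series x_1..x_10 = 1..10, embedded with m = 3 and t = 2.
series = list(range(1, 11))
X = phase_space_matrix(series, m=3, t=2)
for row in X:
    print(row)
```

With N = 10, m = 3 and t = 2 the matrix has M = 6 rows, the first being [1, 3, 5] and the last [6, 8, 10].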

The key to phase space reconstruction is determining the optimal embedding dimension $m_{opt}$ and the optimal delay $t_{opt}$. The invention uses the C-C method to obtain $m_{opt}$ and $t_{opt}$ simultaneously.

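On the basis of equation (1), the correlation integral is defined:

$$C(m,N,r,t) = \frac{2}{M(M-1)} \sum_{1 \le i < j \le M} \theta\left(r - \lVert X_i - X_j \rVert_\infty\right) \tag{2}$$

where $X_i$ is the $i$-th row of the phase space matrix, $r$ is the neighborhood radius, and $\theta(\cdot)$ is the Heaviside step function. According to the BDS statistical conclusion, when $N > 3000$ the ranges of $m$ and $r_k$ are $m \in \{2,3,4,5\}$ and $r_k = k \times 0.5\sigma$, where $\sigma$ is the standard deviation of the time series and $k \in \{1,2,3,4\}$.

The test statistics $S$ and $\Delta S$ are defined with a block-averaging strategy:

$$S(m,N,r,t) = \frac{1}{t}\sum_{s=1}^{t}\left[C_s\left(m,\tfrac{N}{t},r,t\right) - C_s^{m}\left(1,\tfrac{N}{t},r,t\right)\right], \qquad \Delta S(m,t) = \max_{k} S(m,N,r_k,t) - \min_{k} S(m,N,r_k,t) \tag{3}$$

where $C_s$ is the correlation integral of the $s$-th of the $t$ disjoint subseries.

The averages of $S$ and $\Delta S$ are calculated:

$$\bar{S}(t) = \frac{1}{16}\sum_{m=2}^{5}\sum_{k=1}^{4} S(m,N,r_k,t), \qquad \Delta\bar{S}(t) = \frac{1}{4}\sum_{m=2}^{5} \Delta S(m,t) \tag{4}$$

For each time delay $t$, the corresponding values of $\bar{S}(t)$ and $\Delta\bar{S}(t)$ can be found. The $t$ value at the first zero of $\bar{S}(t)$ or at the first local minimum of $\Delta\bar{S}(t)$ is rounded to obtain the optimal delay $t_{opt}$.

The following test statistic is defined:

$$S_{cor}(t) = \Delta\bar{S}(t) + \left|\bar{S}(t)\right| \tag{5}$$

The global minimum of $S_{cor}(t)$ gives the optimal embedding window $t_\omega$. Because

$$t_\omega = (m_{opt}-1)\,t_{opt} \tag{6}$$

the optimal delay $t_{opt}$ determined from equation (4) and the optimal embedding window $t_\omega$ determined from equation (5) are substituted into equation (6), and the result is rounded to obtain the optimal embedding dimension $m_{opt}$.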
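The following is a rough sketch of the C-C statistics under the standard formulation of the method (correlation integral with the maximum norm, block averaging over the t disjoint subseries). It is not taken from the patent: the function names, the use of the population standard deviation, and the short sine series are assumptions, and the toy series is far shorter than the N > 3000 mentioned above.

```python
import math
import statistics


def correlation_integral(x, m, t, r):
    """Fraction of embedded point pairs whose max-norm distance is at most r."""
    n_pts = len(x) - (m - 1) * t
    if n_pts < 2:
        return 0.0
    pts = [[x[i + j * t] for j in range(m)] for i in range(n_pts)]
    close = 0
    for i in range(n_pts):
        for j in range(i + 1, n_pts):
            if max(abs(a - b) for a, b in zip(pts[i], pts[j])) <= r:
                close += 1
    return 2.0 * close / (n_pts * (n_pts - 1))


def s_statistic(x, m, t, r):
    """Block-averaged S(m, N, r, t) over the t disjoint subseries."""
    total = 0.0
    for s in range(t):
        sub = x[s::t]  # s-th subseries; unit delay inside the subseries
        total += correlation_integral(sub, m, 1, r) - correlation_integral(sub, 1, 1, r) ** m
    return total / t


def cc_curves(x, t_max):
    """Return the S-bar, Delta-S-bar and S_cor curves for t = 1..t_max."""
    sigma = statistics.pstdev(x)  # assumption: population standard deviation
    radii = [k * 0.5 * sigma for k in (1, 2, 3, 4)]
    s_bar, ds_bar, s_cor = [], [], []
    for t in range(1, t_max + 1):
        s_vals = {m: [s_statistic(x, m, t, r) for r in radii] for m in (2, 3, 4, 5)}
        s_mean = sum(sum(v) for v in s_vals.values()) / 16.0
        ds_mean = sum(max(v) - min(v) for v in s_vals.values()) / 4.0
        s_bar.append(s_mean)
        ds_bar.append(ds_mean)
        s_cor.append(ds_mean + abs(s_mean))
    return s_bar, ds_bar, s_cor


def first_local_minimum(curve):
    """1-based delay t of the first interior local minimum, or None if absent."""
    for i in range(1, len(curve) - 1):
        if curve[i] < curve[i - 1] and curve[i] <= curve[i + 1]:
            return i + 1
    return None


def embedding_dimension(t_window, t_opt):
    """Solve t_window = (m_opt - 1) * t_opt for m_opt and round."""
    return round(t_window / t_opt) + 1


toy = [math.sin(0.3 * i) for i in range(120)]  # toy data, not bus load
s_bar, ds_bar, s_cor = cc_curves(toy, t_max=4)
```

As a purely hypothetical arithmetic check, an embedding window of 72 with a delay of 18 would give `embedding_dimension(72, 18) == 5`. The pairwise distance loop is quadratic in the series length, so a vectorized implementation would be preferable for real data.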

(3) constructing a deep belief network (DBN), training it with the reconstructed phase space matrix of the load time series as the training set, and optimizing its hyper-parameters by cross validation;

The DBN is formed by stacking several restricted Boltzmann machines (RBMs). It solves for the network parameters by combining pre-training with the back propagation (BP) algorithm. As a result, it is less likely to become trapped in a local optimum and converges more accurately, and it converges faster when the number of layers and the number of neurons per layer are large, which makes it better suited to fitting multidimensional nonlinear time series.

As shown in FIG. 2, a restricted Boltzmann machine consists of a visible layer V with n neurons and a hidden layer H with m neurons. Each neuron state takes the value 0 or 1 and follows a Bernoulli distribution, i.e., $v_i \in \{0,1\}$ ($i = 1,2,\dots,n$) and $h_j \in \{0,1\}$ ($j = 1,2,\dots,m$). There are no connections between neurons within a layer, and neurons in different layers are connected by weights $\omega$.

The RBM is a probabilistic unsupervised learning model. Its network parameters consist of the visible-layer bias b, the weight matrix $\omega$, and the hidden-layer bias c, and their optimal values are determined by minimizing an energy function, defined as:

$$E(v,h \mid \theta) = -\sum_{i=1}^{n} b_i v_i - \sum_{j=1}^{m} c_j h_j - \sum_{i=1}^{n}\sum_{j=1}^{m} v_i\,\omega_{ij}\,h_j \tag{7}$$

where $\omega_{ij}$ is the connection weight between the $i$-th visible-layer neuron and the $j$-th hidden-layer neuron, and $\theta = \{b, \omega, c\}$.

Using the energy function of equation (7), the joint probability distribution of the visible-layer state and the hidden-layer state is defined as:

$$P(v,h \mid \theta) = \frac{e^{-E(v,h \mid \theta)}}{Z(\theta)} \tag{8}$$

where the normalizing factor $Z(\theta) = \sum_{v,h} e^{-E(v,h \mid \theta)}$ is the sum of the negative exponentials of the energy function over all possible values of the visible-layer state v and the hidden-layer state h.

The probability distribution of v follows from equation (8):

$$P(v \mid \theta) = \frac{1}{Z(\theta)} \sum_{h} e^{-E(v,h \mid \theta)} \tag{9}$$

The objective function of RBM training can then be expressed as the likelihood of the visible-layer state v over the training set, which follows from equation (9) and is written here in logarithmic form:

$$L(\theta) = \sum_{v \in T} \ln P(v \mid \theta) \tag{10}$$

where T is the set of sample inputs in the training set; the energy function is at its minimum when the objective function is at its maximum.

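From the network structure of FIG. 2, the activation probability of $v_i$ given the hidden-layer state h, and the activation probability of $h_j$ given the visible-layer state v, are:

$$P(v_i = 1 \mid h) = \sigma\left(b_i + \sum_{j=1}^{m} \omega_{ij} h_j\right) \tag{11}$$

$$P(h_j = 1 \mid v) = \sigma\left(c_j + \sum_{i=1}^{n} \omega_{ij} v_i\right) \tag{12}$$

where $\sigma$ denotes the sigmoid function.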

Because the gradient cannot be computed directly when stochastic gradient ascent is applied to equation (10), the gradient of the likelihood function is approximated by the contrastive divergence (CD) algorithm during RBM training. The specific training steps are as follows:

Step 1: take a sample from the training set as $v_1$, substitute it into equation (12) to calculate $P(h_1 = 1 \mid v_1)$, and obtain the value of $h_1$ by random sampling;

Step 2: substitute the $h_1$ obtained in step 1 into equation (11) to calculate $P(v_2 = 1 \mid h_1)$, and obtain the reconstructed value $v_2$ by random sampling;

Step 3: substitute the $v_2$ obtained in step 2 into equation (12) to calculate $P(h_2 = 1 \mid v_2)$;

Step 4: update the network parameters with the iterative formulas:

$$\begin{aligned} \omega_{ij}^{(k+1)} &= \omega_{ij}^{(k)} + \varepsilon\left[P(h_{1,j} = 1 \mid v_1)\,v_{1,i} - P(h_{2,j} = 1 \mid v_2)\,v_{2,i}\right] \\ b_i^{(k+1)} &= b_i^{(k)} + \varepsilon\left(v_{1,i} - v_{2,i}\right) \\ c_j^{(k+1)} &= c_j^{(k)} + \varepsilon\left[P(h_{1,j} = 1 \mid v_1) - P(h_{2,j} = 1 \mid v_2)\right] \end{aligned} \tag{13}$$

where $\varepsilon$ is the learning rate, set to 0.8 in the present invention, and the superscript k denotes the k-th iteration.
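The four training steps above can be sketched as a single CD-1 update for one training sample. This is an illustrative sketch under the standard Bernoulli RBM formulation, not the patent's code: the function names, the toy layer sizes, and the zero initialization are assumptions, and only the learning rate 0.8 comes from the text.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sample(probs, rng):
    """Draw a binary state for each unit from its activation probability."""
    return [1 if rng.random() < p else 0 for p in probs]


def hidden_probs(v, w, c):
    """P(h_j = 1 | v) for every hidden unit j."""
    return [sigmoid(c[j] + sum(w[i][j] * v[i] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, w, b):
    """P(v_i = 1 | h) for every visible unit i."""
    return [sigmoid(b[i] + sum(w[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]


def cd1_update(v1, w, b, c, eps, rng):
    """One contrastive-divergence step; updates w, b, c in place, returns v2."""
    ph1 = hidden_probs(v1, w, c)                  # step 1
    h1 = sample(ph1, rng)
    v2 = sample(visible_probs(h1, w, b), rng)     # step 2
    ph2 = hidden_probs(v2, w, c)                  # step 3
    for i in range(len(b)):                       # step 4
        for j in range(len(c)):
            w[i][j] += eps * (ph1[j] * v1[i] - ph2[j] * v2[i])
        b[i] += eps * (v1[i] - v2[i])
    for j in range(len(c)):
        c[j] += eps * (ph1[j] - ph2[j])
    return v2


# Toy RBM with 4 visible and 3 hidden units, zero-initialized (assumption).
rng = random.Random(0)
w = [[0.0] * 3 for _ in range(4)]
b = [0.0] * 4
c = [0.0] * 3
reconstruction = cd1_update([1, 0, 1, 0], w, b, c, eps=0.8, rng=rng)
```

In a stacked DBN, the hidden activations of one trained RBM would serve as the visible input of the next, so this update would be applied layer by layer during pre-training.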

A traditional DBN is formed by stacking several RBMs, with the hidden layer of each RBM serving as the visible layer of the next. During pre-training, the CD algorithm determines the network parameters layer by layer; this stage is unsupervised learning. The pre-trained network parameters are then assigned to a neural network as the initial values for training, and the parameters are fine-tuned by the BP algorithm using the sample labels in the training set; this stage is supervised learning. The structure of a traditional DBN is shown in FIG. 3.

The invention uses the Levenberg-Marquardt BP (LMBP) algorithm instead of the traditional BP algorithm to fine-tune the DBN. Compared with the traditional BP algorithm, the LMBP algorithm converges faster and more reliably, and is better suited to training neural networks with many hidden layers.

Unlike the traditional BP algorithm, which uses gradient descent, the LMBP algorithm is based on the Gauss-Newton method for least-squares problems. It takes the square of the error v as the objective function and differentiates the second-order Taylor expansion of that objective function. After approximating the gradient of the objective function (ignoring higher-order terms), the correction to the weights $\omega$ is:

$$\Delta\omega = -\left[J^{T}(\omega)\,J(\omega) + \mu I\right]^{-1} J^{T}(\omega)\,v(\omega) \tag{14}$$

where $\mu$ is a correction coefficient that prevents $J^{T}(\omega)\,J(\omega)$ from being singular; $I$ is the identity matrix; and $J(\omega)$ is the Jacobian matrix of $v(\omega)$, which can be written as:

$$J(\omega) = \begin{bmatrix} \dfrac{\partial v_1}{\partial \omega_1} & \cdots & \dfrac{\partial v_1}{\partial \omega_q} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial v_p}{\partial \omega_1} & \cdots & \dfrac{\partial v_p}{\partial \omega_q} \end{bmatrix} \tag{15}$$

where $v_1, \dots, v_p$ are the components of the error vector and $\omega_1, \dots, \omega_q$ are the weights.

Similar to the BP algorithm, the weight update in the k-th iteration is:

$$\omega^{(k+1)} = \omega^{(k)} + \Delta\omega^{(k)} \tag{16}$$

$\mu$ must be adjusted in each iteration to obtain good convergence. When $\mu$ is very small, the algorithm reduces to the standard Gauss-Newton method, which converges with high accuracy; however, if the objective function differs too much from its quadratic approximation during an iteration, convergence is poor. When $\mu$ is large, the algorithm becomes the gradient descent of the traditional BP algorithm, which helps the solution along when the Gauss-Newton method converges poorly.
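The damped update and the role of μ can be illustrated on a tiny two-parameter least-squares problem. This is a sketch, not the patent's fine-tuning code: the single tanh unit, the function names, and the rule of dividing μ by 10 after a successful step and multiplying it by 10 after a rejected one are assumptions (the text only states that μ is adjusted each iteration). The initial μ = 0.001 matches the value listed later for LMBP.

```python
import math


def residuals(w, xs, ys):
    """Error vector v(w) for the toy model y = tanh(w0 * x + w1)."""
    return [math.tanh(w[0] * x + w[1]) - y for x, y in zip(xs, ys)]


def jacobian(w, xs):
    """Rows [dv_i/dw0, dv_i/dw1] of the Jacobian of the error vector."""
    rows = []
    for x in xs:
        g = 1.0 - math.tanh(w[0] * x + w[1]) ** 2
        rows.append([g * x, g])
    return rows


def lm_step(w, xs, ys, mu):
    """Return w + dw with dw = -(J^T J + mu I)^-1 J^T v, solved for 2 parameters."""
    v = residuals(w, xs, ys)
    jac = jacobian(w, xs)
    a00 = sum(r[0] * r[0] for r in jac) + mu
    a01 = sum(r[0] * r[1] for r in jac)
    a11 = sum(r[1] * r[1] for r in jac) + mu
    g0 = sum(r[0] * e for r, e in zip(jac, v))
    g1 = sum(r[1] * e for r, e in zip(jac, v))
    det = a00 * a11 - a01 * a01
    d0 = -(a11 * g0 - a01 * g1) / det
    d1 = -(a00 * g1 - a01 * g0) / det
    return [w[0] + d0, w[1] + d1]


def sse(w, xs, ys):
    return sum(e * e for e in residuals(w, xs, ys))


def fit(xs, ys, w, mu=0.001, iters=50):
    """Levenberg-Marquardt loop with a simple accept/reject rule for mu."""
    err = sse(w, xs, ys)
    for _ in range(iters):
        cand = lm_step(w, xs, ys, mu)
        cand_err = sse(cand, xs, ys)
        if cand_err < err:        # accept: move toward Gauss-Newton
            w, err, mu = cand, cand_err, mu / 10.0
        else:                     # reject: move toward gradient descent
            mu *= 10.0
    return w, err


xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [math.tanh(1.5 * x - 0.5) for x in xs]   # synthetic targets
w_fit, err_fit = fit(xs, ys, [0.1, 0.0])
```

For a full network the 2 x 2 solve would be replaced by a general linear solver, since the matrix in equation (14) has one row and column per weight.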

The DBN has several hyper-parameters to set. Whether they are tuned well determines the accuracy of the prediction model, and poorly chosen hyper-parameters can increase the prediction error significantly. Determining the network structure is an important part of tuning the DBN hyper-parameters.

For the input layer of the DBN, because the original load data pass through PSR before entering the DBN, the number of input-layer neurons does not need to be optimized and is set directly to the embedding dimension m; that is, one row of the matrix in equation (1) is input each time.

For the output layer of the DBN, inputting one row of the matrix in equation (1) is equivalent to inputting the position vector of the moving point in phase space at a given time, and the network must then output the predicted position vector of the moving point at the next time.

In practice, if the model input is row $X_i$ ($1 \le i \le M$) of the phase space reconstruction matrix in equation (1), only the last element $x_{i+1+(m-1)t}$ of the next position vector $X_{i+1}$ is unknown, so the output layer only needs to output the predicted load at time $i+1+(m-1)t$.

If $i+1 > M$, the phase space reconstruction matrix must be augmented downward: $X_{i+1}$ is appended as a new row and then used as the model input to obtain the predicted value of $x_{i+2+(m-1)t}$; the matrix is then augmented again to obtain $x_{i+3+(m-1)t}$, and so on until the prediction is finished. The row $X_{i+1}$ in the augmented matrix is:

$$X_{i+1} = \left[x_{i+1},\ x_{i+1+t},\ \dots,\ x_{i+1+(m-1)t}\right] \tag{17}$$

where the value of $x_{i+1+(m-1)t}$ depends on the prediction range. For single-step prediction, $x_{i+1+(m-1)t}$ takes the measured load at time $i+1+(m-1)t$. For multi-step prediction with k steps, it takes the predicted value $\hat{x}_{i+1+(m-1)t}$, and the predicted values in the augmented matrix are replaced with the measured loads all at once only after the k-th step has been predicted.
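The downward augmentation for multi-step prediction can be sketched as a loop that feeds each prediction back as the newest sample. This is only an illustration: `rolling_forecast` is a hypothetical name, and `predict_next` is a stand-in for the trained DBN (here a simple extrapolation rule chosen so the toy output is easy to follow).

```python
def rolling_forecast(history, m, t, steps, predict_next):
    """Predict `steps` future values by repeatedly augmenting the phase-space rows.

    predict_next(row) -> float is a hypothetical stand-in for the trained network.
    """
    extended = list(history)
    forecasts = []
    for _ in range(steps):
        # Latest complete row: its last element is the newest available sample.
        start = len(extended) - 1 - (m - 1) * t
        row = [extended[start + j * t] for j in range(m)]
        y_hat = predict_next(row)
        forecasts.append(y_hat)
        extended.append(y_hat)  # the prediction fills the last column of the next row
    return forecasts


def toy_predictor(row, t=2):
    """Toy rule: extend the average per-sample slope of the last two row entries."""
    return row[-1] + (row[-1] - row[-2]) / t


history = [float(i) for i in range(1, 11)]   # toy series 1..10
print(rolling_forecast(history, m=3, t=2, steps=3, predict_next=toy_predictor))
```

For the toy series the three forecasts are 11.0, 12.0 and 13.0. Once measured loads for those times become available, the predicted entries appended to `extended` would be overwritten with the measurements before the next forecasting round.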

For the hidden layers of the DBN, both the number of hidden layers and the number of neurons in each hidden layer clearly affect the prediction result. Tests show that the number of hidden layers usually has the stronger effect: too few hidden layers cause under-fitting, and too many cause over-fitting, and both degrade prediction performance.

Therefore, in order to improve the prediction accuracy of the PSR-DBN model, the invention adopts cross validation to optimize the number of hidden layers and the number of neurons in each hidden layer, and the method comprises the following specific steps:

(1) cross validation method: because the load data form a time series, K-fold cross validation would disturb the order of the sequence. The hold-out method of cross validation is therefore used: the last part of the training set is removed to serve as the validation set, and the remaining data continue to serve as the training set;

(2) determining the optimal number of layers: the optimal number of hidden layers is determined by enumeration. Because the number of neurons per layer affects the prediction less than the number of layers does, the number of neurons in each layer is fixed at 2m during enumeration, hidden layers are added one at a time until obvious overfitting occurs, and the number of hidden layers with the minimum prediction error is then selected;

(3) determining the number of neurons per layer: because the prediction performance of the DBN fluctuates with the initial values, the effect of changing the number of neurons one at a time is easily masked by that fluctuation, so the method searches coarsely for a good number of neurons with a fixed step. After the number of hidden layers is determined in step (2), the combination of neuron counts that minimizes the prediction error is searched in each layer with m as the step. Because too many neurons slow down training and risk overfitting, the search range is set to m to 5m on the basis of testing, and a good combination of neuron counts is determined.
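The three steps above can be sketched as a chronological hold-out split followed by two searches. Everything here is illustrative: the function names are hypothetical, `validation_error` stands in for "train the DBN with these hidden-layer widths and return the validation error", and `toy_error` is a contrived function, not measured data. The depth search below enumerates a fixed range rather than stopping at the first sign of overfitting, which is a simplification.

```python
from itertools import product


def holdout_split(rows, targets, n_val):
    """Keep chronological order: the last n_val samples form the validation set."""
    return (rows[:-n_val], targets[:-n_val]), (rows[-n_val:], targets[-n_val:])


def choose_depth(m, max_layers, validation_error):
    """Try 1..max_layers hidden layers with 2m neurons each; return the best depth."""
    errors = {n: validation_error([2 * m] * n) for n in range(1, max_layers + 1)}
    return min(errors, key=errors.get), errors


def choose_widths(m, depth, validation_error):
    """Coarse grid search with step m over widths m, 2m, ..., 5m in every layer."""
    candidates = [k * m for k in range(1, 6)]
    best = min(product(candidates, repeat=depth),
               key=lambda widths: validation_error(list(widths)))
    return list(best)


def toy_error(widths):
    """Contrived stand-in for a validation error; NOT real results."""
    target = [25, 15, 20, 15]
    penalty = abs(len(widths) - len(target))
    return penalty + 1e-3 * sum((a - b) ** 2 for a, b in zip(widths, target))


depth, depth_errors = choose_depth(m=5, max_layers=8, validation_error=toy_error)
widths = choose_widths(m=5, depth=depth, validation_error=toy_error)
```

With m = 5 and four hidden layers the grid contains 5^4 = 625 combinations, each of which requires a full training run in practice, which is why the search step is kept coarse.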

In summary, the structure diagram of the deep belief network of the present invention is shown in fig. 4, wherein the number n of hidden layers and the number of neurons in each hidden layer are obtained by cross validation.

(4) predicting the load value at the future moment by using the trained deep belief network;

(5) performing inverse normalization on the predicted load value returned by the deep belief network in step (4), using the maximum and minimum values stored in step (1), to obtain the actual predicted load value.

The invention is further illustrated by the following examples.

To verify the effectiveness of the established model, load data of a 220 kV substation bus in a certain city from May 1 to May 18, 2017 are used. A distributed photovoltaic power station with an installed capacity of about 50 MW is connected downstream of the bus, and the sampling interval of the load data is 5 minutes. During this period the substation had no maintenance or fault outages, so the historical data are highly reliable; abnormal data are detected with the 3σ rule.

Data from days 1 to 14 are selected as the training set for building the PSR-DBN prediction model, and the model parameters are tuned by cross validation and a genetic algorithm; data from days 15 to 18 are selected as the test samples. The prediction range of the ultra-short-term load prediction used in the invention is 5 minutes to 1 hour, and the proposed model is verified in MATLAB R2018a.

To evaluate the prediction performance and accuracy of the model more intuitively and accurately, the invention uses the mean absolute percentage error (MAPE) and the root mean square error (RMSE) as evaluation indices:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{p_i - \hat{p}_i}{p_i}\right| \times 100\% \tag{18}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(p_i - \hat{p}_i\right)^2} \tag{19}$$

where $n$ is the number of prediction samples, $p_i$ is the actual load at time $i$, and $\hat{p}_i$ is the predicted load at time $i$. Smaller MAPE and RMSE values indicate higher prediction accuracy, but both indices are meaningful only in relative terms, that is, when compared on the same data.
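The two evaluation indices can be computed as in the short sketch below, which uses the usual definitions of MAPE (in percent) and RMSE. The function names and the sample values are made up for illustration.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    n = len(actual)
    return 100.0 / n * sum(abs((p - q) / p) for p, q in zip(actual, predicted))


def rmse(actual, predicted):
    """Root mean square error, in the same unit as the load."""
    n = len(actual)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(actual, predicted)) / n)


# Made-up values for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(mape(actual, predicted), rmse(actual, predicted))
```

MAPE weights each error relative to the actual load, while RMSE penalizes large absolute deviations more heavily, so the two indices can rank models differently.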

Based on the PSR theory, the invention applies the C-C method to the bus load data of days 1 to 14 for phase space reconstruction. The corresponding $\Delta\bar{S}(t)$ and $S_{cor}(t)$ curves are shown in FIG. 5. It can be seen that the first local minimum of $\Delta\bar{S}(t)$ occurs at $t = 18$, whereas $S_{cor}(t)$ has no obvious minimum point, so the optimal embedding window $t_\omega$ cannot be obtained from it.

However, according to the BDS statistical conclusion, $m \in \{2,3,4,5\}$ when $N > 3000$, so $m$ can be at most 5. Combined with equation (6), the final optimal embedding dimension is $m_{opt} = 5$ and the optimal delay is $t_{opt} = 18$.

After the optimal embedding dimension $m_{opt}$ is obtained from PSR, the DBN structure is fixed at 5 neurons in the input layer and 1 neuron in the output layer. Days 13 and 14 are removed from the training set to serve as the validation set, and the rest is kept as the training set. The number of neurons in each hidden layer is fixed at 10, and hidden layers are added one at a time, with the MAPE of the prediction on the validation set as the criterion and a prediction range of 1 hour. The MAPE values for different numbers of hidden layers are shown in Table 1:

TABLE 1

Number of layers	2	3	4	5	6	7	8
MAPE	1.0387	1.0129	0.9358	1.0443	1.0413	1.0857	1.1336

In Table 1, the MAPE increases markedly when the number of hidden layers reaches 8, presumably because of overfitting, so the number of layers is not increased further. The MAPE is smallest (0.9358) with 4 hidden layers, so the number of hidden layers is set to n = 4.

With 4 hidden layers fixed, a good number of neurons is searched coarsely with a fixed step of 5 neurons, again using the MAPE of the prediction on the validation set as the criterion and a prediction range of 1 hour. The search space contains $5^4 = 625$ combinations. The MAPE is smallest when the numbers of neurons in the hidden layers are [25,15,20,15]. The MAPE of the prediction on the validation set is then 0.8447, clearly better than the 0.9358 obtained with 4 hidden layers of 10 neurons each, so the invention uses [25,15,20,15] neurons in the hidden layers.

In summary, the hyper-parameter settings of the DBN of the present invention are shown in table 2:

TABLE 2

Parameter name	Parameter value
DBN network architecture	[5,25,15,20,15]
RBM learning rate	0.8
Maximum number of RBM training iterations	100
NN structure	[5,25,15,20,15,1]
LMBP (μ)	0.001
Maximum number of NN training iterations	150

To verify the prediction performance of the proposed method, the ARIMA model, the NN model, the PSR-NN model, and the DBN model are also trained on the historical data of days 1 to 14 and used to predict the load of days 15 to 18 with a prediction range of 1 hour, and the MAPE and RMSE of each model are calculated. For the ARIMA model, the Akaike information criterion (AIC) is used to determine the order p of the autoregressive (AR) part and the order q of the moving average (MA) part; for the NN and DBN models, the hyper-parameters are likewise tuned by cross validation. The predicted and measured load curves of the different models for days 15 to 18 are shown in FIG. 6.

The solid black line in FIG. 6 is the measured value. The active power output of the photovoltaic power station fluctuates strongly, and the bus load curve is severely distorted from the usual saddle shape and fluctuates irregularly. If short-term load prediction were still used in this case, a large error would result and a large amount of known information would be wasted. The other curves are the predicted values of the individual models. The prediction curves of the ARIMA model and the NN model clearly deviate from the black measured curve, whereas the prediction performance of the other models cannot be judged directly from the curves. To show the prediction performance of each model more clearly, the MAPE and RMSE of each model are calculated and listed in Table 3.

TABLE 3

Model	MAPE (%)	RMSE
PSR-DBN	0.9892	3.2316
PSR-NN	1.0125	3.2310
DBN	1.1322	3.4684
ARIMA	1.5929	4.8657
NN	1.6222	4.6757

In Table 3, the PSR-DBN model provided by the invention has the smallest MAPE and the second smallest RMSE of the 5 models. Its RMSE differs from that of the PSR-NN model, which has the smallest RMSE, by only 0.0006, while its MAPE is about 0.02 percentage points lower than that of the PSR-NN model. The DBN pre-trained by the CD method outperforms the ordinary NN on both measures, and after the PSR step is added to reconstruct the original data, the MAPE and RMSE of the PSR-DBN and PSR-NN models are clearly better than those of the DBN and NN models. The method provided by the invention therefore achieves high prediction accuracy in one-hour-ahead ultra-short-term prediction for a bus with high distributed energy penetration and large fluctuation.
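The ranking and the gaps between the two best models can be checked directly from the values in Table 3. The short sketch below only re-does that arithmetic; the variable names are illustrative.

```python
# Values copied from Table 3: model -> (MAPE in %, RMSE)
table3 = {
    "PSR-DBN": (0.9892, 3.2316),
    "PSR-NN": (1.0125, 3.2310),
    "DBN": (1.1322, 3.4684),
    "ARIMA": (1.5929, 4.8657),
    "NN": (1.6222, 4.6757),
}

# Rank the models by each error measure (smaller is better)
by_mape = sorted(table3, key=lambda name: table3[name][0])
by_rmse = sorted(table3, key=lambda name: table3[name][1])

# Gaps between PSR-DBN and PSR-NN
mape_gap = table3["PSR-NN"][0] - table3["PSR-DBN"][0]  # PSR-DBN is lower by this much
rmse_gap = table3["PSR-DBN"][1] - table3["PSR-NN"][1]  # PSR-DBN is higher by this much

print(by_mape[0], by_rmse[:2], round(mape_gap, 4), round(rmse_gap, 4))
```

With the tabulated values, PSR-DBN ranks first by MAPE, PSR-NN and PSR-DBN rank first and second by RMSE, the MAPE gap is 0.0233 percentage points and the RMSE gap is 0.0006.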

In order to verify the adaptability of the PSR-DBN model over different prediction ranges, the DBN model and the NN model are again used for comparison: each model is trained on the historical data of days 1-14 and then used to predict the load of days 15-18, with prediction ranges from 5 minutes to 1 hour, and the MAPE of each model is calculated. The curves of MAPE versus prediction range for the different models are shown in FIG. 7.

As can be seen in FIG. 7, MAPE generally increases as the prediction range increases. The model provided by the invention has the higher prediction accuracy over most prediction ranges, and the prediction effect of the ordinary NN is less than ideal. When the prediction range is between 5 minutes and half an hour, the PSR-DBN model and the DBN model alone differ little in prediction accuracy; when the prediction range is increased beyond half an hour, the advantage of reconstructing the data by PSR begins to show, and the PSR-DBN model has a significantly smaller MAPE than the DBN model alone. Therefore, the model provided by the invention keeps a small prediction error over prediction ranges from 5 minutes to 1 hour and adapts well to different prediction ranges.

Claims

1. A PSR-DBN-based ultra-short term bus load prediction method is characterized by comprising the following steps:

2. The PSR-DBN-based ultra-short term bus load prediction method of claim 1, wherein the load time series phase-space matrix reconstructed in step 2 is:

wherein M = N - (m - 1)t.
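The reconstruction can be sketched in a few lines. The sketch assumes the usual delay-embedding layout, in which row i of the matrix is the phase point [x(i), x(i+t), ..., x(i+(m-1)t)], so that a series of length N yields M = N - (m - 1)t rows; the function name and the sample series are hypothetical.

```python
def reconstruct_phase_space(series, m, t):
    """Return the M x m delay-embedding matrix, with M = N - (m - 1) * t rows."""
    n = len(series)
    rows = n - (m - 1) * t
    if rows <= 0:
        raise ValueError("series too short for this embedding dimension and delay")
    # Row i holds m samples spaced t steps apart, starting at index i
    return [[series[i + j * t] for j in range(m)] for i in range(rows)]

x = list(range(1, 11))  # hypothetical series with N = 10
matrix = reconstruct_phase_space(x, m=3, t=2)
print(len(matrix), matrix[0], matrix[-1])
```

For N = 10, m = 3 and t = 2 this gives M = 10 - 2 × 2 = 6 rows, the first being [1, 3, 5] and the last [6, 8, 10].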

3. The PSR-DBN-based ultra-short term bus load prediction method of claim 2, wherein solving the optimal embedding dimension m_opt and the optimal delay t_opt of the load time series in step 2 by using the C-C method specifically comprises the following steps:

defining the correlation integral:

when N > 3000, taking m ∈ {2,3,4,5} and r_k = k × 0.5σ, where σ is the standard deviation of the time series and k ∈ {1,2,3,4};

calculating the averages of S and ΔS;

taking the t value of the first zero of the average of S, or of the first minimum point of the average of ΔS, rounded, as the optimal delay t_opt;

defining the test statistic:

t_ω = (m_opt - 1) t_opt    (6)
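The steps of this claim can be sketched as below. Because the equations themselves are not reproduced in this text, the sketch assumes the commonly used C-C formulation: a correlation integral based on the infinity norm, S computed as C(m) - C(1)^m averaged over t disjoint subseries, ΔS as the spread of S over the radii r_k, and a combined statistic S_cor = mean ΔS + |mean S| whose global minimum gives t_ω. All function names, the use of the population standard deviation and the test signal are assumptions made for illustration.

```python
import math
import statistics

def correlation_integral(sub, m, r):
    """Fraction of pairs of m-dimensional points (delay 1 within `sub`)
    whose infinity-norm distance is at most r."""
    pts = [sub[i:i + m] for i in range(len(sub) - m + 1)]
    n = len(pts)
    if n < 2:
        return 0.0
    close = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if max(abs(a - b) for a, b in zip(pts[i], pts[j])) <= r
    )
    return 2.0 * close / (n * (n - 1))

def s_statistic(series, m, r, t):
    """Assumed S(m, r, t): mean over the t disjoint subseries of C(m) - C(1)**m."""
    total = 0.0
    for s in range(t):
        sub = series[s::t]
        total += correlation_integral(sub, m, r) - correlation_integral(sub, 1, r) ** m
    return total / t

def cc_curves(series, t_max, dims=(2, 3, 4, 5), ks=(1, 2, 3, 4)):
    """Return mean S, mean delta-S and S_cor; list index i corresponds to t = i + 1."""
    sigma = statistics.pstdev(series)  # population standard deviation (assumption)
    s_mean, ds_mean, s_cor = [], [], []
    for t in range(1, t_max + 1):
        s_sum, ds_sum = 0.0, 0.0
        for m in dims:
            vals = [s_statistic(series, m, k * 0.5 * sigma, t) for k in ks]
            s_sum += sum(vals)
            ds_sum += max(vals) - min(vals)
        s_mean.append(s_sum / (len(dims) * len(ks)))
        ds_mean.append(ds_sum / len(dims))
        s_cor.append(ds_mean[-1] + abs(s_mean[-1]))
    return s_mean, ds_mean, s_cor

def first_local_minimum(values):
    """1-based position of the first interior local minimum, or None."""
    for i in range(1, len(values) - 1):
        if values[i] < values[i - 1] and values[i] <= values[i + 1]:
            return i + 1
    return None

# Hypothetical noise-free test signal, not the bus load data
series = [math.sin(0.3 * i) + 0.5 * math.sin(0.11 * i) for i in range(150)]
s_mean, ds_mean, s_cor = cc_curves(series, t_max=12)
t_opt = first_local_minimum(ds_mean)
t_w = s_cor.index(min(s_cor)) + 1      # global minimum of S_cor
if t_opt:
    m_opt = round(t_w / t_opt) + 1     # rearranged from t_w = (m_opt - 1) * t_opt
    print(t_opt, t_w, m_opt)
```

The pairwise distance count is quadratic in the series length, which is acceptable for a short illustration but would need a faster neighbour search for series with N > 3000 as considered in the claim.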

4. The PSR-DBN-based ultra-short term bus load prediction method as claimed in claim 3, wherein the step 3 of constructing the deep belief network comprises the steps of:

5. The PSR-DBN-based ultra-short term bus load prediction method as claimed in claim 4, wherein the step 3 of optimizing the number of hidden layers and the number of neurons in each hidden layer of the deep belief network by adopting cross validation comprises the steps of: