CN112396234A - User side load probability prediction method based on time domain convolutional neural network - Google Patents
User side load probability prediction method based on time domain convolutional neural network
- Publication number: CN112396234A (application CN202011308724.4A)
- Authority: CN (China)
- Prior art keywords: load, data, time, neural network, quantile
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06Q10/04 — Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G06N3/045 — Combinations of networks
- G06N3/048 — Activation functions
- G06N3/084 — Backpropagation, e.g. using gradient descent
- G06Q50/06 — Energy or water supply
Abstract
The invention discloses a user-side load probability prediction method based on a time domain convolutional neural network (TCN). The method first collects a user's historical load data, together with the weather types and temperature values over the corresponding time range, and preprocesses the historical data to meet the input requirements of the TCN. A TCN is then constructed as the load probability prediction model and trained until it predicts load probability accurately. Finally, the trained model performs prediction on load data collected from the user in real time.
Description
Technical Field
The invention belongs to the technical field of power load prediction, and particularly relates to a user side load probability prediction method based on a time domain convolutional neural network.
Background
With the widespread use of advanced metering infrastructure in modern power systems, fine-grained load curves and individual power usage behaviors at the customer end can be obtained easily. However, the explosive growth of data exceeds the capabilities of conventional load forecasting systems; in particular, the vast amounts of data collected cannot be efficiently analyzed and utilized. In recent years, the development of artificial intelligence (AI) technology has made it possible to process, analyze, and utilize this abundant fine-grained electricity consumption data. Load forecasting for individual customers gives power companies the opportunity to refine grid management and thereby make better decisions in the power retail market.
To better analyze massive household electrical load data and obtain accurate load prediction results, many machine-learning-based load prediction methods are now available. They can be divided into two categories according to the development stage of machine learning. The first category comprises traditional machine learning methods such as support vector machines, autoregressive integrated moving average (ARIMA), and clustering-based approaches. The second category comprises methods based on recurrent neural networks (RNN) and convolutional neural networks (CNN), and the deep learning methods derived from them.
However, models in the first category often require complex feature engineering based on human experience, which not only adds considerable workload but also leads to poor generalization of the prediction model. The second category differs from traditional machine learning: deep learning methods have highly flexible frameworks, can learn features directly from raw data, and can improve prediction accuracy. However, existing RNN- and CNN-based short-term load prediction methods for individual users have structural defects in the model and struggle with a serious practical challenge: facing high-frequency model training and online updating on massive load data, the most basic requirement of a prediction model is to be as reliable as possible while computing efficiently.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a user side load probability prediction method based on a time domain convolutional neural network, and the probability prediction of user load data is realized based on a load probability prediction model of the time domain convolutional neural network.
In order to achieve the above object, the present invention provides a method for predicting a load probability on a user side based on a time domain convolutional neural network, which is characterized by comprising the following steps:
(1) data acquisition and preprocessing
(1.1) collecting historical load, temperature and weather data
Active power of a user is collected at a fixed period within a certain time range as historical load data X_load; at the same time, the weather types X_weather and temperature values X_temperature over the corresponding time range are collected. The weather and temperature data are then processed to the same frequency as the historical load data, forming the data samples X = (X_weather, X_temperature, X_load);
(1.2) data sample cleaning
Data cleaning is performed on the data sample X: abnormal values in the data sample are replaced with missing values, and proximity interpolation is then applied at the missing values to obtain the complete data sample X′ = (X_weather′, X_temperature′, X_load′);
(1.3), Z-score normalization
The complete data sample is standardized to obtain the standard data sample X″ = (X_weather″, X_temperature″, X_load″). The Z-score normalization formula is:

x″ = (x − μ) / σ

where x″ is the standardized value, x is the data sample value before processing, μ is the mean of the data sample, and σ is the standard deviation of the data sample;
(1.4) adding time features
The time corresponding to each standard data sample, including day-of-week and time-of-day features, is added to X″ in One-hot encoded form as a fourth feature X_date, yielding the standard data sample with time features X‴ = (X_weather″, X_temperature″, X_load″, X_date);
(1.5) construction of Standard data set
On X‴, a standard data set is constructed using a sliding window method.
(1.5.1) generating the t frame of the standard data set by a sliding window method
On X‴, a segment of data of length n preceding time t, {x‴_(t−n), x‴_(t−n+1), …, x‴_(t−1)}, is taken as a training sample of the model, and the load data x‴_(load,t) at time t is taken as the training label; the training sample and training label are paired as one frame of the standard data set;
(1.5.2) repeating step (1.5.1), sliding backwards in sequence M steps over X‴ to construct a standard data set M frames long;
(2) and constructing a load probability prediction model
A time domain convolutional neural network is constructed as the load probability prediction model. The network consists of an input layer, several residual blocks, and an output layer connected in series; the output layer consists of three fully-connected layers in parallel, one fully-connected layer per quantile;
the ith residual block has two input branches: in the left branch, the output of the (i−1)th residual block passes in turn through a convolutional layer, weight normalization, the ReLU activation function, and random neuron deactivation (dropout) before entering the ith residual block; in the right branch, the output of the (i−1)th residual block is input directly to the ith residual block after a 1 × 1 convolution. The convolutional layer consists of convolution kernels formed from several dilated causal convolutions;
(3) training load probability prediction model
(3.1) randomly selecting a frame of data from the standard data set, inputting the frame of data into a time domain convolutional neural network model, converting the frame of data into a tensor form through an input layer, and inputting the tensor form into the serially connected residual blocks;
(3.2) in the time domain convolutional neural network model, let the input tensor of the ith residual block be Z^(i−1), i = 1, 2, …, K, where K denotes the number of residual blocks. In the left branch of the ith residual block, the tensor Z^(i−1) undergoes feature extraction through convolution kernels formed from several dilated causal convolutions, followed in sequence by weight normalization, the ReLU activation function, and random neuron deactivation, yielding the left-branch output tensor Z^(i)_left. In the right branch, the tensor Z^(i−1) passes through a 1 × 1 convolution so that its output tensor Z^(i)_right matches the dimensionality of the left-branch output; the two branch outputs are then added to obtain the output of the ith residual block, Z^(i) = Z^(i)_left + Z^(i)_right;
(3.3) repeating step (3.2) until the last residual block outputs Z^(K); Z^(K) is finally passed through the three parallel fully-connected layers, whose outputs are recorded as ŷ_(t,q), ŷ_(t,1−q), and ŷ_(t,0.5) respectively;
(3.4) calculating, via the Pinball quantile formula, the loss values L_(t,q)(u_(t,q)), L_(t,1−q)(u_(t,1−q)), and L_(t,0.5)(u_(t,0.5)) corresponding to the upper quantile q, the lower quantile 1−q, and the 0.5 quantile at time t:

(3.4.1) the preset probability of the Pinball quantile is set to q;

(3.4.2) the prediction errors at time t for the upper quantile q, the lower quantile 1−q, and the 0.5 quantile are calculated as u_(t,q) = y_t − ŷ_(t,q), u_(t,1−q) = y_t − ŷ_(t,1−q), and u_(t,0.5) = y_t − ŷ_(t,0.5), where y_t is the real load value;

(3.4.3) the corresponding loss values are calculated as, for τ ∈ {q, 1−q, 0.5}:

L_(t,τ)(u_(t,τ)) = τ · u_(t,τ) if u_(t,τ) ≥ 0, and (τ − 1) · u_(t,τ) otherwise;
(3.5) the loss values are combined by weighted averaging:

L_t = ω_q · L_(t,q)(u_(t,q)) + ω_(1−q) · L_(t,1−q)(u_(t,1−q)) + ω_(0.5) · L_(t,0.5)(u_(t,0.5))

where ω_q, ω_(1−q), and ω_(0.5) denote the weights of L_(t,q)(u_(t,q)), L_(t,1−q)(u_(t,1−q)), and L_(t,0.5)(u_(t,0.5)) respectively;
(3.6) repeating steps (3.1)–(3.5) and comparing the weighted average loss obtained in each training round: when the weighted average loss has not decreased for P training rounds, a learning-rate automatic decay strategy is executed; when it has not decreased for 3 × P training rounds, an early-stopping strategy is executed, i.e., training stops automatically and the trained load probability prediction model is obtained; otherwise, go to step (3.7);
(3.7) using a batch gradient descent algorithm, the weight parameters of the time domain convolutional neural network model are updated along the negative gradient direction of the weighted average loss function, and the procedure returns to step (3.1);
(4) load probability real-time prediction
Load data of a user, together with the related weather and temperature data, are collected in real time; one frame of standard data is built according to the method of step (1) and input into the trained load probability prediction model, which outputs the load probability prediction interval [ŷ_(t,1−q), ŷ_(t,q)] and the deterministic prediction result ŷ_(t,0.5).
The object of the invention is achieved as follows:
the invention relates to a user side load probability prediction method based on a time domain convolutional neural network, which comprises the steps of firstly collecting historical load data of a user and weather types and temperature values in a corresponding time range, and then preprocessing the historical load data to meet the input requirement of the time domain convolutional neural network; then, a time domain convolution neural network is constructed to serve as a load probability prediction model, and the time domain convolution neural network can accurately predict the load probability through model training; and finally, predicting the load data acquired by the user in real time.
Meanwhile, the user-side load probability prediction method based on the time domain convolutional neural network has the following beneficial effects:

(1) a time domain convolutional neural network is used as the load probability prediction model, balancing computational efficiency and prediction reliability;

(2) the method applies a weighted loss to the outputs of the time domain convolutional neural network via the Pinball quantile function and weighted averaging, thereby extending a deterministic prediction model into a probabilistic one;

(3) during model training, the early-stopping strategy and the learning-rate automatic decay strategy reduce the long-tail effect of the training process.
Drawings
FIG. 1 is a flow chart of a user-side load probability prediction method based on a time domain convolutional neural network according to the present invention;
FIG. 2 is a schematic diagram of a residual block;
FIG. 3 is a schematic diagram of a Pinball quantile function;
FIG. 4 is a probability prediction curve for the method of the present invention along with several other methods;
Fig. 5 shows probability prediction curves of the method of the invention for different prediction interval nominal confidences (PINC).
Detailed Description
The following describes embodiments of the invention with reference to the accompanying drawings so that those skilled in the art may better understand the invention. It should be expressly noted that in the following description, detailed descriptions of known functions and designs are omitted where they might obscure the subject matter of the invention.
Examples
FIG. 1 is a flow chart of a user-side load probability prediction method based on a time domain convolutional neural network.
In this embodiment, as shown in fig. 1, the method for predicting the load probability on the user side based on the time domain convolutional neural network of the present invention includes the following steps:
s1, data acquisition and preprocessing
S1.1, collecting historical load, temperature and weather data
Active power of a user is collected at a fixed period within a certain time range as historical load data X_load; at the same time, the weather types X_weather and temperature values X_temperature over the corresponding time range are collected. In this embodiment, the sampling period is 15 minutes, i.e., 96 points per day. The weather and temperature data are then processed to the same frequency as the historical load data, forming the data samples X = (X_weather, X_temperature, X_load);
S1.2, cleaning data samples
Data cleaning is performed on the data sample X: abnormal values in the data sample are replaced with missing values, and proximity interpolation is then applied at the missing values to obtain the complete data sample X′ = (X_weather′, X_temperature′, X_load′);
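The cleaning step above can be sketched in Python. The validity range and the helper name are assumptions for illustration, not specified by the patent:

```python
import numpy as np

def clean_series(x, low=0.0, high=1e6):
    """Replace out-of-range values with NaN, then fill each NaN
    with its nearest valid neighbour (proximity interpolation)."""
    x = np.asarray(x, dtype=float).copy()
    x[(x < low) | (x > high)] = np.nan          # abnormal -> missing value
    valid = np.flatnonzero(~np.isnan(x))
    for m in np.flatnonzero(np.isnan(x)):
        nearest = valid[np.argmin(np.abs(valid - m))]
        x[m] = x[nearest]                       # nearest-neighbour fill
    return x

cleaned = clean_series([1.2, 0.9, -5.0, 1.1])   # -5.0 is abnormal
```

On ties, `np.argmin` picks the earlier neighbour, so the abnormal third value is filled with 0.9 here.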
S1.3, Z-score normalization
The complete data sample is standardized to obtain the standard data sample X″ = (X_weather″, X_temperature″, X_load″). The Z-score normalization formula is:

x″ = (x − μ) / σ

where x″ is the standardized value, x is the data sample value before processing, μ is the mean of the data sample, and σ is the standard deviation of the data sample. Standardization unifies the magnitude and dimension of the data, strengthening the generalization ability of the model;
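A minimal sketch of the Z-score step; the sample values are made up for illustration:

```python
import numpy as np

def z_score(x):
    """Z-score normalization: x'' = (x - mu) / sigma."""
    x = np.asarray(x, dtype=float)
    mu, sigma = x.mean(), x.std()
    return (x - mu) / sigma

z = z_score([2.0, 4.0, 6.0, 8.0])   # zero mean, unit standard deviation
```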
S1.4, adding time features
The time corresponding to each standard data sample, including day-of-week and time-of-day features, is added to X″ in One-hot encoded form as a fourth feature X_date, yielding the standard data sample with time features X‴ = (X_weather″, X_temperature″, X_load″, X_date);
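One way to sketch the One-hot time feature X_date. Using hourly slots is an assumption for brevity; the embodiment samples every 15 minutes, which would give 96 time-of-day slots instead of 24:

```python
import numpy as np

def time_features(weekday, hour, n_hours=24):
    """One-hot encode day-of-week (7 slots) and time-of-day slots."""
    wd = np.zeros(7)
    wd[weekday] = 1.0          # day-of-week one-hot
    hr = np.zeros(n_hours)
    hr[hour] = 1.0             # time-of-day one-hot
    return np.concatenate([wd, hr])

f = time_features(weekday=2, hour=15)   # Wednesday, 15:00
```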
S1.5, constructing a standard data set
On X‴, a standard data set is constructed using a sliding window method.
S1.5.1 sliding window method for generating t frame of standard data set
On X‴, a segment of data of length n preceding time t, {x‴_(t−n), x‴_(t−n+1), …, x‴_(t−1)}, is taken as a training sample of the model, and the load data x‴_(load,t) at time t is taken as the training label; the training sample and training label are paired as one frame of the standard data set;
S1.5.2, repeating step S1.5.1, sliding backwards in sequence M steps over X‴ to construct a standard data set M frames long;
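Steps S1.5.1–S1.5.2 can be sketched as follows; the toy feature matrix and load series are assumptions for illustration:

```python
import numpy as np

def sliding_window_dataset(features, load, n):
    """Build (sample, label) frames: the n steps before t form the
    training sample; the load at t is the training label."""
    samples, labels = [], []
    for t in range(n, len(load)):
        samples.append(features[t - n:t])   # window of length n before t
        labels.append(load[t])              # load value at t
    return np.stack(samples), np.array(labels)

X = np.arange(20, dtype=float).reshape(10, 2)   # 10 time steps, 2 features
y = np.arange(10, dtype=float)                  # load series
S, L = sliding_window_dataset(X, y, n=3)        # 7 frames of shape (3, 2)
```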
s2, constructing a load probability prediction model
A time domain convolutional neural network is constructed as the load probability prediction model. The network consists of an input layer, several residual blocks, and an output layer connected in series; the output layer consists of three fully-connected layers in parallel, one fully-connected layer per quantile;
As shown in fig. 2, the ith residual block has two input branches: in the left branch, the output of the (i−1)th residual block passes in turn through a convolutional layer, weight normalization, the ReLU activation function, and random neuron deactivation (dropout) before entering the ith residual block; in the right branch, the output of the (i−1)th residual block is input directly to the ith residual block after a 1 × 1 convolution. The convolutional layer consists of convolution kernels formed from several dilated causal convolutions;
s3 training load probability prediction model
S3.1, randomly selecting a frame of data from the standard data set, inputting the frame of data into a time domain convolution neural network model, converting the frame of data into a tensor form through an input layer, and inputting the tensor form into the serially connected residual blocks;
s3.2, in the time domain convolution neural network model, the input tensor of the ith residual block is set to be Z(i-1)I ═ 1,2, …, K denotes the number of residual blocks; in the left branch of the ith residual block, the tensor Z(i-1)Performing feature extraction through a convolution kernel formed by a plurality of dilation causal convolutions, and then sequentially performing batch normalization, layer normalization, ReLU activation function and neuron random inactivation to obtain the output tensor of the left branchIn the right branch of the i-th residual block, the tensor Z(i-1)Convolving by 1 × 1 to make its output tensorMatching the dimensionality of the output tensors of the left branch, and then adding the output tensors of the two branches to obtain the output of the ith residual block
In this embodiment, the dilated causal convolution F(s) is:

F(s) = Σ_(i=0)^(λ−1) f(i) · x_(s−d·i)

where x denotes the input tensor; f(i) denotes the convolution filter, i = 0, 1, 2, …, λ−1, with λ the size of the convolution filter; s − d · i indexes the element of the input tensor covered at each step of the convolution over element s; and d is the dilation factor.
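The dilated causal convolution can be sketched directly from this formula; treating out-of-range (pre-sequence) inputs as zero enforces causality, so the output at s never depends on inputs after s:

```python
import numpy as np

def dilated_causal_conv(x, f, d):
    """F(s) = sum_i f(i) * x[s - d*i], with x[j] = 0 for j < 0,
    so each output depends only on current and past inputs."""
    out = np.zeros_like(x, dtype=float)
    for s in range(len(x)):
        for i, fi in enumerate(f):
            j = s - d * i           # element reached with dilation d
            if j >= 0:              # left zero-padding (causal)
                out[s] += fi * x[j]
    return out

y = dilated_causal_conv(np.array([1.0, 2.0, 3.0, 4.0]), f=[1.0, 1.0], d=2)
# -> [1.0, 2.0, 4.0, 6.0]
```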
S3.3, repeating step S3.2 until the last residual block outputs Z^(K); Z^(K) is finally passed through the three parallel fully-connected layers, whose outputs are recorded as ŷ_(t,q), ŷ_(t,1−q), and ŷ_(t,0.5) respectively;
s3.4, calculating the upper quantile q at the t moment through a Pinball quantile formulathLower quantile (1-q)thAnd 0.5thLoss value L corresponding to quantilet,q(ut,q),Lt,(1-q)(ut,(1-q)) And Lt,0.5(ut,0.5);
In this embodiment, to predict the 90% probability load interval, 90% and 10% Pinball quantiles are set, as shown in fig. 3, giving two corresponding load curves whose middle area is the 90% probability load prediction interval. In addition, an output corresponding to the 50% Pinball quantile can be added: at the 50% quantile, the model output is a deterministic prediction result, and the residual distribution of the deterministic prediction can improve the performance of the probabilistic prediction model. During model training, because of the back-propagation mechanism, taking the 50% Pinball quantile as one of the loss terms guides the model's gradient descent in a direction that improves probabilistic prediction performance.
S3.4.1, the preset probability of the Pinball quantile is set to q;

S3.4.2, the prediction errors at time t for the upper quantile q, the lower quantile 1−q, and the 0.5 quantile are calculated as u_(t,q) = y_t − ŷ_(t,q), u_(t,1−q) = y_t − ŷ_(t,1−q), and u_(t,0.5) = y_t − ŷ_(t,0.5), where y_t is the real load value;

S3.4.3, the corresponding loss values are calculated as, for τ ∈ {q, 1−q, 0.5}:

L_(t,τ)(u_(t,τ)) = τ · u_(t,τ) if u_(t,τ) ≥ 0, and (τ − 1) · u_(t,τ) otherwise;
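A sketch of the Pinball loss and the weighted averaging of step S3.5; the concrete weight values are assumptions, since the patent does not fix them:

```python
def pinball_loss(y_true, y_pred, tau):
    """Pinball (quantile) loss: tau*u for u >= 0, (tau-1)*u otherwise,
    where u = y_true - y_pred is the quantile prediction error."""
    u = y_true - y_pred
    return tau * u if u >= 0 else (tau - 1.0) * u

# Under-prediction of a high quantile is penalized more heavily:
hi_loss = pinball_loss(10.0, 8.0, tau=0.9)    # u = 2  -> 0.9 * 2 = 1.8
lo_loss = pinball_loss(10.0, 12.0, tau=0.9)   # u = -2 -> 0.1 * 2 = 0.2

# Weighted average over the three quantile losses (weights assumed):
w_q, w_1q, w_05 = 0.25, 0.25, 0.5
total = (w_q * pinball_loss(10.0, 8.0, 0.9)
         + w_1q * pinball_loss(10.0, 12.0, 0.1)
         + w_05 * pinball_loss(10.0, 9.5, 0.5))
```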
S3.5, the loss values are combined by weighted averaging:

L_t = ω_q · L_(t,q)(u_(t,q)) + ω_(1−q) · L_(t,1−q)(u_(t,1−q)) + ω_(0.5) · L_(t,0.5)(u_(t,0.5))

where ω_q, ω_(1−q), and ω_(0.5) denote the weights of L_(t,q)(u_(t,q)), L_(t,1−q)(u_(t,1−q)), and L_(t,0.5)(u_(t,0.5)) respectively;
S3.6, repeating steps S3.1–S3.5 and comparing the weighted average loss obtained in each training round: when the weighted average loss has not decreased for P training rounds, a learning-rate automatic decay strategy is executed; when it has not decreased for 3 × P training rounds, an early-stopping strategy is executed, i.e., training stops automatically and the trained load probability prediction model is obtained; otherwise, go to step S3.7;
The purpose of the learning-rate automatic decay strategy is as follows: when the learning rate is too low during model training, the parameter update speed drops severely; when the learning rate is too high, the model can oscillate during gradient descent and fall into a local optimum;
The purpose of the early-stopping strategy is as follows: the more training iterations, the better the model fits the given samples; however, an excessive number of iterations leads to overfitting and excessive computational overhead. The early-stopping strategy stops training automatically once the model's loss on the validation set no longer decreases;
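The two training strategies can be sketched together in a small controller. The patience P, the initial learning rate, and the decay factor 0.5 are assumptions for illustration; the patent leaves them unspecified:

```python
class TrainController:
    """Learning-rate decay every P stagnant rounds, early stop
    after 3*P stagnant rounds (P and decay factor are assumed)."""
    def __init__(self, lr, patience_p, decay=0.5):
        self.lr = lr
        self.p = patience_p
        self.decay = decay
        self.best = float('inf')
        self.stagnant = 0
        self.stop = False

    def update(self, loss):
        if loss < self.best:
            self.best = loss
            self.stagnant = 0
        else:
            self.stagnant += 1
            if self.stagnant % self.p == 0:   # every P stagnant rounds
                self.lr *= self.decay         # learning-rate decay
            if self.stagnant >= 3 * self.p:
                self.stop = True              # early stop
        return self.lr, self.stop

ctrl = TrainController(lr=0.01, patience_p=2)
for loss in [1.0, 0.9, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95]:
    lr, stop = ctrl.update(loss)
    if stop:
        break   # training ends after 3*P = 6 stagnant rounds
```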
S3.7, using a batch gradient descent algorithm, the weight parameters of the time domain convolutional neural network model are updated along the negative gradient direction of the weighted average loss function, and the procedure returns to step S3.1;
s4, load probability real-time prediction
Load data of a user, together with the related weather and temperature data, are collected in real time; one frame of standard data is built according to the method of step S1 and input into the trained load probability prediction model, which outputs the load probability prediction interval [ŷ_(t,1−q), ŷ_(t,q)] and the deterministic prediction result ŷ_(t,0.5).
To evaluate the probability prediction interval accurately and effectively, the PICP (prediction interval coverage probability) and PINAW (prediction interval normalized average width) indices are used as evaluation criteria, defined respectively as:

PICP = (1/N) Σ_(t=1)^N ξ_t,  PINAW = (1/(N · R)) Σ_(t=1)^N (U_t − L_t)

where N is the number of test samples; ξ_t = 1 if the target value at time t falls within the upper and lower limits of the prediction interval [L_t, U_t], otherwise ξ_t = 0; [L_t, U_t] is the load probability prediction interval; and R is the range of the actual values, used to normalize the prediction interval width;
table 1 compares the probability prediction results of the method of the present invention with those of other methods in the case of a 95% preset probability Prediction Interval (PINC), from which it can be concluded that, in the case of PICP approximating PINC, the performance of TCN is superior to CNN and LSTM, where the probability prediction interval width (PINAW) of LSTM is larger than TCN although smaller than CNN, thus demonstrating that the method of the present invention has better probability prediction performance. Fig. 4 is a corresponding probability prediction curve.
TABLE 1. probability prediction performance of different models
Table 2 compares the probability prediction results of the method of the invention at different prediction interval nominal confidences (PINC). Because the Pinball quantile function is a non-parametric estimation method, prediction intervals (PI) corresponding to different PINCs can be output conveniently by adjusting the quantiles. The PI at 99% PINC is the widest, while the PI at the lowest PINC is the narrowest. Fig. 5 shows the corresponding probability prediction curves.
TABLE 2 Performance of Pinball-TCN under different PINCs
Although illustrative embodiments of the invention have been described above to help those skilled in the art understand the invention, the invention is not limited to the scope of those embodiments. Various changes will be apparent to those skilled in the art, and all inventions utilizing the inventive concepts are protected, provided they remain within the spirit and scope of the invention as defined by the appended claims.
Claims (2)
1. A user side load probability prediction method based on a time domain convolution neural network is characterized by comprising the following steps:
(1) data acquisition and preprocessing
(1.1) collecting historical load, temperature and weather data
Active power of a user is collected at a fixed period within a certain time range as historical load data X_load; at the same time, the weather types X_weather and temperature values X_temperature over the corresponding time range are collected. The weather and temperature data are then processed to the same frequency as the historical load data, forming the data samples X = (X_weather, X_temperature, X_load);
(1.2) data sample cleaning
Data cleaning is performed on the data sample X: abnormal values in the data sample are replaced with missing values, and proximity interpolation is then applied at the missing values to obtain the complete data sample X′ = (X_weather′, X_temperature′, X_load′);
(1.3), Z-score normalization
The complete data sample is standardized to obtain a standard data sample X″ = (X_weather″, X_temperature″, X_load″); the Z-score normalization formula is as follows:

x″ = (x − μ) / σ

wherein x″ is the standardized value, x is the data sample value before processing, μ is the mean of the data sample, and σ is the standard deviation of the data sample;
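Step (1.3) amounts to the standard Z-score transform; a minimal sketch (function name illustrative):

```python
import numpy as np

def z_score(x):
    """Z-score normalization: x'' = (x - mu) / sigma."""
    x = np.asarray(x, dtype=float)
    mu, sigma = x.mean(), x.std()
    return (x - mu) / sigma

z = z_score([1.0, 2.0, 3.0])   # resulting series has zero mean, unit std
```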
(1.4) Adding time features
The time corresponding to the standard data sample, including day-of-week and time-of-day features, is One-hot encoded and added to X″ as a fourth feature X_date, giving the standard data sample with time features X‴ = (X_weather″, X_temperature″, X_load″, X_date);
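A minimal sketch of the One-hot time encoding of step (1.4), assuming day-of-week (7 values) and hour-of-day (24 values) features; the resulting 31-element layout is an assumption, since the claim does not fix the exact encoding width:

```python
import numpy as np

def time_features(weekday, hour):
    """One-hot encode day-of-week (0-6) and hour-of-day (0-23) into X_date."""
    day = np.zeros(7)
    day[weekday] = 1.0
    tod = np.zeros(24)
    tod[hour] = 1.0
    return np.concatenate([day, tod])   # 31-element feature vector

x_date = time_features(weekday=2, hour=14)
```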
(1.5) construction of Standard data set
A standard dataset is constructed on X‴ using a sliding-window method.
(1.5.1) generating the t frame of the standard data set by a sliding window method
A segment of data of length n preceding time t, {x‴_{t−n}, x‴_{t−n+1}, …, x‴_{t−1}}, is taken from X‴ as a training sample of the model, and the load data x_load‴ at time t is taken as the training label of the model; the training sample and training label are paired to form one frame of the standard dataset;
(1.5.2) repeating step (1.5.1), sliding backwards over X‴ in turn for M steps, to construct a standard dataset of M frames;
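The sliding-window construction of steps (1.5.1)–(1.5.2) can be sketched as follows (function name and array layout are illustrative):

```python
import numpy as np

def sliding_window(features, load, n):
    """Build (sample, label) frames: the n feature rows before time t form
    the training sample; the load value at time t is the training label."""
    samples, labels = [], []
    for t in range(n, len(load)):
        samples.append(features[t - n:t])   # window of length n ending at t-1
        labels.append(load[t])              # label: load at time t
    return np.stack(samples), np.asarray(labels)

feats = np.arange(20, dtype=float).reshape(10, 2)   # 10 time steps, 2 features
load = np.arange(10, dtype=float)
Xw, yw = sliding_window(feats, load, n=3)
```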
(2) and constructing a load probability prediction model
Constructing a time-domain convolutional neural network (TCN) as the load probability prediction model, wherein the TCN is formed by serially connecting an input layer, a plurality of residual blocks and an output layer; the output layer is formed by three fully-connected layers connected in parallel, the number of fully-connected layers corresponding to the number of quantiles;
the i-th residual block has two input branches: in the left branch, the output of the (i−1)-th residual block passes sequentially through the convolutional layer, weight normalization, the ReLU activation function and random neuron dropout before being input to the i-th residual block; in the right branch, the output of the (i−1)-th residual block is input directly to the i-th residual block after a 1×1 convolution; wherein the convolutional layer is a convolution kernel formed by a plurality of dilated causal convolutions;
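A toy single-channel sketch of the two-branch residual block described above, assuming scalar weights and omitting weight normalization and dropout for brevity; it is illustrative only, not the patented implementation:

```python
import numpy as np

def dilated_causal_conv1d(x, w, d):
    """y[s] = sum_i w[i] * x[s - d*i], zero-padded on the left (causal)."""
    return np.array([
        sum(w[i] * x[s - d * i] for i in range(len(w)) if s - d * i >= 0)
        for s in range(len(x))
    ])

def residual_block(x, w, d, w_1x1=1.0):
    """Left branch: ReLU(dilated causal conv); right branch: 1x1 convolution
    of the block input; the block output is the sum of the two branches."""
    left = np.maximum(dilated_causal_conv1d(x, w, d), 0.0)  # ReLU activation
    right = w_1x1 * x                                       # 1x1 conv (dim match)
    return left + right

z = residual_block(np.array([1.0, 2.0, 3.0, 4.0]), w=[0.5, 0.5], d=1)
```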
(3) training load probability prediction model
(3.1) randomly selecting a frame of data from the standard dataset and inputting it into the time-domain convolutional neural network model, where the input layer converts it into tensor form and passes it to the serially connected residual blocks;
(3.2) in the time-domain convolutional neural network model, let the input tensor of the i-th residual block be Z^(i−1), i = 1, 2, …, K, where K denotes the number of residual blocks; in the left branch of the i-th residual block, the tensor Z^(i−1) undergoes feature extraction by a convolution kernel formed by a plurality of dilated causal convolutions, and then sequentially passes through weight normalization, the ReLU activation function and random neuron dropout to obtain the left-branch output tensor Z_left^(i); in the right branch of the i-th residual block, the tensor Z^(i−1) is passed through a 1×1 convolution so that its output tensor Z_right^(i) matches the dimensionality of the left-branch output tensor; the outputs of the two branches are then added to obtain the output of the i-th residual block, Z^(i) = Z_left^(i) + Z_right^(i);
(3.3) repeating step (3.2) until the last residual block outputs Z^(K); finally, Z^(K) is passed through the three parallel fully-connected layers, whose outputs are denoted ŷ_{t,q}, ŷ_{t,(1−q)} and ŷ_{t,0.5} respectively;
(3.4) calculating, by the Pinball quantile formula, the loss values L_{t,q}(u_{t,q}), L_{t,(1−q)}(u_{t,(1−q)}) and L_{t,0.5}(u_{t,0.5}) corresponding to the upper q-th quantile, the lower (1−q)-th quantile and the 0.5-th quantile at time t:
(3.4.1) setting the preset probability of the Pinball quantile as q;
(3.4.2) calculating the quantile prediction errors at time t for the upper q-th, lower (1−q)-th and 0.5-th quantiles: u_{t,q} = y_t − ŷ_{t,q}, u_{t,(1−q)} = y_t − ŷ_{t,(1−q)} and u_{t,0.5} = y_t − ŷ_{t,0.5}, where y_t is the real load value;
(3.4.3) calculating the loss values L_{t,q}(u_{t,q}), L_{t,(1−q)}(u_{t,(1−q)}) and L_{t,0.5}(u_{t,0.5}) corresponding to the upper q-th, lower (1−q)-th and 0.5-th quantiles at time t; for a quantile level τ ∈ {q, 1−q, 0.5}, the Pinball loss is L_{t,τ}(u_{t,τ}) = τ·u_{t,τ} if u_{t,τ} ≥ 0, and (τ − 1)·u_{t,τ} otherwise;
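The Pinball (quantile) loss used in steps (3.4.1)–(3.4.3) is the standard quantile-regression loss; a minimal sketch:

```python
def pinball_loss(u, tau):
    """Pinball loss: tau*u when the error u >= 0, (tau - 1)*u otherwise."""
    return tau * u if u >= 0 else (tau - 1.0) * u

# errors for the upper (q), lower (1-q) and median quantiles at time t
q = 0.95
L_upper = pinball_loss(2.0, q)        # under-prediction penalised by tau
L_lower = pinball_loss(-2.0, 1 - q)   # over-prediction penalised by (1 - tau)
L_med = pinball_loss(2.0, 0.5)        # median quantile: symmetric penalty
```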
(3.5) carrying out weighted averaging of the loss values: L_t = w_q·L_{t,q}(u_{t,q}) + w_{1−q}·L_{t,(1−q)}(u_{t,(1−q)}) + w_{0.5}·L_{t,0.5}(u_{t,0.5}), wherein w_q, w_{1−q} and w_{0.5} respectively denote the weights of L_{t,q}(u_{t,q}), L_{t,(1−q)}(u_{t,(1−q)}) and L_{t,0.5}(u_{t,0.5});
(3.6) repeating steps (3.1)–(3.5) and comparing the weighted average loss obtained in each training round; when the weighted average loss has not decreased for P training rounds, an automatic learning-rate decay strategy is executed, and when it has not decreased for 3×P training rounds, an early-stopping strategy is executed, i.e. training stops automatically and the trained load probability prediction model is obtained; otherwise, proceeding to step (3.7);
(3.7) updating the weight parameters of the time-domain convolutional neural network model along the negative gradient direction of the weighted average loss function by using a batch gradient descent algorithm, and then returning to step (3.1);
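The learning-rate decay and early-stopping schedule of step (3.6) can be sketched as follows; the decay factor 0.5 and the function name are assumptions, since the claim does not fix them:

```python
def train_loop(losses, lr0=0.01, patience_P=5, decay=0.5):
    """Mimic the schedule of step (3.6): decay the learning rate after every
    P non-improving rounds; stop after 3*P non-improving rounds."""
    lr, best, stale = lr0, float("inf"), 0
    for epoch, loss in enumerate(losses):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale % patience_P == 0:
                lr *= decay                 # automatic learning-rate decay
            if stale >= 3 * patience_P:
                return epoch, lr            # early stop
    return len(losses) - 1, lr
```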
(4) load probability real-time prediction
Load data of the user and the related weather and temperature data are collected in real time, a frame of standard data is constructed according to the method of step (1) and input into the trained load probability prediction model, which outputs the load probability prediction interval [ŷ_{t,(1−q)}, ŷ_{t,q}] and the deterministic prediction result ŷ_{t,0.5}.
2. The user-side load probability prediction method based on a time-domain convolutional neural network according to claim 1, wherein the function F(s) of the dilated causal convolution is:

F(s) = Σ_{i=0}^{λ−1} f(i) · x_{s−d·i}

wherein x denotes the input tensor; f(i) denotes the convolution filter, i = 0, 1, 2, …, λ−1, with λ being the size of the convolution filter; s − d·i denotes the element of the input tensor covered at each step of the convolution over element s; and d is the dilation factor.
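The formula of claim 2 can be checked with a direct sketch; the zero-padding for s − d·i < 0 is an assumed causal boundary treatment:

```python
def dilated_causal_conv(x, f, d):
    """F(s) = sum_{i=0}^{lambda-1} f(i) * x[s - d*i]; positions before the
    start of the sequence contribute zero (causal padding)."""
    lam = len(f)
    return [
        sum(f[i] * x[s - d * i] for i in range(lam) if s - d * i >= 0)
        for s in range(len(x))
    ]

# filter of size lambda=2, dilation factor d=2
y = dilated_causal_conv([1.0, 2.0, 3.0, 4.0], f=[1.0, 1.0], d=2)
```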
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011308724.4A CN112396234A (en) | 2020-11-20 | 2020-11-20 | User side load probability prediction method based on time domain convolutional neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011308724.4A CN112396234A (en) | 2020-11-20 | 2020-11-20 | User side load probability prediction method based on time domain convolutional neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112396234A true CN112396234A (en) | 2021-02-23 |
Family
ID=74607611
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011308724.4A Pending CN112396234A (en) | 2020-11-20 | 2020-11-20 | User side load probability prediction method based on time domain convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112396234A (en) |
- 2020-11-20 CN CN202011308724.4A patent/CN112396234A/en active Pending
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113256031A (en) * | 2021-06-25 | 2021-08-13 | 国网江西省电力有限公司供电服务管理中心 | Self-learning optimization method based on resident demand response strategy |
CN113256031B (en) * | 2021-06-25 | 2021-10-26 | 国网江西省电力有限公司供电服务管理中心 | Self-learning optimization method based on resident demand response strategy |
CN113919232A (en) * | 2021-10-25 | 2022-01-11 | 北京航天创智科技有限公司 | Photovoltaic power station power prediction method and system based on recurrent neural network |
CN115409296A (en) * | 2022-11-02 | 2022-11-29 | 国网江西省电力有限公司电力科学研究院 | Method for predicting net load probability of active power distribution network |
CN116008733A (en) * | 2023-03-21 | 2023-04-25 | 成都信息工程大学 | Single-phase grounding fault diagnosis method based on integrated deep neural network |
CN116596044A (en) * | 2023-07-18 | 2023-08-15 | 华能山东发电有限公司众泰电厂 | Power generation load prediction model training method and device based on multi-source data |
CN116596044B (en) * | 2023-07-18 | 2023-11-07 | 华能山东泰丰新能源有限公司 | Power generation load prediction model training method and device based on multi-source data |
CN117239739A (en) * | 2023-11-13 | 2023-12-15 | 国网冀北电力有限公司 | Method, device and equipment for predicting user side load by knowledge big model |
CN117239739B (en) * | 2023-11-13 | 2024-02-02 | 国网冀北电力有限公司 | Method, device and equipment for predicting user side load by knowledge big model |
CN117977584A (en) * | 2024-04-02 | 2024-05-03 | 山东大学 | Power load probability prediction method, system, medium, device and program product |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112396234A (en) | User side load probability prediction method based on time domain convolutional neural network | |
CN108009674A (en) | Air PM2.5 concentration prediction methods based on CNN and LSTM fused neural networks | |
CN111210633A (en) | Short-term traffic flow prediction method based on deep learning | |
CN112990556A (en) | User power consumption prediction method based on Prophet-LSTM model | |
Wang et al. | A compound framework for wind speed forecasting based on comprehensive feature selection, quantile regression incorporated into convolutional simplified long short-term memory network and residual error correction | |
CN112350876A (en) | Network flow prediction method based on graph neural network | |
CN110751318B (en) | Ultra-short-term power load prediction method based on IPSO-LSTM | |
CN110705743A (en) | New energy consumption electric quantity prediction method based on long-term and short-term memory neural network | |
CN106909933A (en) | A kind of stealing classification Forecasting Methodology of three stages various visual angles Fusion Features | |
CN113554466B (en) | Short-term electricity consumption prediction model construction method, prediction method and device | |
CN114282443B (en) | Residual service life prediction method based on MLP-LSTM supervised joint model | |
CN111079989A (en) | Water supply company water supply amount prediction device based on DWT-PCA-LSTM | |
CN111461463A (en) | Short-term load prediction method, system and equipment based on TCN-BP | |
CN114169434A (en) | Load prediction method | |
CN115544890A (en) | Short-term power load prediction method and system | |
CN115034485A (en) | Wind power interval prediction method and device based on data space | |
CN114218872A (en) | Method for predicting remaining service life based on DBN-LSTM semi-supervised joint model | |
CN115905857A (en) | Non-invasive load decomposition method based on mathematical morphology and improved Transformer | |
CN116187835A (en) | Data-driven-based method and system for estimating theoretical line loss interval of transformer area | |
CN111785093A (en) | Air traffic flow short-term prediction method based on fractal interpolation | |
CN111930601A (en) | Deep learning-based database state comprehensive scoring method and system | |
Zhang | Short-term power load forecasting based on SAPSO-CNN-LSTM model considering autocorrelated errors | |
Guo et al. | Short-Term Water Demand Forecast Based on Deep Neural Network:(029) | |
CN116632834A (en) | Short-term power load prediction method based on SSA-BiGRU-Attention | |
CN113762591B (en) | Short-term electric quantity prediction method and system based on GRU and multi-core SVM countermeasure learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20210223 |