CN114169416B - Short-term load prediction method based on transfer learning under a small sample set - Google Patents

Short-term load prediction method based on transfer learning under a small sample set

Info

Publication number
CN114169416B
CN114169416B CN202111442332.1A CN114169416A
Authority
CN
China
Prior art keywords
data
load
residual block
training
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111442332.1A
Other languages
Chinese (zh)
Other versions
CN114169416A (en)
Inventor
张真源
赵鹏飞
黄琦
胡维昊
易建波
李坚
井实
唐啸天
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202111442332.1A
Publication of CN114169416A
Application granted
Publication of CN114169416B
Legal status: Active

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155Bayesian classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Water Supply & Treatment (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a short-term load prediction method under a small sample set based on transfer learning. The method first collects the historical load data and corresponding temperatures of a plurality of source-domain users so as to construct a plurality of input features; it then trains a plurality of deep residual network models with these input features, adaptively transfers and combines the models using a Bayesian weighted probability averaging method, and performs real-time load prediction for the target user once the transfer combination is complete.

Description

Short-term load prediction method based on transfer learning under a small sample set
Technical Field
The invention belongs to the technical field of power load prediction, and particularly relates to a short-term load prediction method under a small sample set based on transfer learning.
Background
Load prediction plays an extremely important role in the operation and control of modern power systems. However, with the large-scale grid connection of renewable-energy generation, the popularization of electric vehicles and the increasing diversification of power consumption patterns, the complexity and uncertainty of modern power systems keep growing. This poses additional challenges for power system management. In response, accurate residential load prediction can reduce operating costs and promote intelligent operation of the power grid. For example, given accurate and reliable load predictions for individual users, the detrimental effects of peak-valley consumption can be curtailed through measures such as energy storage management and demand response.
In order to achieve accurate and reliable load prediction, many machine learning and deep learning methods have emerged. Machine learning methods include support vector regression (SVR) and decision trees; deep learning methods include deep residual networks (ResNet) and the long short-term memory neural network (LSTM). However, machine learning and deep learning models have two distinct drawbacks: first, a large amount of historical data is needed to train the model parameters; second, as parametric models, they cannot quantify the uncertainty of the load prediction.
However, residential-side electricity usage is more irregular and more sensitive to consumer behavior than the high-voltage side. Meanwhile, the lack of labeled historical data is a very common problem in power systems. As a result, the machine learning and deep learning methods described above struggle to accomplish deterministic and uncertainty prediction tasks in small-sample historical load data scenarios. Therefore, a method is needed to accomplish load prediction tasks when the historical load data of residential users is limited.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a short-term load prediction method under a small sample set based on transfer learning, which achieves accurate short-term load prediction for a target user by exploiting the temporal and spatial correlation between source-domain users and the target user under conditions of limited historical load data.
In order to achieve the above object, the invention provides a short-term load prediction method under a small sample set based on transfer learning, characterized by comprising the following steps:
(1) Data acquisition and preprocessing;
(1.1) Set the load sampling period T;
(1.2) Collect the historical load data $x_{load}$ and corresponding temperature $x_{temp}$ of the M source-domain users at the load sampling period T, so as to construct a data set and a temperature set; denote the data set constructed for the i-th source-domain user as $X_{load}^{i}=\{x_{load}^{i,t}\}$ and its temperature set as $X_{temp}^{i}=\{x_{temp}^{i,t}\}$, where $x_{load}^{i,t}$ and $x_{temp}^{i,t}$ respectively denote the historical load data and the corresponding temperature collected for the i-th source-domain user at the t-th sampling instant, i = 1, 2, …, M;
(1.3) Remove the outliers from $X_{load}^{i}$ and $X_{temp}^{i}$, then fill the resulting gaps by linear interpolation to obtain the data sample $X^{i}=\{X_{load}^{i},X_{temp}^{i}\}$; finally, normalize the data sample $X^{i}$ to obtain the normalized data sample $\bar{X}^{i}$;
(1.4) Append a time characterization variable $x_{time}$ to each data sample $\bar{X}^{i}$; $x_{time}$ comprises a time-of-day variable, a day-of-week variable and a holiday variable, all in one-hot form, and $\{\bar{X}^{i},x_{time}\}$ is taken as the input features, finally constructing the M input features $\tilde{X}^{i}$, i = 1, 2, …, M;
(2) Build the deep residual network model;
The deep residual network is formed by connecting L residual blocks with skip connections, where each residual block consists of convolution layers, normalization layers and a ReLU activation function layer;
(3) Train the deep residual network model on the source-domain user data;
(3.1) Randomly select the data of a batch of sampling instants from the i-th input feature $\tilde{X}^{i}$ as one round of training data, then input the data of each instant into the deep residual network model in sequence; the input layer converts the framed data into tensor form and feeds it to the serial residual blocks;
(3.2) In the deep residual network model, let the input tensor of the l-th residual block be $Z^{(l-1)}$. In the left branch of the l-th residual block, the tensor $Z^{(l-1)}$ undergoes feature extraction through convolution kernels composed of several dilated causal convolutions, passing sequentially through a convolution layer, a normalization layer, a ReLU activation function layer, a convolution layer and a normalization layer to obtain the left-branch output tensor $F(Z^{(l-1)})$; in the right branch of the l-th residual block, the tensor $Z^{(l-1)}$ undergoes a 1×1 convolution so that its output tensor $G(Z^{(l-1)})$ matches the dimension of the left-branch output tensor; the outputs of the two branches are then added to obtain the output of the l-th residual block $Z^{(l)}=F(Z^{(l-1)})+G(Z^{(l-1)})$. The output $Z^{(l)}$ of the l-th residual block and the output of the (l-2)-th residual block are added to obtain the input $(Z^{(l)}+Z^{(l-2)})$ of the (l+1)-th residual block;
(3.3) Repeat step (3.2) until the last residual block outputs $Z^{(L)}$; finally feed $Z^{(L)}$ to two fully connected layers connected in parallel, whose output is recorded as $\hat{y}_t$ and used as the predicted value at time t;
(3.4) After this round of training data is finished, calculate the loss function value MAPE of this training round:

$\mathrm{MAPE}=\frac{1}{N}\sum_{t=1}^{N}\left|\frac{y_{t}-\hat{y}_{t}}{y_{t}}\right|\times 100\%$

where $y_t$ is the observation at time t and N is the number of samples in the round;
(3.5) Set a loss threshold δ; calculate the difference ΔMAPE between the loss function value after the current training round and that after the previous round, and compare ΔMAPE with δ; if ΔMAPE ≤ δ, training is finished and the i-th deep residual network model is obtained; otherwise, update the weights of the deep residual network with a batch gradient descent algorithm and return to step (3.1) for the next round of training;
(3.6) Train the deep residual network with the M input features according to steps (3.1)–(3.5), finally obtaining M deep residual network models, denoted $\{F_1,F_2,\ldots,F_i,\ldots,F_M\}$;
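A sketch of the training loop of steps (3.1)–(3.6) follows; the SGD optimizer (standing in for the batch gradient descent of step (3.5)), the batch size and the learning rate are assumptions:

```python
import torch

def mape(y_true: torch.Tensor, y_pred: torch.Tensor) -> torch.Tensor:
    """Step (3.4): mean absolute percentage error of one training round."""
    return torch.abs((y_true - y_pred) / y_true).mean() * 100.0

def train_source_model(model, feats, targets, delta=1e-3, lr=1e-2, batch=64):
    """Steps (3.1)-(3.5): train until the round-to-round change in MAPE
    drops to the loss threshold delta or below."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    prev = float("inf")
    while True:
        idx = torch.randperm(feats.shape[0])[:batch]  # random instants, step (3.1)
        loss = mape(targets[idx], model(feats[idx]))
        if abs(prev - loss.item()) <= delta:          # stopping rule, step (3.5)
            return model
        opt.zero_grad()
        loss.backward()
        opt.step()
        prev = loss.item()

# Step (3.6): one model per source-domain user (M models in total).
# models = [train_source_model(DeepResNet(c_feat, 32, 6), f, y)
#           for f, y in source_datasets]   # source_datasets is hypothetical
```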
(4) Adaptively combine the deep residual network models using the Bayesian weighted probability averaging method;
(4.1) Set the acquisition period $T_1$ of the target user;
(4.2) Sample the target user's small-sample historical load data at the sampling period $T_1$ to obtain the load data set $X_{load}^{*}=\{x_{load}^{*,t}\}$ and the temperature data set $X_{temp}^{*}=\{x_{temp}^{*,t}\}$, where $x_{load}^{*,t}$ and $x_{temp}^{*,t}$ respectively denote the historical load data and the corresponding temperature collected for the target user at the t-th sampling instant; then construct the input features according to steps (1.3)–(1.4), denoted $\tilde{X}^{*}$;
(4.3) Construct the input feature pairs $D^{*}=\{(\tilde{x}^{*,t},y_t^{*})\}$, where $\tilde{x}^{*,t}$ denotes the input features at time t and $y_t^{*}$ denotes the load observation at time t;
(4.4) Input the features $\tilde{x}^{*,t}$ into the M deep residual network models respectively to obtain the prediction outputs $Y=\{\hat{y}_1^{t},\hat{y}_2^{t},\ldots,\hat{y}_M^{t}\}$, where $\hat{y}_i^{t}$ denotes the predicted value of the i-th deep residual network model at time t;
(4.5) Calculate the likelihood of the observation $y_t^{*}$:

$p(y_t^{*}\mid Y,\omega)=N(\omega^{T}Y,\sigma_n^{2})$

where N denotes the Gaussian distribution, $\sigma_n^{2}$ denotes the Gaussian noise variance, and $\omega=\{\omega_1,\omega_2,\ldots,\omega_M\}$ denotes the weights assigned to $\{\hat{y}_1^{t},\hat{y}_2^{t},\ldots,\hat{y}_M^{t}\}$;
(4.6) Calculate the probability of the full observation vector $y^{*}$:

$p(y^{*}\mid \hat{Y},\omega)=N(\hat{Y}^{T}\omega,\sigma_n^{2}I)$

where $\hat{Y}$ stacks the M models' predictions at all sampling instants and I denotes the identity matrix;
(4.7) Assume the prior of the weights ω obeys a Gaussian distribution with mean 0 and variance $\Sigma_p$:

$\omega \sim N(0,\Sigma_p)$

Calculate the posterior probability distribution of the weights ω according to Bayesian inference theory:

$p(\omega\mid \hat{Y},y^{*})=N(\bar{\omega},A^{-1})$

where $\bar{\omega}=\sigma_n^{-2}A^{-1}\hat{Y}y^{*}$ and $A=\sigma_n^{-2}\hat{Y}\hat{Y}^{T}+\Sigma_p^{-1}$;
(5) Predict the real-time load;
(5.1) Collect the target user's historical load data and temperature data in real time, and construct the input features $\tilde{x}^{*}$ according to steps (1.1)–(1.4);
(5.2) From the input features $\tilde{x}^{*}$, obtain the M models' predictions $\hat{y}_{*}$ and calculate the probability distribution of the target user's real-time load prediction:

$f_{*}\mid \hat{y}_{*},\hat{Y},y^{*}\sim N(\sigma_n^{-2}\hat{y}_{*}^{T}A^{-1}\hat{Y}y^{*},\ \hat{y}_{*}^{T}A^{-1}\hat{y}_{*})$

where $f_{*}$ is the predictive probability distribution;
(5.3) Take the mean $\bar{f}_{*}$ of the probability distribution as the target user's load prediction at time $T_1+1$.
The objectives of the invention are achieved as follows:
according to the short-term load prediction method under the small sample set based on transfer learning, historical load data and corresponding temperatures of a plurality of source domain users are collected, so that a plurality of input features are constructed; and training a plurality of depth residual error network models by using input features, carrying out adaptive migration combination on the depth residual error network models by using a Bayesian weighted probability averaging method, and carrying out real-time load prediction of a target user after the migration combination is completed.
Meanwhile, the short-term load prediction method under a small sample set based on transfer learning of the invention has the following beneficial effects:
(1) The deep residual network adopted by the invention has a skip-connection structure, which reduces information loss and gradient vanishing during training, reduces information loss during transfer, and improves the robustness of the transfer;
(2) Adaptive transfer combination is performed under the condition of limited target-user load data, and the model combination process can eliminate the influence of negative-transfer information (i.e., of source-domain users whose load characteristics differ greatly from the target user's), thereby constructing an optimal small-sample load prediction model suited to the target user;
(3) The invention adopts the Bayesian weighted probability method to adaptively combine the transferred models; the process uses maximum likelihood estimation and solves the weights of the embedded models by maximizing the posterior probability; the method uses 100% of the samples and provides probability density forecasts, thereby quantifying the uncertainty of the prediction.
Drawings
FIG. 1 is a flow chart of a short-term load prediction method under a small sample set based on transfer learning;
FIG. 2 is a flow chart of data acquisition and preprocessing;
FIG. 3 is a schematic diagram of the structure of a depth residual network;
FIG. 4 shows deterministic load prediction curves of the method of the present invention and several other methods;
fig. 5 is a graph showing probability density predictions for different prediction intervals for the method of the present invention.
Detailed Description
The following description of embodiments of the invention is presented in conjunction with the accompanying drawings so that those skilled in the art may better understand the invention. It should be expressly noted that, in the description below, detailed descriptions of known functions and designs are omitted where they might obscure the present invention.
Examples
FIG. 1 is a flow chart of a short-term load prediction method under a small sample set based on transfer learning.
In this embodiment, the selected source-domain users have a certain correlation with the target user's load sequence, belonging to the same residential area and city, which ensures the validity of the transferred knowledge. As shown in fig. 1, the short-term load prediction method under a small sample set based on transfer learning of the invention comprises the following steps:
S1, data acquisition and preprocessing, the specific flow of which is shown in FIG. 2;
S1.1, set the load sampling period T; in this embodiment, the load sampling period is set to 1 hour, i.e., 24 points per day;
S1.2, collect the historical load data $x_{load}$ and corresponding temperature $x_{temp}$ of M = 19 source-domain users at the load sampling period T, so as to construct a data set and a temperature set; denote the data set constructed for the i-th source-domain user as $X_{load}^{i}=\{x_{load}^{i,t}\}$ and its temperature set as $X_{temp}^{i}=\{x_{temp}^{i,t}\}$, where $x_{load}^{i,t}$ and $x_{temp}^{i,t}$ respectively denote the historical load data and the corresponding temperature collected for the i-th source-domain user at the t-th sampling instant, i = 1, 2, …, M;
S1.3, remove the outliers from $X_{load}^{i}$ and $X_{temp}^{i}$ respectively, then linearly interpolate the gap values to obtain the data sample $X^{i}=\{X_{load}^{i},X_{temp}^{i}\}$; finally, normalize the data sample $X^{i}$ to obtain the normalized data sample $\bar{X}^{i}$;
S1.4, append a time characterization variable $x_{time}$ to each data sample $\bar{X}^{i}$; $x_{time}$ comprises a time-of-day variable (i.e., the hour of the day), a day-level variable (i.e., the day of the week) and a holiday variable (0 for working days, 1 for weekends), all in one-hot form, and $\{\bar{X}^{i},x_{time}\}$ is taken as the input features, finally constructing the M input features $\tilde{X}^{i}$, i = 1, 2, …, M;
S2, build the deep residual network model;
As shown in fig. 3, the deep residual network is formed by L residual blocks with skip connections. The specific connection scheme is as follows: the output of the l-th residual block is added to the output of the (l+2)-th residual block as the input of the (l+3)-th residual block, the output of the (l+1)-th residual block is added to the output of the (l+3)-th residual block as the input of the (l+4)-th residual block, and so on, where l = 1, 2, …. This skip-connection structure reduces information loss and gradient vanishing during training.
Each residual block is composed of convolution layers, normalization layers and a ReLU activation function layer; the convolution kernel size of the convolution layers is set to 3×3.
In this embodiment, each residual block is expressed by the equation:

$x_{l+1}=x_{l}+F(x_{l},W_{l})$

where $W_l$ denotes the parameters of the l-th residual block; $x_l$ denotes the input of the l-th residual block; $x_{l+1}$ denotes the output of the l-th residual block and also serves as the input of the (l+1)-th residual block; l = 1, 2, …, L, with L denoting the number of residual blocks;
S3, train the deep residual network model on the source-domain user data;
S3.1, randomly select the data of a batch of sampling instants from the i-th input feature $\tilde{X}^{i}$ as one round of training data, then input the data of each instant into the deep residual network model in sequence; the input layer converts the framed data into tensor form and feeds it to the serial residual blocks;
S3.2, in the deep residual network model, let the input tensor of the l-th residual block be $Z^{(l-1)}$. In the left branch of the l-th residual block, the tensor $Z^{(l-1)}$ undergoes feature extraction through convolution kernels composed of several dilated causal convolutions, passing sequentially through a convolution layer, a normalization layer, a ReLU activation function layer, a convolution layer and a normalization layer to obtain the left-branch output tensor $F(Z^{(l-1)})$; in the right branch of the l-th residual block, the tensor $Z^{(l-1)}$ undergoes a 1×1 convolution so that its output tensor $G(Z^{(l-1)})$ matches the dimension of the left-branch output tensor; the outputs of the two branches are then added to obtain the output of the l-th residual block $Z^{(l)}=F(Z^{(l-1)})+G(Z^{(l-1)})$. The output $Z^{(l)}$ of the l-th residual block and the output of the (l-2)-th residual block are added to obtain the input $(Z^{(l)}+Z^{(l-2)})$ of the (l+1)-th residual block;
S3.3, repeat step S3.2 until the last residual block outputs $Z^{(L)}$; finally feed $Z^{(L)}$ to two fully connected layers connected in parallel, whose output is recorded as $\hat{y}_t$ and used as the predicted value at time t;
S3.4, after this round of training data is finished, calculate the loss function value MAPE of this training round:

$\mathrm{MAPE}=\frac{1}{N}\sum_{t=1}^{N}\left|\frac{y_{t}-\hat{y}_{t}}{y_{t}}\right|\times 100\%$

where $y_t$ is the observation at time t and N is the number of samples in the round;
S3.5, set a loss threshold δ; calculate the difference ΔMAPE between the loss function value after the current training round and that after the previous round, and compare ΔMAPE with δ; if ΔMAPE ≤ δ, training is finished and the i-th deep residual network model is obtained; otherwise, update the weights of the deep residual network with a batch gradient descent algorithm and return to step S3.1 for the next round of training;
S3.6, train the deep residual network with the M input features according to steps S3.1–S3.5, finally obtaining M deep residual network models, denoted $\{F_1,F_2,\ldots,F_i,\ldots,F_M\}$;
S4, adaptively combine the deep residual network models using the Bayesian weighted probability averaging method;
S4.1, set the acquisition period $T_1$ of the target user;
S4.2, sample at the period $T_1$ the target user's load values, which contain only limited historical load data (the lack of historical load data may be because the user is a new user, has recently moved in, or has a damaged smart meter, etc.), e.g. 1–2 days of load data $X_{load}^{*}=\{x_{load}^{*,t}\}$ and temperature data $X_{temp}^{*}=\{x_{temp}^{*,t}\}$, where $x_{load}^{*,t}$ and $x_{temp}^{*,t}$ respectively denote the historical load data and the corresponding temperature collected for the target user at the t-th sampling instant; then construct the input features according to steps S1.3–S1.4, denoted $\tilde{X}^{*}$;
S4.3, construct the input feature pairs $D^{*}=\{(\tilde{x}^{*,t},y_t^{*})\}$, where $\tilde{x}^{*,t}$ denotes the input features at time t and $y_t^{*}$ denotes the load observation at time t;
S4.4, input the features $\tilde{x}^{*,t}$ into the M deep residual network models respectively to obtain the prediction outputs $Y=\{\hat{y}_1^{t},\hat{y}_2^{t},\ldots,\hat{y}_M^{t}\}$, where $\hat{y}_i^{t}$ denotes the predicted value of the i-th deep residual network model at time t;
S4.5, calculate the likelihood of the observation $y_t^{*}$:

$p(y_t^{*}\mid Y,\omega)=N(\omega^{T}Y,\sigma_n^{2})$

where N denotes obeying a Gaussian distribution, $\omega=\{\omega_1,\omega_2,\ldots,\omega_M\}$ denotes the weights assigned to $\{\hat{y}_1^{t},\hat{y}_2^{t},\ldots,\hat{y}_M^{t}\}$, and $\sigma_n^{2}$ denotes the noise variance;
S4.6, calculate the probability of the full observation vector $y^{*}$:

$p(y^{*}\mid \hat{Y},\omega)=N(\hat{Y}^{T}\omega,\sigma_n^{2}I)$

where $\hat{Y}$ stacks the M models' predictions at all sampling instants and I denotes the identity matrix;
S4.7, convert the model combination problem into solving the posterior distribution of the weights ω. The Bayesian approach requires assigning a prior to all the weights, which represents the belief held about the weights before any observations are made. Assume the prior of the weights ω obeys a Gaussian distribution with mean 0 and variance $\Sigma_p$:

$\omega \sim N(0,\Sigma_p)$

Calculate the posterior probability distribution of the weights ω according to Bayesian inference theory:

$p(\omega\mid \hat{Y},y^{*})=N(\bar{\omega},A^{-1})$

where $\bar{\omega}=\sigma_n^{-2}A^{-1}\hat{Y}y^{*}$ and $A=\sigma_n^{-2}\hat{Y}\hat{Y}^{T}+\Sigma_p^{-1}$. The posterior mean $\bar{\omega}$ (a 1×M vector) is multiplied with the model predictions Y, i.e. $\bar{\omega}^{T}Y$, thereby realizing the transfer combination of the M deep residual network models. In this embodiment, the transfer combination is performed under the scenario of limited target-user load data, and the model combination process can eliminate the influence of negative-transfer information, so as to construct an optimal small-sample load prediction model suited to the target user.
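Continuing the NumPy sketch given after step (4.7) of the summary above, a hypothetical usage for this embodiment (M = 19 source models, two days of hourly target data; the random arrays stand in for real model outputs) might look like:

```python
import numpy as np

rng = np.random.default_rng(0)
M, n = 19, 48                        # 19 source models, 48 hourly target samples
Y_hat = rng.random((M, n))           # predictions of the transferred models
y_target = rng.random(n)             # target user's observed loads
w_bar, A_inv = bayesian_combination(Y_hat, y_target,
                                    sigma_n2=0.01, Sigma_p=np.eye(M))
y_hat_next = rng.random(M)           # the 19 models' forecasts for time T1 + 1
mean, var = predictive_distribution(y_hat_next, w_bar, A_inv)
print(f"forecast {mean:.3f} +/- {np.sqrt(var):.3f}")
```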
S5, predicting the real-time load;
s5.1, collecting historical load data and temperature data of a target user in real time, and constructing input characteristics according to steps S1.1-S1.4
Figure GDA0004094976870000096
S5.2, from the input features $\tilde{x}^{*}$, obtain the M models' predictions $\hat{y}_{*}$ and calculate the probability distribution of the target user's real-time load prediction:

$f_{*}\mid \hat{y}_{*},\hat{Y},y^{*}\sim N(\sigma_n^{-2}\hat{y}_{*}^{T}A^{-1}\hat{Y}y^{*},\ \hat{y}_{*}^{T}A^{-1}\hat{y}_{*})$

where $f_{*}$ is the predictive probability distribution;
S5.3, take the mean $\bar{f}_{*}$ of the probability distribution as the target user's load prediction at time $T_1+1$, and use the variance $\sigma_{*}^{2}$ to evaluate the uncertainty of the load prediction at time $T_1+1$.
Verification:
In order to evaluate deterministic prediction and uncertainty prediction accurately and effectively, the invention uses MAPE to evaluate the deterministic prediction effect and CRPS to evaluate the uncertainty prediction effect, the CRPS expression being:

$\mathrm{CRPS}(F,y)=\int_{-\infty}^{+\infty}\left(F(x)-\mathbf{1}\{x\ge y\}\right)^{2}dx$

where F denotes the cumulative distribution function (CDF) of the forecast and y the observation. CRPS comprehensively evaluates both the reliability of the probability density estimate and the sharpness of the prediction interval.
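Since the method's predictive distribution in step S5.2 is Gaussian, the CRPS can be evaluated in closed form; a small sketch using the standard Gaussian CRPS formula (SciPy is assumed) is:

```python
import numpy as np
from scipy.stats import norm

def crps_gaussian(mu: float, sigma: float, y: float) -> float:
    """Closed-form CRPS of a Gaussian forecast N(mu, sigma^2) against the
    observation y (standard formula for Gaussian predictive distributions)."""
    z = (y - mu) / sigma
    return sigma * (z * (2.0 * norm.cdf(z) - 1.0)
                    + 2.0 * norm.pdf(z) - 1.0 / np.sqrt(np.pi))
```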
Table 1 compares the deterministic prediction performance, i.e. the MAPE values, of the method of the present invention and several other methods for different users. The linear regression model LR, Gaussian process GP, long short-term memory neural network LSTM and Bayesian linear regression BLR are trained only on the target user's limited historical data; the multi-kernel transfer model M-BTMKR, the traditional weighted average method WA and the best-performing single transfer model BI serve as transfer prediction methods. The proposed model is clearly superior to the other models in every scenario. The average prediction error of the method is only 3.2%, which means the method can be further applied in actual engineering and production. FIG. 4 shows deterministic load prediction curves of the method of the present invention and several other methods. As can be seen from fig. 4, the predicted values of the method of the present invention are significantly closer to the actual values than those of the other methods, which means the method has higher prediction accuracy than the others.
Table 2 compares the uncertainty prediction performance of the method of the present invention with that of several other methods, i.e. the CRPS of their probability density forecasts, where L-GP denotes the GP forecast under limited historical data only, L-BLR the BLR forecast under limited historical data only, S-GP the GP forecast under sufficient historical data, and S-BLR the BLR forecast under sufficient historical data. The comparison methods are less effective than the method of the present invention in both the limited and the sufficient historical data scenarios; even though the S-GP and S-BLR models are trained on sufficient samples, their CRPS is inferior to that of the method of the present invention for every user. FIG. 5 shows the prediction interval of the method at 90% confidence: essentially 90% of the samples fall within the prediction interval, which illustrates the high reliability of the method. Based on the prediction interval obtained from the uncertainty estimate, risk-aware decisions can be made in actual production, and the uncertainty of the prediction can be quantified.
Table 1 shows the MAPE [%] comparison for the different models.
Table 2 shows the CRPS comparison for the different models.
while the foregoing describes illustrative embodiments of the present invention to facilitate an understanding of the present invention by those skilled in the art, it should be understood that the present invention is not limited to the scope of the embodiments, but is to be construed as protected by the accompanying claims insofar as various changes are within the spirit and scope of the present invention as defined and defined by the appended claims.

Claims (2)

1. A short-term load prediction method under a small sample set based on transfer learning, characterized by comprising the following steps:
(1) Data acquisition and preprocessing;
(1.1) Set the load sampling period T;
(1.2) Collect the historical load data $x_{load}$ and corresponding temperature $x_{temp}$ of the M source-domain users at the load sampling period T, so as to construct a data set and a temperature set; denote the data set constructed for the i-th source-domain user as $X_{load}^{i}=\{x_{load}^{i,t}\}$ and its temperature set as $X_{temp}^{i}=\{x_{temp}^{i,t}\}$, where $x_{load}^{i,t}$ and $x_{temp}^{i,t}$ respectively denote the historical load data and the corresponding temperature collected for the i-th source-domain user at the t-th sampling instant, i = 1, 2, …, M;
(1.3) Remove the outliers from $X_{load}^{i}$ and $X_{temp}^{i}$, then fill the resulting gaps by linear interpolation to obtain the data sample $X^{i}=\{X_{load}^{i},X_{temp}^{i}\}$; finally, normalize the data sample $X^{i}$ to obtain the normalized data sample $\bar{X}^{i}$;
(1.4) Append a time characterization variable $x_{time}$ to each data sample $\bar{X}^{i}$; $x_{time}$ comprises a time-of-day variable, a day-of-week variable and a holiday variable, all in one-hot form, and $\{\bar{X}^{i},x_{time}\}$ is taken as the input features, finally constructing the M input features $\tilde{X}^{i}$, i = 1, 2, …, M;
(2) Build the deep residual network model;
The deep residual network is formed by connecting L residual blocks with skip connections, where each residual block consists of convolution layers, normalization layers and a ReLU activation function layer;
(3) Train the deep residual network model on the source-domain user data;
(3.1) Randomly select the data of a batch of sampling instants from the i-th input feature $\tilde{X}^{i}$ as one round of training data, then input the data of each instant into the deep residual network model in sequence; the input layer converts the framed data into tensor form and feeds it to the serial residual blocks;
(3.2) In the deep residual network model, let the input tensor of the l-th residual block be $Z^{(l-1)}$. In the left branch of the l-th residual block, the tensor $Z^{(l-1)}$ undergoes feature extraction through convolution kernels composed of several dilated causal convolutions, passing sequentially through a convolution layer, a normalization layer, a ReLU activation function layer, a convolution layer and a normalization layer to obtain the left-branch output tensor $F(Z^{(l-1)})$; in the right branch of the l-th residual block, the tensor $Z^{(l-1)}$ undergoes a 1×1 convolution so that its output tensor $G(Z^{(l-1)})$ matches the dimension of the left-branch output tensor; the outputs of the two branches are then added to obtain the output of the l-th residual block $Z^{(l)}=F(Z^{(l-1)})+G(Z^{(l-1)})$. The output $Z^{(l)}$ of the l-th residual block and the output of the (l-2)-th residual block are added to obtain the input $(Z^{(l)}+Z^{(l-2)})$ of the (l+1)-th residual block;
(3.3) Repeat step (3.2) until the last residual block outputs $Z^{(L)}$; finally feed $Z^{(L)}$ to two fully connected layers connected in parallel, whose output is recorded as $\hat{y}_t$ and used as the predicted value at time t;
(3.4) After this round of training data is finished, calculate the loss function value MAPE of this training round:

$\mathrm{MAPE}=\frac{1}{N}\sum_{t=1}^{N}\left|\frac{y_{t}-\hat{y}_{t}}{y_{t}}\right|\times 100\%$

where $y_t$ is the observation at time t and N is the number of samples in the round;
(3.5) Set a loss threshold δ; calculate the difference ΔMAPE between the loss function value after the current training round and that after the previous round, and compare ΔMAPE with δ; if ΔMAPE ≤ δ, training is finished and the i-th deep residual network model is obtained; otherwise, update the weights of the deep residual network with a batch gradient descent algorithm and return to step (3.1) for the next round of training;
(3.6) Train the deep residual network with the M input features according to steps (3.1)–(3.5), finally obtaining M deep residual network models, denoted $\{F_1,F_2,\ldots,F_i,\ldots,F_M\}$;
(4) Adaptively combine the deep residual network models using the Bayesian weighted probability averaging method;
(4.1) Set the acquisition period $T_1$ of the target user;
(4.2) Sample the target user's small-sample historical load data at the sampling period $T_1$ to obtain the load data set $X_{load}^{*}=\{x_{load}^{*,t}\}$ and the temperature data set $X_{temp}^{*}=\{x_{temp}^{*,t}\}$, where $x_{load}^{*,t}$ and $x_{temp}^{*,t}$ respectively denote the historical load data and the corresponding temperature collected for the target user at the t-th sampling instant; then construct the input features according to steps (1.3)–(1.4), denoted $\tilde{X}^{*}$;
(4.3) Construct the input feature pairs $D^{*}=\{(\tilde{x}^{*,t},y_t^{*})\}$, where $\tilde{x}^{*,t}$ denotes the input features at time t and $y_t^{*}$ denotes the load observation at time t;
(4.4) Input the features $\tilde{x}^{*,t}$ into the M deep residual network models respectively to obtain the prediction outputs $Y=\{\hat{y}_1^{t},\hat{y}_2^{t},\ldots,\hat{y}_M^{t}\}$, where $\hat{y}_i^{t}$ denotes the predicted value of the i-th deep residual network model at time t;
(4.5) Calculate the likelihood of the observation $y_t^{*}$:

$p(y_t^{*}\mid Y,\omega)=N(\omega^{T}Y,\sigma_n^{2})$

where N denotes the Gaussian distribution, $\sigma_n^{2}$ denotes the Gaussian noise variance, and $\omega=\{\omega_1,\omega_2,\ldots,\omega_M\}$ denotes the weights assigned to $\{\hat{y}_1^{t},\hat{y}_2^{t},\ldots,\hat{y}_M^{t}\}$;
(4.6) Calculate the probability of the full observation vector $y^{*}$:

$p(y^{*}\mid \hat{Y},\omega)=N(\hat{Y}^{T}\omega,\sigma_n^{2}I)$

where $\hat{Y}$ stacks the M models' predictions at all sampling instants and I denotes the identity matrix;
(4.7) Assume the prior of the weights ω obeys a Gaussian distribution with mean 0 and variance $\Sigma_p$:

$\omega \sim N(0,\Sigma_p)$

Calculate the posterior probability distribution of the weights ω according to Bayesian inference theory:

$p(\omega\mid \hat{Y},y^{*})=N(\bar{\omega},A^{-1})$

where $\bar{\omega}=\sigma_n^{-2}A^{-1}\hat{Y}y^{*}$ and $A=\sigma_n^{-2}\hat{Y}\hat{Y}^{T}+\Sigma_p^{-1}$;
(5) Predict the real-time load;
(5.1) Collect the target user's historical load data and temperature data in real time, and construct the input features $\tilde{x}^{*}$ according to steps (1.1)–(1.4);
(5.2) From the input features $\tilde{x}^{*}$, obtain the M models' predictions $\hat{y}_{*}$ and calculate the probability distribution of the target user's real-time load prediction:

$f_{*}\mid \hat{y}_{*},\hat{Y},y^{*}\sim N(\sigma_n^{-2}\hat{y}_{*}^{T}A^{-1}\hat{Y}y^{*},\ \hat{y}_{*}^{T}A^{-1}\hat{y}_{*})$

where $f_{*}$ is the predictive probability distribution;
(5.3) Take the mean $\bar{f}_{*}$ of the probability distribution as the target user's load prediction at time $T_1+1$.
2. The short-term load prediction method under a small sample set based on transfer learning according to claim 1, wherein each residual block is expressed as:

$x_{l+1}=x_{l}+F(x_{l},W_{l})$

where $W_l$ denotes the parameters of the l-th residual block; $x_l$ denotes the input of the l-th residual block; $x_{l+1}$ denotes the output of the l-th residual block and also serves as the input of the (l+1)-th residual block; l = 1, 2, …, L, with L denoting the number of residual blocks.
CN202111442332.1A 2021-11-30 2021-11-30 Short-term load prediction method based on transfer learning under small sample set Active CN114169416B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111442332.1A CN114169416B (en) Short-term load prediction method based on transfer learning under small sample set

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111442332.1A CN114169416B (en) Short-term load prediction method based on transfer learning under small sample set

Publications (2)

Publication Number Publication Date
CN114169416A CN114169416A (en) 2022-03-11
CN114169416B (en) 2023-04-21

Family

ID=80481698

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111442332.1A Active CN114169416B (en) Short-term load prediction method based on transfer learning under small sample set

Country Status (1)

Country Link
CN (1) CN114169416B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102023000954A1 (en) 2023-03-13 2024-09-19 Mercedes-Benz Group AG Method for predicting the power requirements of electrical components of a vehicle arranged in an on-board network
CN117081082B (en) * 2023-10-17 2024-01-23 国网上海市电力公司 Active power distribution network operation situation sensing method and system based on Gaussian process regression

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103903067A (en) * 2014-04-09 2014-07-02 上海电机学院 Short-term combination forecasting method for wind power
CN109711620A (en) * 2018-12-26 2019-05-03 浙江大学 A kind of Short-Term Load Forecasting Method based on GRU neural network and transfer learning
WO2019141040A1 (en) * 2018-01-22 2019-07-25 佛山科学技术学院 Short term electrical load predication method
CN110969293A (en) * 2019-11-22 2020-04-07 上海交通大学 Short-term generalized load prediction method based on transfer learning
WO2021042935A1 (en) * 2019-09-05 2021-03-11 苏州大学 Bearing service life prediction method based on hidden markov model and transfer learning
CN113032916A (en) * 2021-03-03 2021-06-25 安徽大学 Electromechanical device bearing fault prediction method based on Bayesian network of transfer learning
CN113111578A (en) * 2021-04-01 2021-07-13 上海晨翘智能科技有限公司 Power load prediction method, power load prediction device, computer equipment and storage medium
CN113627659A (en) * 2021-07-29 2021-11-09 南京亚派软件技术有限公司 Garden demand side short-term load prediction system and method based on depth residual error network

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103903067A (en) * 2014-04-09 2014-07-02 上海电机学院 Short-term combination forecasting method for wind power
WO2019141040A1 (en) * 2018-01-22 2019-07-25 佛山科学技术学院 Short term electrical load predication method
CN109711620A (en) * 2018-12-26 2019-05-03 浙江大学 A kind of Short-Term Load Forecasting Method based on GRU neural network and transfer learning
WO2021042935A1 (en) * 2019-09-05 2021-03-11 苏州大学 Bearing service life prediction method based on hidden markov model and transfer learning
CN110969293A (en) * 2019-11-22 2020-04-07 上海交通大学 Short-term generalized load prediction method based on transfer learning
CN113032916A (en) * 2021-03-03 2021-06-25 安徽大学 Electromechanical device bearing fault prediction method based on Bayesian network of transfer learning
CN113111578A (en) * 2021-04-01 2021-07-13 上海晨翘智能科技有限公司 Power load prediction method, power load prediction device, computer equipment and storage medium
CN113627659A (en) * 2021-07-29 2021-11-09 南京亚派软件技术有限公司 Garden demand side short-term load prediction system and method based on depth residual error network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Tehreem Ashfaq et al. Short-term Electricity Load and Price Forecasting using Enhanced KNN. 2019 International Conference on Frontiers of Information Technology (FIT). 2020, 266-271. *
Su Juan et al. Short-term load forecasting method based on modal combination. Transactions of the Chinese Society of Agricultural Engineering. 2021, Vol. 37, No. 14, 186-196. *
Zhao Pengfei. Research on short-term power load forecasting based on deep residual networks. China Master's Theses Full-text Database, Engineering Science and Technology II. 2023, No. 01, C042-1916. *

Also Published As

Publication number Publication date
CN114169416A (en) 2022-03-11

Similar Documents

Publication Publication Date Title
CN110969290B (en) Runoff probability prediction method and system based on deep learning
CN111079989B (en) DWT-PCA-LSTM-based water supply amount prediction device for water supply company
CN111861013B (en) Power load prediction method and device
CN114169416B (en) Short-term load prediction method based on transfer learning under small sample set
CN113128113B (en) Lean information building load prediction method based on deep learning and transfer learning
CN112100911B (en) Solar radiation prediction method based on depth BILSTM
CN114297036B (en) Data processing method, device, electronic equipment and readable storage medium
CN110910004A (en) Reservoir dispatching rule extraction method and system with multiple uncertainties
CN111461463A (en) Short-term load prediction method, system and equipment based on TCN-BP
CN113554466A (en) Short-term power consumption prediction model construction method, prediction method and device
CN109919356A (en) One kind being based on BP neural network section water demand prediction method
CN115907131B (en) Method and system for constructing electric heating load prediction model in northern area
CN112232604A (en) Prediction method for extracting network traffic based on Prophet model
CN115619028A (en) Clustering algorithm fusion-based power load accurate prediction method
CN115456287A (en) Long-and-short-term memory network-based multi-element load prediction method for comprehensive energy system
CN113240181B (en) Rolling simulation method and device for reservoir dispatching operation
CN113762591B (en) Short-term electric quantity prediction method and system based on GRU and multi-core SVM countermeasure learning
CN118137582A (en) Multi-target dynamic scheduling method and system based on regional power system source network charge storage
CN111612648B (en) Training method and device for photovoltaic power generation prediction model and computer equipment
CN117252367A (en) Method and system for evaluating demand response potential of field countermeasure-based transfer learning model
CN113159395A (en) Deep learning-based sewage treatment plant water inflow prediction method and system
Guo et al. Short-Term Water Demand Forecast Based on Deep Neural Network:(029)
CN116632841A (en) Power distribution area short-term electricity load prediction method and system integrating multiple time sequence characteristics
Viana et al. Load forecasting benchmark for smart meter data
CN113836814B (en) Solar energy prediction method based on multi-flow neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant