CN114169416B - Short-term load prediction method based on migration learning under small sample set - Google Patents
- Publication number
- CN114169416B (application CN202111442332.1A)
- Authority
- CN
- China
- Prior art keywords
- data
- load
- residual block
- training
- input
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06F18/24155—Bayesian classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
The invention discloses a short-term load prediction method for small sample sets based on transfer learning. Historical load data and the corresponding temperatures of a plurality of source-domain users are first collected to construct a plurality of input features; a deep residual network model is then trained on each input feature, the trained models are adaptively combined via a Bayesian weighted probability averaging method, and, once the transfer combination is complete, real-time load prediction is performed for the target user.
Description
Technical Field
The invention belongs to the technical field of power load prediction, and particularly relates to a short-term load prediction method for small sample sets based on transfer learning.
Background
Load prediction plays an extremely important role in the operation and control of modern power systems. However, with large-scale grid connection of renewable generation, the popularization of electric vehicles and increasingly diverse patterns of power consumption, the complexity and uncertainty of modern power systems keep growing, which poses additional challenges for power system management. In response, accurate residential load prediction can reduce operating costs and promote intelligent operation of the power grid. For example, given accurate and reliable load predictions for individual users, the detrimental effects of peak-valley demand swings can be curtailed through measures such as energy storage management and demand response.
To achieve accurate and reliable load prediction, many machine learning and deep learning methods have emerged. Machine learning methods include support vector regression (SVR) and decision trees; deep learning methods include deep residual networks (ResNet) and long short-term memory networks (LSTM). However, machine learning and deep learning models have two distinct drawbacks: first, a large amount of historical data is needed to train the model parameters; second, as point-estimate parametric models, they cannot quantify the uncertainty of the load prediction.
Moreover, residential-side electricity usage is more irregular and more sensitive to consumer behavior than that of the high-voltage side. At the same time, the lack of labeled historical data is a very common problem in power systems. As a result, the machine learning and deep learning methods described above struggle to accomplish both deterministic and uncertainty prediction when only a small sample of historical load data is available. A method is therefore needed that can accomplish the load prediction task when the historical load data of residential users is limited.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a short-term load prediction method for small sample sets based on transfer learning, which achieves accurate short-term load prediction for a target user under limited historical load data by exploiting the temporal and spatial correlation between source-domain users and the target user.
In order to achieve the above object, the invention provides a short-term load prediction method for small sample sets based on transfer learning, characterized by comprising the following steps:
(1) Data acquisition and preprocessing;
(1.1) setting a load sampling period T;
(1.2) collecting the historical load data x_load and corresponding temperature x_temp of M source-domain users according to the load sampling period T to construct a data set and a temperature set; the data set constructed for the i-th source-domain user is recorded as X_load^i = {x_load^(i,t)} and the temperature set as X_temp^i = {x_temp^(i,t)}, where x_load^(i,t) and x_temp^(i,t) respectively denote the historical load datum and corresponding temperature collected for the i-th source-domain user at the t-th sampling time, i = 1, 2, …, M;
(1.3) removing the abnormal values in X_load^i and X_temp^i, then filling the resulting gaps by linear interpolation to obtain the data sample X_i; finally, normalizing the data sample X_i to obtain the normalized data sample X̄_i;
(1.4) adding the time-characterization variable x_time^(i,t) to the data sample X̄_i; x_time^(i,t) comprises a time-sequence variable, a day-level variable and a holiday variable, takes the form of one-hot coding, and together they serve as the input features; finally the M input features are constructed and denoted D = {D_1, D_2, …, D_M};
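Steps (1.3)-(1.4) can be sketched as follows. The quantile-based outlier rule, the min-max normalization variant and the 24 + 7 + 1 one-hot layout are illustrative assumptions; the patent does not fix these details:

```python
import numpy as np

def preprocess_load(series, low_q=0.01, high_q=0.99):
    """Mark abnormal values (outside empirical quantiles) as missing,
    fill the gaps by linear interpolation, then min-max normalize."""
    x = np.asarray(series, dtype=float).copy()
    lo, hi = np.nanquantile(x, [low_q, high_q])
    x[(x < lo) | (x > hi)] = np.nan
    idx = np.arange(len(x))
    good = ~np.isnan(x)
    x = np.interp(idx, idx[good], x[good])          # linear interpolation
    return (x - x.min()) / (x.max() - x.min() + 1e-12)  # normalization

def time_features(hour, weekday, is_holiday):
    """One-hot time-characterization variable: hour of day (24),
    day of week (7), holiday flag (1)."""
    f = np.zeros(24 + 7 + 1)
    f[hour] = 1.0
    f[24 + weekday] = 1.0
    f[31] = float(is_holiday)
    return f
```

The normalized load slice and the one-hot time vector are concatenated per sampling time to form one row of the input feature D_i.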
(2) Building a deep residual network model;
the depth residual network is formed by jumping and connecting L residual blocks, wherein each residual block consists of a convolution layer, a normalization layer and a Relu activation function layer;
(3) Training the deep residual network model based on source-domain user data;
(3.1) randomly selecting the data at N_b sampling times from the i-th input feature D_i as one round of training data, then sequentially inputting the data at each time into the deep residual network model, converting the framed data into tensor form through an input layer, and feeding the tensors into the serial residual blocks;
(3.2) in the deep residual network model, let the input tensor of the l-th residual block be Z^(l-1); in the left branch of the l-th residual block, features are extracted from Z^(l-1) by a convolution kernel composed of several dilated causal convolutions, after which a convolution layer, a normalization layer, a ReLU activation function layer, a convolution layer and a normalization layer are applied in sequence to obtain the output tensor of the left branch; in the right branch of the l-th residual block, Z^(l-1) is passed through a 1×1 convolution so that its output tensor matches the dimension of the left-branch output; the output tensors of the two branches are then added to obtain the output Z^(l) of the l-th residual block; the output Z^(l) of the l-th residual block and the output of the (l-2)-th residual block are added to obtain the input (Z^(l) + Z^(l-2)) of the (l+1)-th residual block;
(3.3) repeating step (3.2) until the last residual block outputs Z^(L); finally Z^(L) is fed to two fully connected layers in parallel, whose output is recorded as ŷ_t and used as the predicted value at time t;
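A minimal numpy sketch of the residual-block computation in step (3.2), with one-dimensional tensors, a single dilated causal convolution standing in for the left branch's full stack of convolution, normalization and ReLU layers, and a scalar standing in for the 1×1 right-branch projection (all simplifying assumptions):

```python
import numpy as np

def relu(z):
    """Element-wise rectified linear unit."""
    return np.maximum(z, 0.0)

def causal_dilated_conv1d(x, w, dilation=1):
    """Causal dilated 1-D convolution: the output at time t only sees
    x[t], x[t-d], x[t-2d], ... (the past is zero-padded)."""
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), np.asarray(x, dtype=float)])
    return np.array([
        sum(w[j] * xp[pad + t - j * dilation] for j in range(k))
        for t in range(len(x))
    ])

def residual_block(x, w_conv, w_skip=1.0, dilation=1):
    """Z_out = F(Z_in) + skip(Z_in): the left branch extracts features with
    a dilated causal convolution followed by ReLU; the right branch is a
    scalar stand-in for the 1x1 convolution that matches dimensions."""
    left = relu(causal_dilated_conv1d(x, w_conv, dilation))
    right = w_skip * np.asarray(x, dtype=float)
    return left + right
```

The causal padding ensures the prediction at time t never uses future load values, which is what makes the convolution usable for forecasting.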
(3.4) after this round of training is completed, calculating the loss function value MAPE of the round: MAPE = (100%/N_b) Σ_{t=1}^{N_b} |y_t − ŷ_t| / y_t, where N_b is the number of training samples in the round and y_t, ŷ_t denote the observed and predicted loads at time t;
(3.5) setting a loss threshold δ; calculating the difference ΔMAPE between the loss function value after the current round of training and that after the previous round, and comparing ΔMAPE with δ; if ΔMAPE ≤ δ, training is finished and the i-th deep residual network model is obtained; otherwise, updating the weights of the deep residual network with a batch gradient descent algorithm and returning to step (3.1) for the next round of training;
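The loss of step (3.4) and the stopping rule of step (3.5) amount to the following; taking the absolute value of ΔMAPE is an assumption, since the patent only speaks of the difference between successive rounds:

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def training_converged(mape_prev, mape_curr, delta=0.05):
    """Stop when the loss change between two successive training rounds
    is no larger than the threshold delta (absolute value assumed)."""
    return abs(mape_curr - mape_prev) <= delta
```
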
(3.6) training the deep residual network on each of the M input features according to steps (3.1)-(3.5), finally obtaining M deep residual network models, denoted {F_1, F_2, …, F_i, …, F_M};
(4) Adaptively combining the deep residual network models using the Bayesian weighted probability averaging method;
(4.1) setting the acquisition period T_1 of the target user;
(4.2) sampling the small-sample historical load data of the target user according to the sampling period T_1 to obtain a load data set X_load^tar = {x_load^(tar,t)} and a temperature data set X_temp^tar = {x_temp^(tar,t)}, where x_load^(tar,t) and x_temp^(tar,t) respectively denote the historical load datum and corresponding temperature collected from the target user at the t-th sampling time; then constructing the input features according to steps (1.3)-(1.4), denoted D_tar;
(4.3) constructing the input-feature pairs (d_t, y_t), where d_t denotes the input feature at time t and y_t denotes the load observation at time t;
(4.4) inputting the feature d_t into each of the M deep residual network models to obtain the predicted outputs ŷ_t = {ŷ_t^1, ŷ_t^2, …, ŷ_t^M}, where ŷ_t^i denotes the predicted value of the i-th deep residual network model at time t;
(4.5) modeling the target load as a weighted combination of the M model predictions: y_t = Σ_{i=1}^{M} ω_i ŷ_t^i + ε, with ε ~ N(0, σ²), where N denotes a Gaussian distribution, σ² is the variance of the Gaussian noise, and ω = {ω_1, ω_2, …, ω_M} denotes the weights assigned to the M models;
(4.6) stacking the pairs over the available sampling times gives the likelihood p(y | Y, ω) = N(Yω, σ²I), where y is the vector of load observations, Y is the matrix whose rows are the M model predictions at each time, and I denotes the identity matrix;
(4.7) assuming the prior of the weights ω is a Gaussian distribution with mean 0 and covariance Σ_p:

ω ~ N(0, Σ_p)

and calculating the posterior probability distribution of ω according to Bayesian inference theory:

ω | Y, y ~ N(σ^(-2) A^(-1) Y^T y, A^(-1)), with A = σ^(-2) Y^T Y + Σ_p^(-1),

where Y is the matrix of model predictions and y the vector of target-load observations;
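Step (4.7) is the standard weight posterior of Bayesian linear regression, with the M source models' predictions as regressors. A numpy sketch under that reading (the prior covariance Sigma_p, the noise variance sigma2, and the stacking of predictions into a (T, M) matrix Y are as described in the text):

```python
import numpy as np

def weight_posterior(Y, y, sigma2, Sigma_p):
    """Posterior N(mu, A^{-1}) of the combination weights omega for the
    model y = Y @ omega + eps, eps ~ N(0, sigma2 * I), under the Gaussian
    prior omega ~ N(0, Sigma_p).
    Y: (T, M) predictions of the M source models at T sampling times;
    y: (T,) observed target loads."""
    A = Y.T @ Y / sigma2 + np.linalg.inv(Sigma_p)   # precision matrix
    cov = np.linalg.inv(A)                          # posterior covariance
    mu = cov @ Y.T @ y / sigma2                     # posterior mean
    return mu, cov
```

When the observations dominate the prior, the posterior mean concentrates on the weights that best explain the target user's limited data, which is how poorly matched source models are downweighted.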
(5) Predicting the real-time load;
(5.1) acquiring the historical load data and temperature data of the target user in real time, and constructing the input feature d_* according to steps (1.1)-(1.4);
(5.2) from the input feature d_*, collecting the M model predictions ŷ_* = {ŷ_*^1, …, ŷ_*^M} and calculating the probability distribution of the target user's real-time load prediction:

f_* | ŷ_*, Y, y ~ N(σ^(-2) ŷ_*^T A^(-1) Y^T y, ŷ_*^T A^(-1) ŷ_*),

where f_* is the predictive probability distribution;
The object of the invention is achieved as follows:
According to the short-term load prediction method for small sample sets based on transfer learning of the invention, the historical load data and corresponding temperatures of a plurality of source-domain users are collected to construct a plurality of input features; a deep residual network model is trained on each input feature, the models are adaptively combined via a Bayesian weighted probability averaging method, and, once the transfer combination is complete, real-time load prediction is performed for the target user.
Meanwhile, the short-term load prediction method for small sample sets based on transfer learning has the following beneficial effects:
(1) The deep residual network adopted by the invention has a skip-connection structure, which reduces information loss and vanishing gradients during training, reduces information loss during transfer, and improves the robustness of the transfer;
(2) The adaptive transfer combination is performed under the condition of limited target-user load data, and the model combination process can eliminate the influence of negative-transfer information (i.e., eliminate the influence of source-domain users whose load characteristics differ greatly from the target user's), thereby constructing an optimal small-sample load prediction model suited to the target user;
(3) The invention adopts the Bayesian weighted probability method to adaptively combine the transferred models; the weights of the embedded models are solved by maximizing the posterior probability, every available sample is used, and probability density predictions are provided, thereby quantifying the uncertainty of the prediction.
Drawings
FIG. 1 is a flow chart of a short-term load prediction method under a small sample set based on transfer learning;
FIG. 2 is a flow chart of data acquisition and preprocessing;
FIG. 3 is a schematic diagram of the structure of the deep residual network;
FIG. 4 is a deterministic load prediction curve of the method of the present invention versus several other methods;
fig. 5 is a graph showing probability density predictions for different prediction intervals for the method of the present invention.
Detailed Description
The following description of the embodiments of the invention is presented in conjunction with the accompanying drawings so that those skilled in the art may better understand the invention. It is expressly noted that, in the description below, detailed descriptions of known functions and designs are omitted where they might obscure the present invention.
Examples
FIG. 1 is a flow chart of a short-term load prediction method under a small sample set based on transfer learning.
In this embodiment, the selected source-domain users have a certain correlation with the target user's load sequence (they belong to the same residential area and city), which ensures the validity of the transferred knowledge. As shown in fig. 1, the short-term load prediction method for small sample sets based on transfer learning of the invention comprises the following steps:
s1, data acquisition and preprocessing, wherein the specific flow is shown in FIG. 2;
S1.1, setting the load sampling period T; in this embodiment the load sampling period is set to 1 hour, i.e., 24 points per day;
S1.2, collecting the historical load data x_load and corresponding temperature x_temp of M = 19 source-domain users according to the load sampling period T to construct a data set and a temperature set; the data set constructed for the i-th source-domain user is recorded as X_load^i = {x_load^(i,t)} and the temperature set as X_temp^i = {x_temp^(i,t)}, where x_load^(i,t) and x_temp^(i,t) respectively denote the historical load datum and corresponding temperature collected for the i-th source-domain user at the t-th sampling time, i = 1, 2, …, M;
S1.3, removing the abnormal values in X_load^i and X_temp^i respectively, then filling the vacant values by linear interpolation to obtain the data sample X_i; finally, normalizing X_i to obtain the normalized data sample X̄_i;
S1.4, adding the time-characterization variable x_time^(i,t) to the data sample X̄_i; x_time^(i,t) comprises a time-sequence variable (the hour of each day), a day-level variable (the day of each week) and a holiday variable (0 for working days, 1 for weekends), takes the form of one-hot coding, and together they serve as the input features; finally the M input features are constructed and denoted D = {D_1, D_2, …, D_M};
S2, building a deep residual network model;
As shown in fig. 3, the deep residual network is formed by connecting L residual blocks with skip connections. The specific connection scheme is as follows: the output of the l-th residual block is added to the output of the (l+2)-th residual block to form the input of the (l+3)-th residual block, the output of the (l+1)-th residual block is added to the output of the (l+3)-th residual block to form the input of the (l+4)-th residual block, and so on, where l = 1, 2, …. Because the deep residual network adopts a structure with skip connections, the problems of information loss and vanishing gradients during training are reduced.
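The connection scheme just described can be sketched as follows, treating each residual block as a black-box function and taking the network input as Z^(0); how the first block, which has nothing two steps back, is handled is an assumption the text leaves open:

```python
def deep_residual_forward(z0, blocks):
    """Chain L residual blocks with the skip pattern described above:
    the input of block l+1 is Z^(l) + Z^(l-2), treating the network
    input z0 as Z^(0). `blocks` is a list of callables z -> z."""
    Z = [z0]                      # Z[l] = output of block l (Z[0] = input)
    inp = z0
    for l in range(1, len(blocks) + 1):
        out = blocks[l - 1](inp)
        Z.append(out)
        # two-back skip connection; block 1 has nothing two back yet
        inp = out + Z[l - 2] if l >= 2 else out
    return inp
```
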
Each residual block consists of a convolution layer, a normalization layer and a ReLU activation function layer; the convolution kernel size of the convolution layer is set to 3×3.
In this embodiment, each residual block is expressed by the equation: x_{l+1} = x_l + F(x_l; W_l), where W_l denotes the parameters of the l-th residual block, x_l is the input of the l-th residual block, x_{l+1} is the output of the l-th residual block and also serves as the input of the (l+1)-th residual block, F(·) is the residual mapping of the block, and l = 1, 2, …, L, with L denoting the number of residual blocks.
S3, training the deep residual network model based on source-domain user data;
S3.1, randomly selecting the data at N_b sampling times from the i-th input feature D_i as one round of training data, then sequentially inputting the data at each time into the deep residual network model, converting the framed data into tensor form through an input layer, and feeding the tensors into the serial residual blocks;
S3.2, in the deep residual network model, let the input tensor of the l-th residual block be Z^(l-1); in the left branch of the l-th residual block, features are extracted from Z^(l-1) by a convolution kernel composed of several dilated causal convolutions, after which a convolution layer, a normalization layer, a ReLU activation function layer, a convolution layer and a normalization layer are applied in sequence to obtain the output tensor of the left branch; in the right branch of the l-th residual block, Z^(l-1) is passed through a 1×1 convolution so that its output tensor matches the dimension of the left-branch output; the output tensors of the two branches are then added to obtain the output Z^(l) of the l-th residual block; the output Z^(l) of the l-th residual block and the output of the (l-2)-th residual block are added to obtain the input (Z^(l) + Z^(l-2)) of the (l+1)-th residual block;
S3.3, repeating step S3.2 until the last residual block outputs Z^(L); finally Z^(L) is fed to two fully connected layers in parallel, whose output is recorded as ŷ_t and used as the predicted value at time t;
S3.4, after this round of training is completed, calculating the loss function value MAPE of the round: MAPE = (100%/N_b) Σ_{t=1}^{N_b} |y_t − ŷ_t| / y_t, where N_b is the number of training samples in the round and y_t, ŷ_t denote the observed and predicted loads at time t;
S3.5, setting a loss threshold δ; calculating the difference ΔMAPE between the loss function value after the current round of training and that after the previous round, and comparing ΔMAPE with δ; if ΔMAPE ≤ δ, training is finished and the i-th deep residual network model is obtained; otherwise, updating the weights of the deep residual network with a batch gradient descent algorithm and returning to step S3.1 for the next round of training;
S3.6, training the deep residual network on each of the M input features according to steps S3.1-S3.5, finally obtaining M deep residual network models, denoted {F_1, F_2, …, F_i, …, F_M};
S4, adaptively combining the deep residual network models using the Bayesian weighted probability averaging method;
S4.1, setting the acquisition period T_1 of the target user;
S4.2, sampling the target user's load according to the sampling period T_1; the target user has only limited historical load data (the user may be a new user, a newly moved-in user, or one whose smart meter was damaged), e.g., 1-2 days of load and temperature data, yielding a load data set X_load^tar = {x_load^(tar,t)} and a temperature data set X_temp^tar = {x_temp^(tar,t)}, where x_load^(tar,t) and x_temp^(tar,t) respectively denote the historical load datum and corresponding temperature collected from the target user at the t-th sampling time; the input features are then constructed according to steps S1.3-S1.4 and denoted D_tar;
S4.3, constructing the input-feature pairs (d_t, y_t), where d_t denotes the input feature at time t and y_t denotes the load observation at time t;
S4.4, inputting the feature d_t into each of the M deep residual network models to obtain the predicted outputs ŷ_t = {ŷ_t^1, ŷ_t^2, …, ŷ_t^M}, where ŷ_t^i denotes the predicted value of the i-th deep residual network model at time t;
S4.5, modeling the target load as a weighted combination of the M model predictions: y_t = Σ_{i=1}^{M} ω_i ŷ_t^i + ε, with ε ~ N(0, σ²), where N denotes a Gaussian distribution, σ² is the variance of the Gaussian noise, and ω = {ω_1, ω_2, …, ω_M} denotes the weights assigned to the M models;
S4.6, stacking the pairs over the available sampling times gives the likelihood p(y | Y, ω) = N(Yω, σ²I), where y is the vector of load observations, Y is the matrix whose rows are the M model predictions at each time, and I denotes the identity matrix;
S4.7, the model combination problem is thus converted into solving the posterior distribution of the weights ω. Bayesian inference requires assigning a prior to all weights, which represents the confidence in the weights before any observation. Assuming the prior of ω is a Gaussian distribution with mean 0 and covariance Σ_p, ω ~ N(0, Σ_p), the posterior probability distribution of ω follows from Bayesian inference as ω | Y, y ~ N(σ^(-2) A^(-1) Y^T y, A^(-1)), with A = σ^(-2) Y^T Y + Σ_p^(-1), where Y is the matrix of model predictions and y the vector of target-load observations. Multiplying the prediction matrix Y by the posterior mean of ω realizes the transfer combination of the M deep residual network models. In this embodiment, the transfer combination is performed in the scenario of limited target-user load data, and the model combination process can eliminate the influence of negative-transfer information, thereby constructing an optimal small-sample load prediction model suited to the target user.
S5, predicting the real-time load;
S5.1, collecting the historical load data and temperature data of the target user in real time, and constructing the input feature d_* according to steps S1.1-S1.4;
S5.2, from the input feature d_*, collecting the M model predictions ŷ_* = {ŷ_*^1, …, ŷ_*^M} and calculating the probability distribution of the target user's real-time load prediction:

f_* | ŷ_*, Y, y ~ N(σ^(-2) ŷ_*^T A^(-1) Y^T y, ŷ_*^T A^(-1) ŷ_*),

where f_* is the predictive probability distribution;
S5.3, the mean of the probability distribution is taken as the target user's load prediction at time T_1 + 1, and the variance is used to evaluate the uncertainty of the load prediction at time T_1 + 1.
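In code, the predictive mean and variance used in steps S5.2-S5.3 take the following shape, given the weight posterior (mu, cov); the optional sigma2 term adds observation noise to the variance, an assumption the patent's text does not settle:

```python
import numpy as np

def predictive_distribution(y_star, mu, cov, sigma2=0.0):
    """Gaussian predictive distribution of the target load: y_star is the
    length-M vector of the source models' predictions for the new time,
    (mu, cov) is the posterior of the combination weights."""
    mean = float(y_star @ mu)                       # load prediction
    var = float(y_star @ cov @ y_star) + sigma2     # predictive uncertainty
    return mean, var
```
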
Verification:
In order to evaluate deterministic and uncertainty prediction accurately and effectively, the invention uses MAPE to evaluate the deterministic prediction effect and CRPS to evaluate the uncertainty prediction effect, where CRPS is defined as CRPS(F, y) = ∫ (F(x) − 1{x ≥ y})² dx, in which F denotes the cumulative distribution function (CDF) of the predictive distribution and y the observation. CRPS comprehensively evaluates both the reliability of the probability density estimate and the sharpness of the prediction interval.
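For the Gaussian predictive distributions produced by the method, the CRPS integral has a well-known closed form (Gneiting and Raftery's formula), shown here as a convenient evaluation sketch rather than the patent's own computation:

```python
import math

def crps_gaussian(y, mu, sigma):
    """Closed-form CRPS of the Gaussian forecast N(mu, sigma^2) against
    the observation y; lower is better."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal pdf
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal cdf
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))
```
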
Table 1 compares the deterministic prediction performance (the MAPE values) of the method of the present invention and several other methods for different users. The linear regression model LR, Gaussian process GP, long short-term memory network LSTM and Bayesian linear regression BLR are trained only on the limited historical data of the target user; the multi-kernel transfer model M-BTMKR, the conventional weighted-average method WA and the best-performing single transfer model BI serve as transfer prediction baselines. The proposed model is clearly superior to the other models in every scenario, and its average prediction error is only 3.2%, which means the method can be further applied in actual engineering and production. FIG. 4 shows the deterministic load prediction curves of the method of the present invention and several other methods. As can be seen from fig. 4, the predictions of the method of the present invention are significantly closer to the actual values than those of the other methods, indicating higher prediction accuracy.
Table 2 compares the uncertainty prediction performance of the method of the present invention and several other methods, i.e., the CRPS of their probability density predictions, where L-GP and L-BLR denote GP and BLR trained only on limited historical data, and S-GP and S-BLR denote GP and BLR trained on sufficient historical data. The comparison methods are less effective than the method of the present invention in both the limited and the sufficient historical-data scenarios; even though S-GP and S-BLR are trained on sufficient samples, their CRPS is inferior for every user. Fig. 5 shows the prediction interval of the inventive method at 90% confidence: essentially 90% of the samples fall within the interval, which illustrates the high reliability of the method. Based on the prediction intervals obtained from the uncertainty estimates, risk-aware decisions can be made in actual production, and the uncertainty of the prediction is quantified.
Table 1 shows the MAPE [%] comparison of the different models;
Table 2 shows the CRPS comparison of the different models;
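The CRPS used in Table 2 scores an entire predictive distribution against each observation, unlike MAPE, which scores only a point forecast. The patent does not reproduce its CRPS formula, so the following is a minimal sketch using the standard closed form for a Gaussian forecast N(mu, sigma^2); the function name and example values are illustrative assumptions:

```python
import math

def crps_gaussian(mu: float, sigma: float, y: float) -> float:
    """Closed-form CRPS of a Gaussian N(mu, sigma^2) forecast against observation y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)   # standard normal pdf at z
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))            # standard normal cdf at z
    return sigma * (z * (2 * cdf - 1) + 2 * pdf - 1 / math.sqrt(math.pi))

# Even a forecast centred exactly on the observation keeps a CRPS proportional to sigma,
# so sharper (lower-variance) calibrated forecasts score better.
print(round(crps_gaussian(0.0, 1.0, 0.0), 4))  # -> 0.2337
```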
While the foregoing describes illustrative embodiments of the present invention to facilitate understanding by those skilled in the art, it should be understood that the present invention is not limited to the scope of these embodiments; various changes that fall within the spirit and scope of the present invention as defined by the appended claims are to be construed as protected.
Claims (2)
1. A short-term load prediction method based on transfer learning under a small sample set, characterized by comprising the following steps:
(1) Data acquisition and preprocessing;
(1.1) setting a load sampling period T;
(1.2) collecting the historical load data x^load and the corresponding temperatures x^temp of the M source-domain users at the load sampling period T to construct a load set and a temperature set, the load set constructed for the i-th source-domain user being denoted X_i^load = {x_{i,t}^load} and its temperature set X_i^temp = {x_{i,t}^temp}, wherein x_{i,t}^load and x_{i,t}^temp respectively represent the historical load and the corresponding temperature collected for the i-th source-domain user at sampling time t, i = 1, 2, …, M;
(1.3) removing the abnormal values in X_i^load and X_i^temp and filling them by linear interpolation to obtain the data sample X_i, and finally performing normalization on X_i to obtain the normalized data sample X̃_i;
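The outlier removal and linear-interpolation fill of step (1.3) can be sketched as follows; the rule used to flag abnormal values is an illustrative assumption, since the claim does not define it:

```python
def fill_outliers(series, is_outlier):
    """Replace flagged points by linear interpolation between the nearest valid neighbours."""
    x = list(series)
    valid = [i for i in range(len(x)) if not is_outlier(x[i])]
    for i in range(len(x)):
        if is_outlier(x[i]):
            left = max((j for j in valid if j < i), default=None)
            right = min((j for j in valid if j > i), default=None)
            if left is not None and right is not None:
                w = (i - left) / (right - left)      # fractional position between neighbours
                x[i] = x[left] * (1 - w) + x[right] * w
            elif left is not None:                   # no valid point to the right: hold last value
                x[i] = x[left]
            elif right is not None:                  # no valid point to the left: back-fill
                x[i] = x[right]
    return x

loads = [1.0, 2.0, 999.0, 4.0]                       # 999.0 stands in for a metering spike
print(fill_outliers(loads, lambda v: v > 100))       # -> [1.0, 2.0, 3.0, 4.0]
```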
(1.4) adding time-characterization variables to the data sample X̃_i as input features, the variables comprising a time-sequence variable, a day-level variable and a holiday variable in one-hot encoded form, and finally constructing the M input features, denoted {D_1, D_2, …, D_M};
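The normalization of step (1.3) and the one-hot time features of step (1.4) can be sketched as follows; min-max scaling and the specific calendar fields (hour of day, day of week, holiday flag) are assumptions, since the claim only names normalization and time-sequence, day-level and holiday variables in one-hot form:

```python
def minmax(xs):
    """Min-max normalization to [0, 1] (assumed; the claim does not fix the scheme)."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def one_hot(index, size):
    v = [0.0] * size
    v[index] = 1.0
    return v

def time_features(hour, weekday, is_holiday):
    # hour-of-day (24), day-of-week (7), holiday flag (2), concatenated one-hot vectors
    return one_hot(hour, 24) + one_hot(weekday, 7) + one_hot(int(is_holiday), 2)

load = minmax([300.0, 450.0, 600.0])
feat = time_features(hour=8, weekday=0, is_holiday=False)
print(load)        # -> [0.0, 0.5, 1.0]
print(len(feat))   # -> 33
```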
(2) Building a depth residual error network model;
the depth residual network is formed by jumping and connecting L residual blocks, wherein each residual block consists of a convolution layer, a normalization layer and a Relu activation function layer;
(3) Training a depth residual network model based on source domain user data;
(3.1) randomly selecting the data at a batch of sampling moments from the i-th input feature D_i as the training data for one round, then sequentially inputting the data at each moment into the depth residual network model, converting the data into tensor form through the input layer, and feeding it into the serial residual blocks;
(3.2) in the depth residual network model, setting the input tensor of the l-th residual block as Z^(l-1); in the left branch of the l-th residual block, the tensor Z^(l-1) undergoes feature extraction through a convolution kernel formed by several dilated causal convolutions and then passes sequentially through a convolution layer, a normalization layer, a ReLU activation function layer, a convolution layer and a normalization layer to obtain the left-branch output tensor Z_left^(l); in the right branch of the l-th residual block, applying a 1×1 convolution to Z^(l-1) so that its output tensor Z_right^(l) matches the dimension of the left-branch output, then adding the outputs of the two branches to obtain the output Z^(l) = Z_left^(l) + Z_right^(l); the output Z^(l) of the l-th residual block is added to the output of the (l-2)-th residual block to obtain the input (Z^(l) + Z^(l-2)) of the (l+1)-th residual block;
(3.3) repeating step (3.2) until the last residual block outputs Z^(L), and finally feeding Z^(L) into two fully connected layers connected in parallel, whose outputs are recorded and used as the predicted value ŷ_t at time t;
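The dilated causal convolution and branch addition at the core of steps (3.2)-(3.3) can be sketched in simplified one-dimensional form; this is a minimal stand-in, not the claimed network: normalization layers, ReLU activations and channel dimensions are omitted, the left branch is reduced to a single dilated causal convolution, and the 1×1 right-branch convolution degenerates to the identity in one dimension:

```python
def dilated_causal_conv(x, kernel, dilation):
    """y[t] = sum_k kernel[k] * x[t - k*dilation]; left zero padding keeps it causal."""
    y = []
    for t in range(len(x)):
        s = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:        # taps before the start of the series contribute zero
                s += w * x[idx]
        y.append(s)
    return y

def residual_block(x, kernel, dilation):
    left = dilated_causal_conv(x, kernel, dilation)  # stand-in for the conv/norm/ReLU stack
    right = x                                        # stand-in for the 1x1 matching branch
    return [a + b for a, b in zip(left, right)]

x = [1.0, 2.0, 3.0, 4.0]
print(dilated_causal_conv(x, [1.0, -1.0], dilation=1))  # -> [1.0, 1.0, 1.0, 1.0]
print(residual_block(x, [1.0, -1.0], dilation=1))        # -> [2.0, 3.0, 4.0, 5.0]
```

Causality matters here because the load at time t may only be predicted from values at t and earlier; dilation widens the receptive field without extra parameters.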
(3.4) after the training data of the present round have been processed, calculating the loss function value of the round as the mean absolute percentage error MAPE = (100%/N) Σ_t |y_t − ŷ_t| / y_t, wherein y_t is the observed load, ŷ_t the predicted load and N the number of training samples in the round;
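The MAPE loss of step (3.4) can be sketched directly from its standard definition:

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent; requires nonzero actual values."""
    assert len(actual) == len(predicted) and all(a != 0 for a in actual)
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

# 10% error on the first sample, 5% on the second -> mean of 7.5%
print(mape([100.0, 200.0], [110.0, 190.0]))  # -> 7.5
```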
(3.5) setting a loss threshold δ; calculating the difference ΔMAPE between the loss function value after the current round of training and that after the previous round, and comparing ΔMAPE with δ; if ΔMAPE ≤ δ, training is finished and the i-th depth residual network model is obtained; otherwise, updating the weights in the depth residual network using a batch gradient descent algorithm and returning to step (3.1) for the next round of training;
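The stopping rule of step (3.5), which ends training once the round-to-round change in MAPE falls to the threshold δ, can be sketched as follows; the gradient-descent weight update is abstracted into the per-round training callback, and the cap on rounds is an added safeguard, not part of the claim:

```python
def train_until_converged(train_one_round, delta, max_rounds=100):
    """Repeat training rounds until the change in the loss is at most delta."""
    prev = train_one_round()
    for _ in range(max_rounds - 1):
        cur = train_one_round()        # one round: forward pass, loss, weight update
        if abs(prev - cur) <= delta:   # Delta-MAPE <= delta: stop
            return cur
        prev = cur
    return prev

# Toy stand-in: each "round" just yields the next loss value from a shrinking sequence
losses = iter([8.0, 4.0, 2.0, 1.0, 0.5, 0.25, 0.125])
final = train_until_converged(lambda: next(losses), delta=0.3)
print(final)  # -> 0.25  (first round whose improvement over the previous is <= 0.3)
```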
(3.6) training the depth residual network on the M input features according to steps (3.1)-(3.5) to finally obtain M depth residual network models, denoted {F_1, F_2, …, F_i, …, F_M};
(4) Performing adaptive combination of the depth residual network models using a Bayesian weighted probability averaging method;
(4.1) setting the acquisition period T_1 of the target user;
(4.2) sampling the small-sample historical load data of the target user at the sampling period T_1 to obtain a load data set X_*^load = {x_{*,t}^load} and a temperature data set X_*^temp = {x_{*,t}^temp}, wherein x_{*,t}^load and x_{*,t}^temp respectively represent the historical load and the corresponding temperature collected for the target user at sampling time t, and then constructing the input features according to steps (1.3)-(1.4), denoted D_*;
(4.3) constructing input feature pairs (d_{*,t}, y_{*,t}), wherein d_{*,t} represents the input feature at time t and y_{*,t} represents the load observation at time t;
(4.4) inputting the feature d_{*,t} into the M depth residual network models respectively to obtain the predicted outputs {ŷ_{*,t}^1, ŷ_{*,t}^2, …, ŷ_{*,t}^M}, wherein ŷ_{*,t}^i represents the predicted value of the i-th depth residual network model at time t;
wherein N represents a Gaussian distribution, σ_n^2 represents the Gaussian noise variance, ω = {ω_1, ω_2, …, ω_M} represents the weights assigned to the M model outputs, and I represents an identity matrix;
(4.7) assuming that the weights ω obey a Gaussian prior with mean 0 and covariance Σ_p:
ω ~ N(0, Σ_p)
calculating the posterior probability distribution of the weights ω according to Bayesian inference theory:
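The likelihood and posterior equations of steps (4.5)-(4.7) are rendered as images in the source, so the following sketch assumes the standard Bayesian linear regression posterior for y = Φω + noise with noise N(0, σ_n²I) and prior ω ~ N(0, Σ_p), which matches the quantities the claim names; the function and variable names are illustrative:

```python
import numpy as np

def weight_posterior(Phi, y, sigma_n, Sigma_p):
    """Posterior N(mu_N, Sigma_N) over weights w for y = Phi @ w + noise,
    noise ~ N(0, sigma_n^2 I), prior w ~ N(0, Sigma_p)."""
    A = np.linalg.inv(Sigma_p) + Phi.T @ Phi / sigma_n**2   # posterior precision
    Sigma_N = np.linalg.inv(A)
    mu_N = Sigma_N @ Phi.T @ y / sigma_n**2
    return mu_N, Sigma_N

# Columns of Phi play the role of the M source-model predictions at each time step;
# here the first "model" reproduces y exactly, so its weight should dominate.
Phi = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 0.0, 1.0])
mu_N, Sigma_N = weight_posterior(Phi, y, sigma_n=0.1, Sigma_p=np.eye(2))
```

With small observation noise the posterior mean approaches [1, 0], i.e. the combination learns to trust the source model whose predictions match the target user's few observations.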
(5) Predicting the real-time load;
(5.1) acquiring the historical load data and temperature data of the target user in real time, and constructing the input features D_* according to steps (1.1)-(1.4);
(5.2) according to the input features D_*, calculating the probability distribution of the target user's real-time load prediction:
wherein f_* is the probability distribution function;
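Under the same Gaussian assumptions as above, the predictive distribution f_* of step (5.2) at a new time step follows from the weight posterior: the mean is the weighted combination of the M model outputs and the variance adds the weight uncertainty to the noise floor. The posterior mean and covariance below are assumed example values, not values derived in the patent:

```python
import numpy as np

def predictive_gaussian(phi_star, mu_N, Sigma_N, sigma_n):
    """Gaussian predictive mean/variance at a new point, given weight posterior N(mu_N, Sigma_N)."""
    mean = float(phi_star @ mu_N)                            # combined point forecast
    var = float(phi_star @ Sigma_N @ phi_star) + sigma_n**2  # weight uncertainty + noise
    return mean, var

phi_star = np.array([1.0, 2.0])        # the M source-model outputs at the new time step
mu_N = np.array([0.6, 0.2])            # assumed posterior mean of the weights
Sigma_N = np.diag([0.01, 0.01])        # assumed posterior covariance
mean, var = predictive_gaussian(phi_star, mu_N, Sigma_N, sigma_n=0.1)
print(round(mean, 2), round(var, 3))   # -> 1.0 0.06
```

A 90% prediction interval as in FIG. 5 would then be mean ± 1.645·sqrt(var).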
2. The short-term load prediction method based on transfer learning under a small sample set according to claim 1, wherein the residual block is expressed as:
x_{l+1} = x_l + F(x_l, W_l)
wherein W_l represents the parameters of the l-th residual block; x_l represents the input of the l-th residual block; x_{l+1} represents the output of the l-th residual block and also serves as the input of the (l+1)-th residual block; l = 1, 2, …, L, with L representing the number of residual blocks.
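The residual recurrence of claim 2 (assuming the standard form x_{l+1} = x_l + F(x_l, W_l)) can be sketched numerically; the scalar blocks below are illustrative stand-ins for the convolutional mapping F:

```python
def forward(x0, blocks):
    """Stack residual blocks: x_{l+1} = x_l + F_l(x_l)."""
    x = x0
    for F in blocks:
        x = x + F(x)   # identity path plus learned residual
    return x

# Toy scalar blocks: each F adds 10% of its input, so three blocks scale by 1.1^3
blocks = [lambda x: 0.1 * x] * 3
print(round(forward(1.0, blocks), 3))  # -> 1.331
```

The identity path is what lets gradients flow through many blocks, which is why the claim sums block outputs rather than composing plain convolutions.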
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111442332.1A CN114169416B (en) | 2021-11-30 | 2021-11-30 | Short-term load prediction method based on migration learning under small sample set |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114169416A CN114169416A (en) | 2022-03-11 |
CN114169416B true CN114169416B (en) | 2023-04-21 |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE102023000954A1 (en) | 2023-03-13 | 2024-09-19 | Mercedes-Benz Group AG | Method for predicting the power requirements of electrical components of a vehicle arranged in an on-board network |
CN117081082B (en) * | 2023-10-17 | 2024-01-23 | 国网上海市电力公司 | Active power distribution network operation situation sensing method and system based on Gaussian process regression |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103903067A (en) * | 2014-04-09 | 2014-07-02 | 上海电机学院 | Short-term combination forecasting method for wind power |
CN109711620A (en) * | 2018-12-26 | 2019-05-03 | 浙江大学 | A kind of Short-Term Load Forecasting Method based on GRU neural network and transfer learning |
WO2019141040A1 (en) * | 2018-01-22 | 2019-07-25 | 佛山科学技术学院 | Short term electrical load predication method |
CN110969293A (en) * | 2019-11-22 | 2020-04-07 | 上海交通大学 | Short-term generalized load prediction method based on transfer learning |
WO2021042935A1 (en) * | 2019-09-05 | 2021-03-11 | 苏州大学 | Bearing service life prediction method based on hidden markov model and transfer learning |
CN113032916A (en) * | 2021-03-03 | 2021-06-25 | 安徽大学 | Electromechanical device bearing fault prediction method based on Bayesian network of transfer learning |
CN113111578A (en) * | 2021-04-01 | 2021-07-13 | 上海晨翘智能科技有限公司 | Power load prediction method, power load prediction device, computer equipment and storage medium |
CN113627659A (en) * | 2021-07-29 | 2021-11-09 | 南京亚派软件技术有限公司 | Garden demand side short-term load prediction system and method based on depth residual error network |
Non-Patent Citations (3)
Title |
---|
Tehreem Ashfaq et al. Short-term Electricity Load and Price Forecasting using Enhanced KNN. 2019 International Conference on Frontiers of Information Technology (FIT), 2020, 266-271.
Su Juan et al. Short-term load forecasting method based on modal combination. Transactions of the Chinese Society of Agricultural Engineering, 2021, Vol. 37, No. 14, 186-196.
Zhao Pengfei. Research on short-term power load forecasting based on deep residual networks. China Master's Theses Full-text Database, Engineering Science and Technology II, 2023, No. 01, C042-1916.
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||