CN113743668B - Household electricity-oriented short-term load prediction method - Google Patents
- Publication number
- CN113743668B (application CN202111045733.3A)
- Authority
- CN
- China
- Prior art keywords
- data
- attention
- layer
- lstm
- term
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J3/00—Circuit arrangements for ac mains or ac distribution networks
- H02J3/003—Load forecast, e.g. methods or systems for forecasting future load demand
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
The invention discloses a short-term load prediction method for household electricity. A residual mechanism is introduced into an LSTM network to construct a residual LSTM module, and a Scaled Dot-Product Attention mechanism is introduced into the decoding process to construct an Encoder-Decoder model. A fuzzy clustering algorithm is used to extract similar-day data, and the data are normalized, which addresses the high similarity between samples and the non-uniform dimensions of the input variables. To counter the loss of information in long input sequences and of the correlations between sequences, the data are fed into an Encoder-Decoder model combined with Scaled Dot-Product Attention, so that the elements of the intermediate code on which the Decoder relies at each moment receive different weights in the output, highlighting the influence of the key factors.
Description
Technical Field
The invention belongs to the technical field of load prediction, and particularly relates to a household electricity-oriented short-term load prediction method.
Background
Electric load prediction may be classified into short-term load prediction and medium- and long-term load prediction. Short-term load prediction refers to daily and weekly load prediction and is mainly used for arranging short-term grid operation modes, static security analysis, scheduled maintenance, and the like.
In the field of short-term load prediction, existing approaches include artificial neural networks (Artificial Neural Networks, ANN), wavelet transforms (Wavelet Transform), fuzzy logic (FL), and combined prediction methods. Load prediction has gradually shifted toward sequence prediction, and the models have evolved from a single LSTM model to Sequence-to-Sequence models with an attention mechanism. An LSTM network encodes all input features into a fixed-length vector representation, which ignores how strongly each input feature correlates with the load to be predicted, so the historical data cannot be used with the appropriate emphasis. Adding an attention mechanism to the prediction model assigns more attention to important data by computing attention weights for the different input features, thereby improving load prediction accuracy. However, neural network algorithms learn and converge slowly, tend to fall into local minima, and sometimes fail to converge; for wavelet transform methods, choosing the decomposition scale and wavelet basis is troublesome; the mapping of fuzzy logic algorithms is not fine enough and their learning ability is weak; and combined models are difficult to tune, since the weight of each constituent algorithm is hard to determine.
In short-term power load prediction, observation of the load curve shows that when the load is at an "inflection point" (for example, around 7 a.m. on a workday, when the actual load rises sharply), the accuracy of conventional prediction methods is low, sometimes only about 90%. The load curve changes little over the same time period on similar days, and the load in a given time period tends to follow the change pattern of the most recent days of the same type, which conventional methods find difficult to predict accurately.
In the conventional Encoder-Decoder structure, the Encoder encodes the entire input sequence, regardless of its length, into a fixed-length semantic feature c for decoding, which causes two problems:
(1) information is lost for long input sequences;
(2) the original structural information of the sequence is lost and the interrelation between sequence data is ignored; both losses reduce accuracy.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a short-term load prediction method for household electricity. The method is based on an LSTM network combined with a Scaled Dot-Product Attention mechanism and introduces the Attention mechanism into an Encoder-Decoder model, which effectively highlights the factors influencing the load and addresses the time-series and nonlinear-regression characteristics of power-system load data, thereby improving the prediction effect.
In order to solve the technical problems, the invention adopts the following technical scheme:
a short-term load prediction method for household electricity introduces a residual mechanism into an LSTM network to construct a residual LSTM module, and introduces a Scaled Dot-Product Attention mechanism into the decoding process to construct an Encoder-Decoder model; the specific method comprises the following steps:
step 1: acquiring historical load data;
step 2: data preprocessing: extracting similar day data through an FCM fuzzy clustering algorithm, and carrying out normalization processing on the data;
step 3: input to the residual LSTM module: the input and the output of the LSTM are added together and the sum is then activated through a Dropout function;
step 4: Attention-layer dot-product operation: data are input into an Encoder-Decoder model combined with a Scaled Dot-Product Attention mechanism, so that the elements of the intermediate code on which the Decoder relies at each moment receive different weights; the Attention is added to the intermediate result of the Decoder;
step 5: FFNN+Softmax optimization: after the Attention layer, an FFNN+Softmax layer is introduced for optimization; the result calculated by the Attention layer is passed to the FFNN+Softmax layer, and the FFNN is trained with the Levenberg-Marquardt algorithm.
Further, in step 2, the specific implementation steps are as follows:
(1) Similar-day data extraction: in FCM, a data point is not restricted to one particular cluster but can belong to several clusters with different membership degrees. Given a data set X = {x_1, x_2, x_3, …, x_n} containing n elements that is to be decomposed into c fuzzy clusters, the objective function to be minimized is:

J_m = \sum_{i=1}^{n}\sum_{j=1}^{c} u_{ij}^{m}\,\lVert x_i - c_j\rVert^{2}

wherein m is any real number greater than 1; u_{ij} is the membership of x_i to the j-th cluster; x_i is the i-th element of the set X, with dimension d; c_j is the center of the j-th cluster; and ‖·‖ denotes the distance of a data point from a cluster center;

FCM optimizes the membership u_{ij} and the cluster center c_j by iterating on the objective function; the iterative expressions are:

u_{ij} = \frac{1}{\sum_{k=1}^{c}\left(\dfrac{\lVert x_i - c_j\rVert}{\lVert x_i - c_k\rVert}\right)^{\frac{2}{m-1}}},\qquad c_j = \frac{\sum_{i=1}^{n} u_{ij}^{m}\,x_i}{\sum_{i=1}^{n} u_{ij}^{m}}

let ε be the threshold of the iterative process; when \max_{ij}\lvert u_{ij}^{(k+1)} - u_{ij}^{(k)}\rvert < \varepsilon is satisfied, the iteration ends and the process is considered to have converged to a local minimum of J;
(2) Input data normalization
Because different dimensions of the input data have different units and scales, which affects model training and prediction, the data are normalized and mapped to the [-1, 1] interval by the formula:

x^{*} = \frac{2\,(x - x_{\min})}{x_{\max} - x_{\min}} - 1

wherein x_max and x_min are the maximum and minimum values of the variable, respectively.
Further, the input data X and the LSTM output data F(X) are added together to obtain F(X)+X, which is then activated.
Further, in the Encoder-Decoder model of step 4, there are two LSTM layers in the Decoder; the Attention is added between the two LSTM hidden layers, and the Attention is fed to the second LSTM layer together with the output of the first LSTM layer.
Further, the equation of the Attention mechanism is as follows:

\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V

Q denotes the query term, K denotes the term on which attention is calculated, V denotes the value of that term, and \sqrt{d_k} serves as a scaling (normalization) factor, where d_k is the dimension of K; the similarity between the query Q and K determines the weight distribution over the values V, and dividing by the scaling factor reduces the impact of the large dot-product magnitudes that arise when d_k is large;

the load sequence data have dimension d, Q has size n×d, and K and V have the same size; for the i-th load sequence v_i, the Attention of this sequence is:

\mathrm{Attention}(v_i, K, V) = \mathrm{softmax}\!\left(\frac{v_i K^{T}}{\sqrt{d}}\right)V

computing the Attention of all sequences simultaneously gives:

\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d}}\right)V
further, after the FFNN network in step 5, a residual connection and a peer layer are added to fix the mean and variance of the inputs of neurons in one layer, so as to reduce the influence of the change of the output of the peer layer on the input of the next layer.
Compared with the prior art, the invention has the advantages that:
(1) The invention extracts similar-day data with a fuzzy clustering algorithm (FCM), thereby addressing the problem that the load curve changes little over the same time period (i.e., the data are highly similar) and markedly improving the precision of the prediction result. Because the different dimensions of the input data have different scales, which would affect model training and prediction, normalization is used to map the data to the [-1, 1] interval, solving the problem of non-uniform dimensions and improving the accuracy of short-term load prediction.
(2) The invention introduces an Attention mechanism into the Encoder-Decoder model; the Attention assigns different weights to the input features of the Encoder-Decoder model, highlights the more important influencing factors, and helps the model make more accurate judgments without increasing its computation or storage cost. Introducing Scaled Dot-Product Attention into the Encoder-Decoder model highlights the influence of the key factors, captures global dependencies in one step, alleviates the long-range dependence problem, and improves the prediction effect.
(3) After the Attention layer, an FFNN+Softmax layer is introduced for optimization, so that the predicted result is more accurate.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of the method of the present invention;
fig. 2 is a schematic diagram of an LSTM residual module according to the present invention.
Detailed Description
The invention will be further described with reference to the accompanying drawings and specific examples.
With reference to the LSTM-based short-term load prediction flow combined with Scaled Dot-Product Attention shown in FIG. 1, the invention provides a short-term load prediction method for household electricity, which introduces a residual mechanism into an LSTM network to construct a residual LSTM module and introduces a Scaled Dot-Product Attention mechanism into the decoding process to construct an Encoder-Decoder model; the specific method comprises the following steps:
step 1: acquiring historical load data:
the present example selects a dataset from the entsu-E platform containing a sequence of actual and predicted loads per hour from 2015, 1 to 2017, 5 in switzerland, a sequence of hours of temperature (in °f) and a map of qualitative weather in one of 3 categories defined in the profile.
Step 2: data preprocessing:
and extracting similar day data through an FCM fuzzy clustering algorithm, and carrying out normalization processing on the data.
(1) Similar-day data extraction: in FCM, a data point is not restricted to one particular cluster but can belong to several clusters with different membership degrees. Thus, given a data set X = {x_1, x_2, x_3, …, x_n} containing n elements that is to be decomposed into c fuzzy clusters, FCM minimizes the objective function:

J_m = \sum_{i=1}^{n}\sum_{j=1}^{c} u_{ij}^{m}\,\lVert x_i - c_j\rVert^{2}

wherein m is any real number greater than 1; u_{ij} is the membership of x_i to the j-th cluster; x_i is the i-th element of the set X, with dimension d; c_j is the center of the j-th cluster; and ‖·‖ denotes the distance of a data point from a cluster center.

FCM optimizes the membership u_{ij} and the cluster center c_j by iterating on the objective function; the iterative expressions are:

u_{ij} = \frac{1}{\sum_{k=1}^{c}\left(\dfrac{\lVert x_i - c_j\rVert}{\lVert x_i - c_k\rVert}\right)^{\frac{2}{m-1}}},\qquad c_j = \frac{\sum_{i=1}^{n} u_{ij}^{m}\,x_i}{\sum_{i=1}^{n} u_{ij}^{m}}

Let ε be the threshold of the iterative process; when \max_{ij}\lvert u_{ij}^{(k+1)} - u_{ij}^{(k)}\rvert < \varepsilon is satisfied, the iteration ends and the process is considered to have converged to a local minimum of J.
In this embodiment, the threshold is set to epsilon=0.5, and the category is set to c=5.
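For illustration, the FCM iteration described above can be sketched in a few lines of NumPy. This is a minimal sketch rather than the patented implementation; the fuzziness exponent m, the iteration cap, and the random initialization are assumptions made for the example, while ε = 0.5 and c = 5 follow the embodiment.

```python
import numpy as np

def fcm(X, c=5, m=2.0, eps=0.5, max_iter=100, seed=0):
    """Minimal Fuzzy C-Means sketch: alternately update cluster centers c_j and
    memberships u_ij until the largest membership change falls below eps."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)                  # memberships of each sample sum to 1
    for _ in range(max_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]                       # c_j update
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)   # ||x_i - c_j||
        dist = np.fmax(dist, 1e-10)                    # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)   # u_ij update
        if np.abs(U_new - U).max() < eps:              # convergence test against threshold eps
            U = U_new
            break
        U = U_new
    return U, centers

# Usage sketch: cluster daily feature vectors; U[i, j] is the membership of day i in cluster j,
# from which the days most similar to the target day can be selected.
```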
(2) Input data normalization
Because different dimensions of the input data have different units and scales, which affects model training and prediction, the data are normalized and mapped to the [-1, 1] interval. The formula is as follows:

x^{*} = \frac{2\,(x - x_{\min})}{x_{\max} - x_{\min}} - 1

wherein x_max and x_min are the maximum and minimum values of the variable, respectively.
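As a small sketch, the normalization can be applied column-wise to a NumPy array of input variables; the column-wise layout is an assumption made for the example.

```python
import numpy as np

def normalize_to_pm1(x):
    # Min-max scaling of each variable (column) to the [-1, 1] interval.
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(axis=0), x.max(axis=0)
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0
```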
Step 3: input to the residual LSTM module: the input and output are added together and then activated by the Dropout function.
For an LSTM network, an identity mapping is not easy to fit, so the basic idea of the residual network is introduced in the invention to solve this problem. The residual unit is implemented as a skip connection, i.e., the input data X and the LSTM output data F(X) are added together to obtain Y = F(X)+X, which is then activated. The Dropout function is used as the activation function in the invention. An LSTM incorporating the residual mechanism can easily be implemented with mainstream automatic-differentiation deep learning frameworks, updating the parameters directly with the BP algorithm, as shown in FIG. 2.
Experiments show that the residual error mechanism well solves the degradation problem of the LSTM network.
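A minimal PyTorch sketch of such a residual LSTM block is given below for illustration. The layer sizes and the assumption that the hidden size equals the input feature size (so that F(X) and X can be added directly) are choices made for the example, not details fixed by the invention.

```python
import torch.nn as nn

class ResidualLSTMBlock(nn.Module):
    """Residual LSTM block: y = Dropout(LSTM(x) + x)."""
    def __init__(self, n_features, dropout=0.2):
        super().__init__()
        # hidden size equals n_features so the skip connection can be added element-wise
        self.lstm = nn.LSTM(n_features, n_features, batch_first=True)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):                 # x: (batch, seq_len, n_features)
        out, _ = self.lstm(x)             # F(x)
        return self.dropout(out + x)      # F(x) + x, then Dropout as in the text
```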
Step 4: attention layer dot product operation:
The data are input into the Encoder-Decoder model combined with the Scaled Dot-Product Attention mechanism, so that the elements of the intermediate code on which the Decoder relies at each moment receive different weights in the output. The invention highlights the influence of the key factors by introducing Scaled Dot-Product Attention into the Encoder-Decoder model.
The Attention is added to the intermediate result of the Decoder; the Decoder contains two LSTM layers, the Attention is added between the two LSTM hidden layers, and the Attention is fed to the second LSTM layer together with the output of the first LSTM layer.
The equation for the Attention mechanism is as follows:

\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V

Q denotes the query term, K denotes the term on which attention is calculated, V denotes the value of that term, and \sqrt{d_k} serves as a scaling (normalization) factor, where d_k is the dimension of K (64 by default).

The similarity between the query Q and K determines the weight distribution over the values V; dividing by the scaling factor reduces the impact of the large dot-product magnitudes that arise when d_k is large.
The scaling factor is used because, when d_k is large, the dot products become large in magnitude, pushing the softmax into a region where its gradient is very small; small gradients hinder back-propagation. Dividing by the scaling factor alleviates this negative effect to some extent.
In contrast to the traditional model, no hidden-layer state is used here in the encoding process; instead, the sequence data obtained after preprocessing take its place in the attention operation. The matrix Q is therefore composed of load-data sequences, as are K and V. Using one load-data sequence to "query" its degree of match with every load-data sequence, i.e., the magnitude of attention, requires n such rounds in total.
The load sequence data have dimension d, Q has size n×d, and K and V have the same size; for the i-th load sequence v_i, the Attention of this sequence is:

\mathrm{Attention}(v_i, K, V) = \mathrm{softmax}\!\left(\frac{v_i K^{T}}{\sqrt{d}}\right)V

If the Attention of all sequences is computed simultaneously, then:

\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d}}\right)V
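A short sketch of the Scaled Dot-Product Attention computation follows, with Q, K and V all built from the same preprocessed load-sequence matrix as described above; the shapes in the usage example are illustrative assumptions.

```python
import math
import torch

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = K.size(-1)
    scores = Q @ K.transpose(-2, -1) / math.sqrt(d_k)   # query/key similarity, scaled
    weights = torch.softmax(scores, dim=-1)              # attention weight distribution over V
    return weights @ V

# Usage sketch: n = 8 load sequences of dimension d = 64, with Q = K = V.
load_seqs = torch.randn(8, 64)
attended = scaled_dot_product_attention(load_seqs, load_seqs, load_seqs)  # shape (8, 64)
```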
Step 5: FFNN+Softmax optimization: after the Attention layer, an FFNN+Softmax layer is introduced for optimization; the result calculated by the Attention layer is passed to the FFNN+Softmax layer, further improving the accuracy of the load prediction, and the FFNN network is trained with the Levenberg-Marquardt algorithm.
The Attention after adding FFNN may be formalized as:
e t =a(h n )
wherein a(h_n) can be regarded as a feed-forward network to be trained; the detailed formula follows the prior art and is not repeated here.
For better optimization of the deep network, a residual connection and layer normalization (Add & Norm) are added after the FFNN network.
Note that a change in one layer's output produces a highly correlated change in the next layer's input, especially when the output changes significantly. Fixing the mean and variance of the inputs to a layer's neurons reduces the effect of this covariate shift; layer normalization is therefore introduced to reduce the influence of changes in this layer's output on the input of the next layer.
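A possible PyTorch sketch of the FFNN sub-layer with the residual connection and layer normalization described above is given below; the hidden width, dropout rate and ReLU activation are assumptions for the example, and the sketch does not reproduce the Levenberg-Marquardt training of the FFNN.

```python
import torch.nn as nn

class FFNNAddNorm(nn.Module):
    """Feed-forward sub-layer followed by Add & Norm: LayerNorm(x + FFNN(x))."""
    def __init__(self, d_model, d_hidden=64, dropout=0.1):
        super().__init__()
        self.ffnn = nn.Sequential(
            nn.Linear(d_model, d_hidden),
            nn.ReLU(),
            nn.Linear(d_hidden, d_model),
        )
        self.norm = nn.LayerNorm(d_model)   # fixes the mean/variance of the inputs passed to the next layer
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        return self.norm(x + self.dropout(self.ffnn(x)))  # residual connection, then layer normalization
```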
In summary, the invention provides an LSTM-based short-term load prediction method for the household electricity environment combined with Scaled Dot-Product Attention. It extracts similar-day data with a fuzzy clustering algorithm (FCM), normalizes the data, and, to address the loss of information in long input sequences and of the correlations between sequences, inputs the data into an Encoder-Decoder model combined with Scaled Dot-Product Attention, so that the elements of the intermediate code on which the Decoder relies at each moment receive different weights in the output. The results show that the main advantages of the invention are:
(1) The method extracts the data of similar days and normalizes the data, solves the problems of large similarity and non-uniform dimension between the data, and improves the accuracy of short-term load prediction.
(2) The correlation between the data is handled: introducing Scaled Dot-Product Attention into the Encoder-Decoder model highlights the influence of the key factors, captures global dependencies in one step, and alleviates the long-range dependence problem.
(3) After the Attention layer, an FFNN+Softmax layer is introduced for optimization, so that the predicted result is more accurate.
It should be understood that the above description is not intended to limit the invention to the particular embodiments disclosed, and that various changes, modifications, additions and substitutions can be made by those skilled in the art without departing from the spirit and scope of the invention.
Claims (2)
1. A short-term load prediction method for household electricity, characterized in that a residual mechanism is introduced into an LSTM network to construct a residual LSTM module, and a Scaled Dot-Product Attention mechanism is introduced into the decoding process to construct an Encoder-Decoder model, the method comprising the following steps:
step 1: acquiring historical load data;
step 2: data preprocessing: extracting similar day data through an FCM fuzzy clustering algorithm, and carrying out normalization processing on the data; the specific implementation steps are as follows:
(1) Similar-day data extraction: given a data set X = {x_1, x_2, x_3, …, x_n} containing n elements that is to be decomposed into c fuzzy clusters, the objective function to be minimized is:

J_m = \sum_{i=1}^{n}\sum_{j=1}^{c} u_{ij}^{m}\,\lVert x_i - c_j\rVert^{2}

wherein m is any real number greater than 1; u_{ij} is the membership of x_i to the j-th cluster; x_i is the i-th element of the set X, with dimension d; c_j is the center of the j-th cluster; and ‖·‖ denotes the distance of a data point from a cluster center;

FCM optimizes the membership u_{ij} and the cluster center c_j by iterating on the objective function; the iterative expressions are:

u_{ij} = \frac{1}{\sum_{k=1}^{c}\left(\dfrac{\lVert x_i - c_j\rVert}{\lVert x_i - c_k\rVert}\right)^{\frac{2}{m-1}}},\qquad c_j = \frac{\sum_{i=1}^{n} u_{ij}^{m}\,x_i}{\sum_{i=1}^{n} u_{ij}^{m}}

let ε be the threshold of the iterative process; when \max_{ij}\lvert u_{ij}^{(k+1)} - u_{ij}^{(k)}\rvert < \varepsilon is satisfied, the iteration ends and the process is considered to have converged to a local minimum of J;
(2) Input data normalization
Normalization is applied to the data, mapping it to the [-1, 1] interval by the formula:

x^{*} = \frac{2\,(x - x_{\min})}{x_{\max} - x_{\min}} - 1

wherein x_max and x_min are the maximum and minimum values of the variable, respectively;
step 3: input to the residual LSTM module: the input data X and the LSTM output data F(X) are added together to obtain F(X)+X, which is then activated through a Dropout function;
step 4: Attention-layer dot-product operation: data are input into an Encoder-Decoder model combined with a Scaled Dot-Product Attention mechanism, so that the elements of the intermediate code on which the Decoder relies at each moment receive different weights; the Attention is added to the intermediate result of the Decoder; in the Encoder-Decoder model of step 4, there are two LSTM layers in the Decoder, the Attention is added between the two LSTM hidden layers, and the Attention is fed to the second LSTM layer together with the output of the first LSTM layer;
the equation of the Attention mechanism is as follows:

\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V

Q denotes the query term, K denotes the term on which attention is calculated, V denotes the value of that term, and \sqrt{d_k} serves as a scaling (normalization) factor, where d_k is the dimension of K; the similarity between the query Q and K determines the weight distribution over the values V, and dividing by the scaling factor reduces the impact of the large dot-product magnitudes that arise when d_k is large;

the load sequence data have dimension d, Q has size n×d, K and V have size n×d, and for the i-th load sequence v_i the Attention of this sequence is:

\mathrm{Attention}(v_i, K, V) = \mathrm{softmax}\!\left(\frac{v_i K^{T}}{\sqrt{d}}\right)V

computing the Attention of all sequences simultaneously gives:

\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d}}\right)V
step 5: FFNN+Softmax optimization: after the Attention layer, an FFNN+Softmax layer is introduced for optimization; the result calculated by the Attention layer is passed to the FFNN+Softmax layer, and the FFNN is trained with the Levenberg-Marquardt algorithm.
2. The household-electricity-oriented short-term load prediction method according to claim 1, wherein after the FFNN network of step 5, a residual connection and a layer normalization layer are added to fix the mean and variance of the inputs to the neurons of a layer, reducing the influence of changes in that layer's output on the input of the next layer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111045733.3A CN113743668B (en) | 2021-09-07 | 2021-09-07 | Household electricity-oriented short-term load prediction method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111045733.3A CN113743668B (en) | 2021-09-07 | 2021-09-07 | Household electricity-oriented short-term load prediction method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113743668A CN113743668A (en) | 2021-12-03 |
CN113743668B true CN113743668B (en) | 2024-04-05 |
Family
ID=78736806
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111045733.3A Active CN113743668B (en) | 2021-09-07 | 2021-09-07 | Household electricity-oriented short-term load prediction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113743668B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111931989A (en) * | 2020-07-10 | 2020-11-13 | 国网浙江省电力有限公司绍兴供电公司 | Power system short-term load prediction method based on deep learning neural network |
AU2020104000A4 (en) * | 2020-12-10 | 2021-02-18 | Guangxi University | Short-term Load Forecasting Method Based on TCN and IPSO-LSSVM Combined Model |
CN113052469A (en) * | 2021-03-30 | 2021-06-29 | 贵州电网有限责任公司 | Method for calculating wind-solar-water-load complementary characteristic of small hydropower area lacking measurement runoff |
US11070056B1 (en) * | 2020-03-13 | 2021-07-20 | Dalian University Of Technology | Short-term interval prediction method for photovoltaic power output |
-
2021
- 2021-09-07 CN CN202111045733.3A patent/CN113743668B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN113743668A (en) | 2021-12-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||