CN115238952A - Bi-LSTM-Attention short-term power load prediction method - Google Patents
- Publication number
- CN115238952A (application CN202210675542.3A)
- Authority
- CN
- China
- Prior art keywords
- lstm
- model
- attention
- data
- prediction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J3/00—Circuit arrangements for ac mains or ac distribution networks
- H02J3/003—Load forecast, e.g. methods or systems for forecasting future load demand
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J2203/00—Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
- H02J2203/20—Simulating, e g planning, reliability check, modelling or computer assisted design [CAD]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
The invention discloses a short-term power load prediction method based on a Bi-LSTM-Attention model. Time-series historical load data and weather information data are taken as input, and a Bi-LSTM neural network model performs bidirectional recurrent training to learn the forward and reverse patterns of the load data. An Attention mechanism is introduced on top of this model, assigning weights to the input features to highlight how important each feature is to the prediction model. Meanwhile, for the Bi-LSTM-Attention model, the model hyper-parameters are selected through an improved whale optimization algorithm, further improving the performance of the prediction model, and an adaptive weight method improves the local optimization capability of the algorithm. Compared with other models, the method achieves higher prediction accuracy.
Description
Technical Field
The invention belongs to the technical field of energy conservation, relates to application of a computer in an energy conservation technology, and particularly relates to a short-term power load prediction method based on a Bi-LSTM-Attention model.
Background
With the development of various energy-saving technologies, accurate load prediction plays an increasingly important role in energy conservation management, and load prediction techniques have attracted growing attention in recent years. Load forecasts are typically divided into long-term load forecasts (LTLF) for horizons over one year, medium-term load forecasts (MTLF) for several weeks to one year, short-term load forecasts (STLF) for one day to one week, and very short-term load forecasts (VSTLF) for minutes to hours [1]. LTLF and MTLF can estimate the trend of the load and are suited to long-term system planning in the design stage. STLF and VSTLF can generate load requirements accurate enough for control and scheduling and are better suited for short-term control of existing systems.
Load prediction is a time-series prediction problem studied early in both statistics and computer science; the methods have evolved from traditional statistical approaches to today's artificial-intelligence-based models and hybrid models.
The most common models in time-series prediction are autoregressive models, moving-average models, autoregressive integrated moving-average (ARIMA) models, and seasonal ARIMA models. These models focus on univariate data with linear relationships and time dependencies, which makes them less effective for time series with nonlinear characteristics. The load is a time series with nonlinear characteristics, and load prediction is influenced by various random factors, including weather conditions, time information and resident behavior [2-4].
In recent years, with the rapid development of deep learning, prediction models based mainly on the Recurrent Neural Network (RNN) have attracted much attention for processing time-series data. Notably, the long short-term memory (LSTM) network [5] was proposed to advance the development of RNNs; by adding gating units it effectively relieves the gradient explosion and gradient vanishing problems of RNNs [6]. LSTM can identify structure and patterns of data in time-series prediction, such as nonlinearity and complexity, and can therefore predict complex, strongly nonlinear time series. Reference [7] uses LSTM for energy consumption prediction and achieves higher prediction accuracy than a BP neural network. Marino et al. [8] applied the LSTM method to the same load prediction problem and showed results similar to those of [9]. While LSTM has many advantages in processing complex nonlinear data, it also has limitations: LSTM is more complex and harder to train, and in some cases does not perform as well as the simple ARIMA model [10]. To improve performance, more and more researchers combine LSTM with traditional methods or other machine learning methods. For example, Cai et al. [11] used two deep learning models (RNN and CNN) together with the ARIMA method for multi-step load prediction and compared them; the results show that the deep-learning-based models improve prediction accuracy by 22.6% over the ARIMA model.
Different types of sequence data tend to have different characteristics, which strongly affect the choice of prediction model, the settings of model parameters, and the accuracy of the results. In conventional studies, the data for load prediction are generally weather information, time information and historical loads [12-15]. The bidirectional LSTM (Bi-LSTM), established in recent years, combines a forward LSTM and a backward LSTM and can fit data from both directions of the sequence to achieve higher prediction accuracy [16]. The attention mechanism retains important information among different input features during model training through weight distribution, improves the feature extraction capability, and can effectively improve the accuracy of daily power load prediction [17].
The following are the relevant references retrieved by the applicant and used in the present invention.
[1] Singh P, Dwivedi P. Integration of new evolutionary approach with artificial neural network for solving short term load forecast problem[J]. Applied Energy, 2018, 217: 537-549.
[2] Khatoon S, Singh A K. Effects of various factors on electric load forecasting: An overview[C]//Proc of the 6th IEEE Power India International Conference (PIICON). Piscataway, NJ: IEEE Press, 2014: 1-5.
[3] Walter T, Price P N, Sohn M D. Uncertainty estimation improves energy measurement and verification procedures[J]. Applied Energy, 2014, 130: 230-236.
[4] Yan D, O'Brien W, Hong T, et al. Occupant behavior modeling for building performance simulation: Current state and future challenges[J]. Energy and Buildings, 2015, 107: 264-278.
[5] Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780.
[6] Vermaak J, Botha E C. Recurrent neural networks for short-term load forecasting[J]. IEEE Trans on Power Systems, 1998, 13(1): 126-132.
[7] Zhang Tingfei, Luo Heng, Liu Hang. Building energy consumption prediction method based on LSTM network[J]. Journal of Suzhou University of Science and Technology (Natural Science Edition), 2020, 37(04): 78-84.
[8] Marino D L, Amarasinghe K, Manic M. Building energy load forecasting using deep neural networks[C]//Proc of the 42nd Annual Conference of the IEEE Industrial Electronics Society. Piscataway, NJ: IEEE Press, 2016: 7046-7051.
[9] Mocanu E, Nguyen P H, Gibescu M, et al. Deep learning for estimating building energy consumption[J]. Sustainable Energy, Grids and Networks, 2016, 6: 91-99.
[10] Makridakis S, Spiliotis E, Assimakopoulos V. Statistical and machine learning forecasting methods: Concerns and ways forward[J]. PLoS ONE, 2018, 13(3): e0194889.
[11] Cai M, Pipattanasomporn M, Rahman S. Day-ahead building-level load forecasts using deep learning vs. traditional time-series techniques[J]. Applied Energy, 2019, 236: 1078-1088.
[12] Zhang J, Wei Y M, Li D, et al. Short term electricity load forecasting using a hybrid model[J]. Energy, 2018, 158: 774-781.
[13] Jain R K, Smith K M, Culligan P J, et al. Forecasting energy consumption of multi-family residential buildings using support vector regression: Investigating the impact of temporal and spatial monitoring granularity on performance accuracy[J]. Applied Energy, 2014, 123: 168-178.
[14] Amber K P, Aslam M W, Hussain S K. Electricity consumption forecasting models for administration buildings of the UK higher education sector[J]. Energy and Buildings, 2015, 90: 127-136.
[15] Grolinger K, L'Heureux A, Capretz M A M, et al. Energy forecasting for event venues: Big data and prediction accuracy[J]. Energy and Buildings, 2016, 112: 222-233.
[16] Wu K, Wu J, Feng L, et al. An attention-based CNN-LSTM-BiLSTM model for short-term electric load forecasting in integrated energy system[J]. International Transactions on Electrical Energy Systems, 2021, 31(1): e12637.
[17] Zhao Bing, Wang Zengping, Ji Weijia, et al. A CNN-GRU short-term power load prediction method based on the attention mechanism[J]. Power System Technology, 2019, 43(12): 4370-4376.
[18] Graves A, Jaitly N, Mohamed A. Hybrid speech recognition with deep bidirectional LSTM[C]//Proc of IEEE Workshop on Automatic Speech Recognition and Understanding. Piscataway, NJ: IEEE Press, 2013: 273-278.
[19] Graves A, Schmidhuber J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures[J]. Neural Networks, 2005, 18(5-6): 602-610.
[20] Wang Y, Huang M, Zhu X, et al. Attention-based LSTM for aspect-level sentiment classification[C]//Proc of EMNLP. Stroudsburg: ACL Press, 2016: 606-615.
[21] Mirjalili S, Lewis A. The whale optimization algorithm[J]. Advances in Engineering Software, 2016, 95: 51-67.
[22] Schuster M, Paliwal K K. Bidirectional recurrent neural networks[J]. IEEE Trans on Signal Processing, 1997, 45(11): 2673-2681.
Disclosure of Invention
The invention aims to provide a short-term power load prediction method based on a Bi-LSTM-Attention model, addressing the high volatility and uncertainty of power loads and the limitations of traditional load prediction methods when processing nonlinear time-series data.
In order to realize the task, the invention adopts the following technical solution:
a short-term power load prediction method based on a Bi-LSTM-Attention model is characterized in that time sequence historical load data and weather information data are used as input, bidirectional circulation training is carried out by using the Bi-LSTM neural network model, the positive and reverse laws of the load data are learned, an Attention mechanism is introduced on the basis of the model, and importance degrees of different features to the prediction model are highlighted by distributing weights for the features; meanwhile, aiming at the Bi-LSTM-Attention model, optimized selection of model hyper-parameters is achieved through an improved whale optimization algorithm, the performance of the prediction model is further improved, and in addition, the local optimization capability of the algorithm is improved through a self-adaptive weight method.
According to the invention, the Bi-LSTM neural network model comprises an input layer, an embedding layer, a forward LSTM hidden layer, a reverse LSTM hidden layer, an attention mechanism layer, a fully connected layer and an output layer. After the model receives the input information, the time-series data are passed into the forward and reverse LSTM hidden layers, which are combined to output the processed vectors. The attention mechanism layer takes the data processed by the bidirectional LSTM as input, calculates its attention weights, normalizes them, and finally combines the weight vector with the corresponding feature at the current time to obtain the feature-attention output.
Compared with other models, the short-term power load prediction method based on the Bi-LSTM-Attention model has higher prediction precision, and brings technical innovation that:
1) Before verifying the effect of the model, periodicity analysis and bidirectional-information-flow verification are carried out on the load data, leading to the conclusions that using LSTM is reasonable and that the data at the current time are influenced by both past and future data. The data are then standardized and evaluation indices for the models are established.
2) After constructing the Bi-LSTM model and introducing Attention, the experimental results verify that the bidirectional network and the Attention mechanism both have a positive influence on the accuracy of power load prediction.
3) In the WOAWC-Bi-LSTM-Attention model, to address the difficulty of selecting network hyper-parameters, the improved whale optimization algorithm finds a set of hyper-parameters that minimizes the mean square error of the Bi-LSTM-Attention model. The experimental results show that every evaluation index of the optimized WOAWC-Bi-LSTM-Attention model is reduced compared with the previous models, and its coefficient of determination is closest to 1.
Drawings
FIG. 1 is a diagram of an LSTM network architecture;
FIG. 2 is a diagram of a Bi-LSTM neural network architecture;
FIG. 3 is a schematic view of the attention mechanism;
FIG. 4 is a diagram of the structure of the Bi-LSTM-Attention model;
FIG. 5 is a flow chart of the WOAWC-optimized Bi-LSTM-Attention;
FIG. 6 is a graph of the training process for each model, wherein (a) is the LSTM loss function; (b) is the Bi-LSTM loss function; (c) is the Bi-LSTM-AT loss function; (d) is the WOAWC-Bi-LSTM-AT loss function;
FIG. 7 is a one week load trend graph;
FIG. 8 is the autocorrelation coefficients of the forward and reverse sequences;
FIG. 9 is a fitness graph;
FIG. 10 is a different hyper-parameter optimization process;
fig. 11 is a comparison of prediction results.
FIG. 12 is a comparison of the optimization results of the original WOA algorithm and the improved WOAWC.
The present invention will be described in further detail with reference to the following drawings and examples.
Detailed Description
The embodiment provides a short-term power load prediction method based on a Bi-LSTM-Attention model, which takes historical load data as input and considers the influence of outdoor temperature, relative humidity and time information. The Bi-LSTM neural network learns the variation pattern of the time-series data, and the attention mechanism highlights the influence of key features by assigning attention weights, mining the load data in depth. Meanwhile, the improved whale optimization algorithm optimizes the hyper-parameters of the Bi-LSTM-Attention model, further improving prediction performance. The experimental results show that, compared with the LSTM, Bi-LSTM and Bi-LSTM-Attention models, this model has higher prediction accuracy, and the error indices MAPE, RMSE and MAE are all significantly reduced.
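As a concrete illustration of the error indices named above, the following sketch (the standard definitions of MAPE, RMSE, MAE and the determination coefficient R², not code from the patent) computes them with NumPy:

```python
import numpy as np

def evaluation_metrics(y_true, y_pred):
    """Error indices used to compare load-prediction models:
    MAPE (%), RMSE, MAE and the coefficient of determination R^2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mape = np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    mae = np.mean(np.abs(y_true - y_pred))
    ss_res = np.sum((y_true - y_pred) ** 2)      # residual sum of squares
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    r2 = 1.0 - ss_res / ss_tot                    # closest to 1 is best
    return {"MAPE": mape, "RMSE": rmse, "MAE": mae, "R2": r2}
```

Lower MAPE/RMSE/MAE and an R² closer to 1 correspond to the "reduced evaluation indexes" criterion used when comparing models in the experiments.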
The specific implementation is as follows.
1. Bi-LSTM-Attention prediction model
1.1 LSTM neural network
LSTM is a highly efficient RNN structure proposed by Hochreiter and Schmidhuber in 1997 [5]. As in FIG. 1, the top row of lines is the cell state, which serves as an internal memory. The lines across the bottom are the hidden-layer states, and the gating units f, i, o and g are designed to solve the gradient vanishing problem. During network training, each gate learns its weights and biases separately. The forget gate helps the LSTM decide which information to discard from the cell state, an amount that can be adjusted through the previous hidden-layer state. The input gate determines how much new information to store in the cell state, and the output gate adjusts the amount of hidden-layer state passed to the next step of the sequence. The corresponding LSTM network parameters are calculated as follows:
f_t = \sigma(W_{fx} x_t + W_{fh} h_{t-1} + b_f)    (1)
i_t = \sigma(W_{ix} x_t + W_{ih} h_{t-1} + b_i)    (2)
g_t = \sigma(W_{gx} x_t + W_{gh} h_{t-1} + b_g)    (3)
o_t = \sigma(W_{ox} x_t + W_{oh} h_{t-1} + b_o)    (4)
c_t = g_t \odot i_t + c_{t-1} \odot f_t    (5)
h_t = \phi(c_t) \odot o_t    (6)
where f_t, i_t, o_t and c_t are the states of the forget gate, the input gate, the output gate and the cell state at the current time t; x_t is the input at time t; h_{t-1} is the hidden-layer state at the previous time; g_t is the internal hidden-layer state, computed from x_t and h_{t-1}; W_{fx}, W_{fh}, W_{ix}, W_{ih}, W_{gx}, W_{gh}, W_{ox}, W_{oh} and b_f, b_i, b_g, b_o are the corresponding weight matrices and bias terms; \sigma(\cdot) and \phi(\cdot) denote the Sigmoid and tanh activation functions, respectively; \odot denotes the Hadamard product.
1.2 Bi-LSTM neural network
An LSTM propagates in only one direction, so when processing data it can fit time-dependent data from only one direction. Graves [19] proposed a bidirectional LSTM on this basis. Unlike the unidirectional LSTM, the Bi-LSTM neural network adds a layer of reverse LSTM, which processes the time-series data in reverse order; the hidden layer fuses the forward and reverse information so that the network can effectively learn more time-series information. The Bi-LSTM neural network structure is shown in FIG. 2.
The backward LSTM is computed in the same manner as the forward LSTM, except that the information of the subsequent time steps is obtained in the reverse direction. The Bi-LSTM network is computed as follows:
h_f = f(W_{f1} x_t + W_{f2} h_{t-1})    (7)
h_b = f(W_{b1} x_t + W_{b2} h_{t+1})    (8)
where h_f is the output of the forward LSTM network and h_b is the output of the reverse LSTM network. The final output of the hidden layer is:
y_t = g(W_{o1} \odot h_f + W_{o2} \odot h_b)    (9)
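Equations (7)-(9) can be sketched as a minimal forward/backward pass over a sequence, here with f and g both taken to be tanh (an assumption; the text leaves the activations generic) and the merge weights W_{o1}, W_{o2} as elementwise vectors:

```python
import numpy as np

def bilstm_layer(xs, Wf1, Wf2, Wb1, Wb2, Wo1, Wo2):
    """Sketch of a Bi-LSTM-style bidirectional layer: a forward pass (7),
    a pass over the reversed sequence (8), and an elementwise merge (9)."""
    T = len(xs)
    h = np.zeros(Wf2.shape[0])
    hf = []
    for t in range(T):                      # forward direction, eq (7)
        h = np.tanh(Wf1 @ xs[t] + Wf2 @ h)
        hf.append(h)
    h = np.zeros(Wb2.shape[0])
    hb = [None] * T
    for t in reversed(range(T)):            # reverse direction, eq (8)
        h = np.tanh(Wb1 @ xs[t] + Wb2 @ h)
        hb[t] = h
    # eq (9): merge both directions per time step; * is the Hadamard product
    return [np.tanh(Wo1 * f + Wo2 * b) for f, b in zip(hf, hb)]
```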
1.3 Attention mechanism
The Attention mechanism is a probabilistic weighting mechanism that mimics the attention of the human brain [20]. When the human brain observes something, it focuses on a particular region and ignores the rest; the attention mechanism likewise highlights the more important features by assigning different probability weights to the inputs, thereby improving the accuracy of the model. A Bi-LSTM neural network combined with the Attention mechanism can therefore predict the load while avoiding the interference of complex features in the data; its structure is shown in FIG. 3.
In the figure, the values of the input sequence are x_1 to x_n, the hidden-layer states are h_1 to h_n, and \alpha denotes the attention weight of the hidden layer for the current input, calculated as follows:
e_t = u \tanh(w h_t + b)    (11)
where e_t is the attention probability distribution determined by the LSTM-layer output h_t at time t; u and w are weight coefficients; b is a bias term; c_t is the output of the Attention layer at time t.
1.4 Bi-LSTM-Attention model
The Bi-LSTM-Attention model comprises an input layer, an embedding layer, a forward LSTM hidden layer, a reverse LSTM hidden layer, an Attention mechanism layer, a full connection layer and an output layer, and the structure of the Bi-LSTM-Attention model is shown in FIG. 4.
After the Bi-LSTM-Attention model receives input information, time sequence data are transmitted into hidden layers of forward LSTM and backward LSTM, and the hidden layers are combined to output processed vectors. The attention mechanism layer takes the bi-directional LSTM processed data as input, calculates its attention weight, and then uses a normalization process. And finally, combining the weight vector with the corresponding feature at the current moment to obtain the output of feature attention.
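The layer stack described above (input, forward/reverse LSTM hidden layers, attention mechanism layer, fully connected layer, output) can be sketched with the Keras functional API. Layer sizes here are illustrative placeholders, not the tuned hyper-parameters found later by the optimization algorithm:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_bilstm_attention(timesteps, n_features, units=8):
    """Illustrative Bi-LSTM + attention + dense stack for one-step output."""
    inp = layers.Input(shape=(timesteps, n_features))
    # forward and reverse LSTM hidden layers, merged per time step
    seq = layers.Bidirectional(layers.LSTM(units, return_sequences=True))(inp)
    # attention: one score per time step, softmax-normalized over time
    scores = layers.Dense(1, activation="tanh")(seq)
    weights = layers.Softmax(axis=1)(scores)
    # weighted combination of the hidden states (batch dot over the time axis)
    context = layers.Dot(axes=1)([weights, seq])
    context = layers.Flatten()(context)
    # fully connected layer, then the single load-value output
    out = layers.Dense(1)(layers.Dense(16, activation="relu")(context))
    return Model(inp, out)
```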
1.5 WOAWC optimized Bi-LSTM-Attention model
The Whale Optimization Algorithm (WOA) is a novel swarm intelligence optimization algorithm proposed by the Australian scholars Mirjalili et al. in 2016 [21]; it is a meta-heuristic that simulates the predation behavior of whale populations in nature. It has a simple principle, few parameters to set and strong global search capability, and it has been shown to outperform the PSO algorithm in solution precision and convergence speed when optimizing continuous functions; however, WOA still falls easily into local optima and has low convergence precision. In this embodiment, the whale positions are mutated through an improved Whale Optimization Algorithm (WOAWC) to improve the global search capability, and an adaptive weight method improves the local optimization capability of the algorithm.
Further, the WOAWC principle is as follows:
(1) When the whale optimization algorithm performs a global search of the population, one whale must be randomly selected as a reference so that the other whales move toward it. In the original WOA algorithm this reference whale is chosen at random, which hampers the algorithm's search for the global optimum. In this embodiment the whales are mutated with the inverse cumulative distribution function of the Cauchy distribution, so that individual whales mutate over a wider range.
The inverse cumulative distribution function of the Cauchy distribution is as follows:
When a whale undergoes Cauchy inverse-cumulative-distribution mutation, local optimization can proceed in a spiral wandering mode, which avoids blind mutation of the whale; the formula is as follows:
In these formulas, A and C are coefficient vectors, X(t) is the position vector of the current whale, and X_rand is the position vector of a whale individual randomly selected from the current population.
Using the Cauchy mutation, formulas (14) and (15) are rewritten by perturbing the positions before the update; one common form of the mutation is

x′_ij = x_ij + x_ij · F⁻¹(r)

where F⁻¹ is the inverse cumulative distribution function of the Cauchy distribution, x_ij is the j-th position component of the i-th whale before mutation, and r ∈ [0, 1] is uniformly distributed.
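The inverse CDF and the resulting heavy-tailed mutation can be sketched as follows; the perturbation form `x + x·F⁻¹(r)` is one common variant and an assumption, not necessarily the patent's exact update:

```python
import numpy as np

def cauchy_inverse_cdf(r):
    """Inverse CDF of the standard Cauchy distribution: F^{-1}(r) = tan(pi*(r - 1/2))."""
    return np.tan(np.pi * (r - 0.5))

def cauchy_mutate(x, rng):
    """One common form of Cauchy mutation (an assumed variant): perturb each
    coordinate by a heavy-tailed step proportional to its current value."""
    r = rng.uniform(0.0, 1.0, size=np.shape(x))  # r ~ U[0, 1] per coordinate
    return x + x * cauchy_inverse_cdf(r)

rng = np.random.default_rng(42)
x = np.array([1.0, -2.0, 0.5])
print(cauchy_mutate(x, rng))
```

The heavy tails of the Cauchy distribution occasionally produce very large steps, which is exactly what lets individual whales escape to "a wider range" as the text describes.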
(2) During local optimization, the individual in WOA's encircling stage that is closest to the prey is treated as the current local optimal solution, and the remaining individuals approach it according to (reconstructed in the standard WOA form):

D = |C · X*(t) − X(t)|  (16)

X(t + 1) = X*(t) − A · D  (17)

where X*(t) is the position vector of the current best whale.
the invention provides a method for changing the position of an optimal whale individual at the moment and improving the local optimizing capability of the whale by adopting a self-adaptive weight method, wherein the self-adaptive weight formula is as follows:
where t is the current iteration number and t_max is the maximum number of iterations. Introducing the adaptive weight into equation (17) gives:

X(t + 1) = w(t) · X*(t) − A · D
To improve model performance, the network structure and optimization parameters must be tuned so that the model converges quickly to the global minimum. In this embodiment, six parameters of the Bi-LSTM-Attention network model are optimized by the WOAWC: the learning rate (L), number of training epochs (N), batch size (B), number of first hidden layer nodes (H1), number of second hidden layer nodes (H2), and number of fully connected layer nodes (F). The search ranges are restricted to L ∈ [0.001, 0.01], N ∈ [10, 100], B ∈ [16, 128], H1 ∈ [1, 128], H2 ∈ [1, 128], and F ∈ [1, 100] so that an overly large search space does not degrade optimization efficiency. The optimized WOAWC-Bi-LSTM-Attention model is then verified; its flow chart is shown in Fig. 5 and comprises the following steps:
First, the data are acquired and preprocessed, and the data set is divided into a training set and a test set; the training set enters the WOAWC-Bi-LSTM-Attention model for training, and the test set enters it for testing.
The WOAWC encodes an initial value, computes fitness values and initializes the population, then performs the WOAWC population update and updates the global optimal solution after recomputing the fitness; if the stopping condition is met, the optimal network parameters are output, otherwise the algorithm returns to the population update step.
After the fitness calculation and population initialization step, the input parameters enter the WOAWC-Bi-LSTM-Attention model: the WOAWC decodes them into the six corresponding hyper-parameter values, and the model returns a fitness value.
After the fitness calculation and global optimal solution update step, the input parameters likewise enter the Bi-LSTM-Attention model, and the WOAWC-Bi-LSTM-Attention model returns the fitness value.
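The optimization flow above can be sketched as a WOA-style loop over the six bounded hyper-parameters. The fitness function below is a toy stand-in for "validation MSE of the trained Bi-LSTM-Attention model", and the simplified update rules (adaptive weight plus Cauchy-mutated reference whale) are an assumption, not the patent's exact WOAWC:

```python
import numpy as np

# (low, high) per parameter, following the search ranges in the text:
# learning rate L, epochs N, batch size B, hidden nodes H1, H2, dense nodes F.
BOUNDS = np.array([
    [0.001, 0.01], [10, 100], [16, 128], [1, 128], [1, 128], [1, 100]])

def fitness(x):
    # Toy objective standing in for validation MSE; the "good" setting
    # is arbitrary. Real use would train the network with parameters x
    # (integer parameters rounded first) and return its validation error.
    target = np.array([0.005, 50, 64, 64, 64, 50])
    return np.sum(((x - target) / (BOUNDS[:, 1] - BOUNDS[:, 0])) ** 2)

def clip(x):
    return np.clip(x, BOUNDS[:, 0], BOUNDS[:, 1])

def woawc(pop_size=20, t_max=50, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.uniform(BOUNDS[:, 0], BOUNDS[:, 1], size=(pop_size, 6))
    best = min(pop, key=fitness).copy()
    for t in range(t_max):
        w = 1.0 - t / t_max            # adaptive weight, shrinking each iteration
        a = 2.0 * (1 - t / t_max)      # WOA's linearly decreasing coefficient
        for i in range(pop_size):
            A = a * (2 * rng.random(6) - 1)
            C = 2 * rng.random(6)
            if np.linalg.norm(A) < 1:  # exploit: approach the weighted best whale
                D = np.abs(C * best - pop[i])
                pop[i] = clip(w * best - A * D)
            else:                      # explore: Cauchy-mutated random reference
                ref = pop[rng.integers(pop_size)]
                ref = ref + ref * np.tan(np.pi * (rng.random(6) - 0.5))
                D = np.abs(C * ref - pop[i])
                pop[i] = clip(ref - A * D)
            if fitness(pop[i]) < fitness(best):
                best = pop[i].copy()
    return best

best = woawc()
print(best, fitness(best))
```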
1.6 loss function
In this embodiment, the training process of the prediction model is optimized with the Adam algorithm, and the loss function is the mean square error:

MSE = (1/n) Σᵢ (yᵢ − ŷᵢ)²

where n is the number of samples, and yᵢ and ŷᵢ are the true and predicted values of the i-th sample point. Consistent with this study, load prediction is performed at 96 moments per day, so n = 96. The loss function curve of each model's training process is shown in Fig. 6.
2. Example analysis
2.1 data Source and Pre-processing
In modern power systems, meteorological factors have an increasingly significant influence on the power system load. Accounting for meteorological factors has therefore become one of the main means by which dispatch centers further improve load prediction accuracy.
In this embodiment, the proposed prediction method is verified on a public data set of short-term load values for a certain region over the whole of 2014, comprising time information, weather information, and load values. Each day of the data is divided into 96 time points (one sample every 15 min), and modeling uses a rolling sequence: all values from day 1 to day n are input and the 96 load values of day n+1 are output; then all values from day 2 to day n+1 are input and the 96 load values of day n+2 are output; and so on, constructing a multi-input multi-output load prediction.
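The rolling-sequence construction described above can be sketched as follows; the flat 15-min layout of `load` is an assumption about the data file:

```python
import numpy as np

def make_rolling_windows(load, n_days, points_per_day=96):
    """Build the multi-input multi-output rolling dataset described above:
    X[k] = all values of days k+1 .. k+n, y[k] = the 96 loads of day k+n+1.
    `load` is a flat array of 15-min load values (an assumed layout)."""
    daily = load.reshape(-1, points_per_day)   # (num_days, 96)
    X, y = [], []
    for k in range(len(daily) - n_days):
        X.append(daily[k:k + n_days].ravel())  # n days of history as one input
        y.append(daily[k + n_days])            # next day's 96 points as output
    return np.array(X), np.array(y)

load = np.arange(10 * 96, dtype=float)         # 10 synthetic days of data
X, y = make_rolling_windows(load, n_days=3)
print(X.shape, y.shape)  # (7, 288) (7, 96)
```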
Different evaluation indexes generally have different dimensions and units, which can distort data analysis results. To eliminate this difference and bring the indexes onto the same scale, the data are normalized with the Min-Max method before training and validation.
The mapping to [−1, 1] is:

x* = 2 (x − x_min) / (x_max − x_min) − 1

where x is the raw data, x* is the normalized data, and x_min and x_max are the minimum and maximum of the data; the normalized data are mapped to the interval [−1, 1].
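A minimal sketch of this Min-Max mapping, together with its inverse for converting predictions back to real load units:

```python
import numpy as np

def minmax_scale(x):
    """Min-Max scaling to [-1, 1], matching the mapping described above."""
    x_min, x_max = x.min(), x.max()
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0

def minmax_inverse(x_scaled, x_min, x_max):
    """Undo the scaling so model predictions return to real load units."""
    return (x_scaled + 1.0) / 2.0 * (x_max - x_min) + x_min

x = np.array([10.0, 20.0, 30.0])
print(minmax_scale(x))  # [-1.  0.  1.]
```

In practice x_min and x_max are taken from the training set only and reused for the test set, so that no test information leaks into preprocessing.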
2.2 data validation
For experimental rigor, the prediction method is verified against real power load data. Fig. 7 shows the load trend during one week, with each day divided into 96 time points. The load data fluctuate at a certain frequency and the whole series is periodic, so choosing the LSTM method is reasonable.
Compared with the traditional LSTM neural network, the Bi-LSTM neural network considers the internal regularities of forward and backward data simultaneously, developing predictions from both the historical and future directions [22]. Load prediction therefore accounts for the influence of both historical and future loads on prediction accuracy.
To verify that the load data carry a bidirectional information flow, one month of load data is selected from the data set and split at its center into a forward and a reverse load sequence, and the autocorrelation coefficients of the two sequences are computed separately. As Fig. 8 shows, the load time series exhibits clear forward and reverse regularities.
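This forward/reverse autocorrelation check can be sketched with synthetic daily-periodic data standing in for the real month of load values:

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation coefficient of a series at a given lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return (x[:-lag] * x[lag:]).sum() / (x * x).sum()

# A synthetic daily-periodic "load" (period = 96 points) replaces real data.
t = np.arange(30 * 96)
load = 100 + 10 * np.sin(2 * np.pi * t / 96)

forward = load
backward = load[::-1]
print(autocorr(forward, 96), autocorr(backward, 96))  # both close to 1
```

The one-day lag (96 points) shows strong autocorrelation in both directions, which is the "obvious forward and reverse laws" the text reports for the real series.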
2.3 evaluation index
To evaluate the performance of the prediction model, the error indicators used in this embodiment are the mean absolute percentage error (MAPE), mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination R². Their expressions are:

MAPE = (100%/n) Σᵢ |(yᵢ − ŷᵢ) / yᵢ|

MAE = (1/n) Σᵢ |yᵢ − ŷᵢ|

RMSE = √[(1/n) Σᵢ (yᵢ − ŷᵢ)²]

R² = 1 − Σᵢ (yᵢ − ŷᵢ)² / Σᵢ (yᵢ − ȳ)²

where n is the total number of samples, and yᵢ and ŷᵢ are the true and predicted values of the i-th sample point.
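The four indicators can be implemented directly from their formulas:

```python
import numpy as np

def mape(y, y_hat):
    # Mean absolute percentage error, in percent.
    return 100.0 * np.mean(np.abs((y - y_hat) / y))

def mae(y, y_hat):
    # Mean absolute error.
    return np.mean(np.abs(y - y_hat))

def rmse(y, y_hat):
    # Root mean square error.
    return np.sqrt(np.mean((y - y_hat) ** 2))

def r2(y, y_hat):
    # Coefficient of determination: 1 minus residual over total sum of squares.
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

y = np.array([100.0, 200.0, 300.0])
y_hat = np.array([110.0, 190.0, 305.0])
print(mape(y, y_hat), mae(y, y_hat), rmse(y, y_hat), r2(y, y_hat))
```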
2.4 prediction results and comparative analysis
The experiments were run on a Windows 10 operating system with an NVIDIA GeForce RTX 2070S 8 GB GPU. The programming language was Python 3.7, with TensorFlow 2.6.0 and Keras 2.2.4 as the compilation environments.
The training set comprises the input data of the first 364 days of 2014, and the test set the load data of the last two days. As shown in Fig. 9, the fitness curve of the WOAWC-Bi-LSTM-Attention model, driven by the iterative optimization of the improved Whale Optimization Algorithm (WOAWC), finally stabilizes at 0.0027. The iteration of the six hyper-parameters is shown in Fig. 10, and their final stable values in Table 1. In this embodiment's comparative experiment, the BPNN, LSTM, BiLSTM, and BiLSTM-Attention models were selected for load prediction; the prediction results are shown in Fig. 11. Fig. 12 compares the results of the network model optimized by the original WOA with those of the model optimized by the WOAWC.
Fig. 11 shows that the load values predicted after the WOAWC optimizes the hyper-parameters of the BiLSTM-Attention network fit best and are closest to the true values. The prediction-performance evaluation indexes of each model are listed in Table 2.
Table 1: result of parameter optimization
Parameter(s) | Best results |
Learning rate | 0.00552 |
Number of training sessions | 98 |
Batch size | 40 |
Number of nodes of first |
100 |
Number of nodes of second hidden layer | 74 |
Number of full connection layers | 61 |
Table 2: comparison of prediction errors of different models
As Table 2 shows, the LSTM-based prediction models outperform the BP model on time-series prediction. The bidirectional LSTM model outperforms the unidirectional LSTM, indicating that BiLSTM better captures the features in the sequence. With Attention added, the MAPE, RMSE, and MAE of the Bi-LSTM model decrease by 2.73%, 5.7%, and 12.42% respectively, showing that Attention's mining of the different features' contributions improves prediction. After the six hyper-parameters of the network model are optimized by the improved whale optimization algorithm, prediction performance improves further, with R² reaching above 0.99.
3. Conclusion
To meet the growing demand for short-term power load prediction accuracy, this embodiment proposes a short-term power load prediction model based on WOAWC-optimized Bi-LSTM-Attention. Experimental verification leads to the following conclusions:
1) Before verifying the model, periodicity analysis and bidirectional information flow verification were carried out on the load data, concluding that the use of LSTM is reasonable and that data at the current moment are influenced by both past and future data. The data were then normalized and evaluation indexes for assessing the models were established.
2) After the Bi-LSTM model is constructed and the Attention is introduced, the experimental result verifies that the bidirectional network and the Attention mechanism have positive influence on the accuracy of power load prediction.
3) In the WOAWC-Bi-LSTM-Attention model, aiming at the problem of difficulty in selecting the super-parameters of the network, a group of super-parameters is found by using an improved whale optimization algorithm, so that the mean square error of the Bi-LSTM-Attention model is minimum. The experimental result shows that the evaluation indexes of the optimized model are reduced compared with those of the prior model, and the decision coefficient is closest to 1.
In future research, the influence of more complex input features, such as date type and load characteristics, on the power load can be considered, and different intelligent algorithms, together with improvements to them, can be studied and compared on model performance, further improving the accuracy and generality of short-term power load prediction.
Claims (2)
1. A short-term power load prediction method based on a Bi-LSTM-Attention model, characterized in that time-series historical load data and weather information data are used as input; bidirectional cyclic training is carried out with the Bi-LSTM neural network model to learn the forward and reverse laws of the load data; an Attention mechanism is introduced on the basis of the model, highlighting the importance of different features to the prediction model by assigning weights to the features; meanwhile, for the Bi-LSTM-Attention model, optimized selection of the model hyper-parameters is realized through the improved whale optimization algorithm, further improving the performance of the prediction model, and in addition the local optimization capability of the algorithm is improved through the adaptive weight method.
2. The method of claim 1, in which the Bi-LSTM neural network model comprises an input layer, an embedding layer, a forward LSTM hidden layer, a backward LSTM hidden layer, an attention mechanism layer, a fully-connected layer, and an output layer; after the Bi-LSTM neural network model receives input information, time sequence data are transmitted into hidden layers of forward LSTM and reverse LSTM, the processed vectors are output by combining the hidden layers, an attention mechanism layer takes the data processed by the bidirectional LSTM as input, attention weight of the data is calculated, then normalization processing is used, and finally the weight vectors and corresponding features at the current moment are combined to obtain output of feature attention.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210675542.3A CN115238952A (en) | 2022-06-15 | 2022-06-15 | Bi-LSTM-Attention short-term power load prediction method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115238952A true CN115238952A (en) | 2022-10-25 |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117350158A (en) * | 2023-10-13 | 2024-01-05 | 湖北华中电力科技开发有限责任公司 | Electric power short-term load prediction method by mixing RetNet and AM-BiLSTM algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||