CN115238952A - Bi-LSTM-Attention short-term power load prediction method - Google Patents


Publication number
CN115238952A
Authority
CN
China
Prior art keywords
lstm
model
attention
data
prediction
Prior art date
Legal status
Pending
Application number
CN202210675542.3A
Other languages
Chinese (zh)
Inventor
冯增喜
葛珣
周瑶佳
李嘉乐
Current Assignee
Xian University of Architecture and Technology
Original Assignee
Xian University of Architecture and Technology
Priority date
Filing date
Publication date
Application filed by Xian University of Architecture and Technology
Priority: CN202210675542.3A
Publication: CN115238952A
Legal status: Pending

Links

Images

Classifications

    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06N3/006: Artificial life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G06N3/084: Backpropagation, e.g. using gradient descent
    • G06Q50/06: Energy or water supply
    • H02J3/003: Load forecast, e.g. methods or systems for forecasting future load demand
    • H02J2203/20: Simulating, e.g. planning, reliability check, modelling or computer assisted design [CAD]
    • Y04S10/50: Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications


Abstract

The invention discloses a short-term power load prediction method based on a Bi-LSTM-Attention model. Time-series historical load data and weather information are taken as input, and a Bi-LSTM neural network model is trained bidirectionally so that it learns both the forward and backward regularities of the load data; an Attention mechanism is then introduced on top of this model, assigning weights to the input features to highlight how important each feature is to the prediction model. Meanwhile, for the Bi-LSTM-Attention model, an improved whale optimization algorithm performs the optimized selection of the model hyper-parameters, further improving the performance of the prediction model; in addition, an adaptive weight method improves the local search capability of the algorithm. Compared with other models, the method achieves higher prediction accuracy.

Description

Bi-LSTM-Attention short-term power load prediction method
Technical Field
The invention belongs to the technical field of energy conservation, relates to the application of computers in energy-saving technology, and particularly relates to a short-term power load prediction method based on a Bi-LSTM-Attention model.
Background
With the development of various energy-saving technologies, accurate load prediction plays an increasingly important role in energy-conservation management, and load prediction techniques have attracted much attention in recent years. Typically, load forecasts include long-term load forecasts (LTLF) for loads over one year, medium-term load forecasts (MTLF) for loads from several weeks to one year, short-term load forecasts (STLF) for loads from one day to one week, and very short-term load forecasts (VSTLF) for loads from minutes to hours [1]. LTLF and MTLF can estimate the change trend of the load and are suitable for long-term planning of a system in the design stage. STLF and VSTLF can generate accurate control and scheduling load requirements and are better suited for short-term control of existing systems.
Load prediction is a type of time-series prediction, a topic studied early in statistics and computer science. These methods have evolved from traditional statistical methods to today's artificial-intelligence-based models or hybrid models.
The most common models used in time-series prediction are autoregressive models, moving-average models, autoregressive integrated moving-average models, and seasonal autoregressive integrated moving-average models. These models and methods focus on univariate data with linear relationships and time dependencies, which makes them less effective for time series with nonlinear characteristics. The load belongs to a time-series type with nonlinear characteristics, and load prediction is influenced by various random factors, including weather conditions, time information, and the behavior of residents [2~4].
In recent years, with the rapid development of deep learning, prediction models based on recurrent neural networks (RNN) have attracted much attention for processing time-series data. Notably, the long short-term memory (LSTM) network [5] was proposed to advance the development of RNNs; by adding gating units it effectively alleviates the gradient-explosion and gradient-vanishing problems of RNNs [6]. LSTM can identify the structure and patterns of data in time-series prediction, such as nonlinearity and complexity, and can therefore predict complex, strongly nonlinear time series. Reference [7] used LSTM for energy-consumption prediction and achieved higher prediction accuracy than a BP neural network. Marino et al. [8] applied the LSTM method to the same load prediction problem and reported results similar to those of [9]. While LSTM has many advantages in processing complex nonlinear data, it also has limitations: it is more complex and harder to train, and in some cases performs worse than the simple ARIMA model [10]. To improve performance, more and more researchers have enhanced prediction models by combining LSTM with traditional methods or other machine-learning methods. For example, Cai et al. [11] used two deep-learning models (RNN and CNN) together with the ARIMA method for multi-step load prediction and compared them; the results showed that the deep-learning-based models improved prediction accuracy by 22.6% over the ARIMA model.
Different types of sequence data tend to have different characteristics, which can have a large impact on the choice of prediction model, the settings of the model parameters, and the accuracy of the results. In conventional studies, the data used for load prediction are generally weather information, time information, and historical loads [12~15]. The bidirectional LSTM (Bi-LSTM), established in recent years, combines a forward LSTM and a backward LSTM and can fit data from both the forward and the backward direction of the sequence to achieve higher prediction accuracy [16]. The attention mechanism is a method of retaining the important information in different input features during model training through weight assignment; it improves the feature-extraction capability on the data and can effectively improve the accuracy of daily power load prediction [17].
The following are the relevant references retrieved by the applicants and cited in the present invention.
[1] Singh P, Dwivedi P. Integration of new evolutionary approach with artificial neural network for solving short term load forecast problem [J]. Applied Energy, 2018, 217: 537-549.
[2] Khatoon S, Singh A K. Effects of various factors on electric load forecasting: An overview [C]// Proc of the 6th IEEE Power India International Conference (PIICON). Piscataway, NJ: IEEE Press, 2014: 1-5.
[3] Walter T, Price P N, Sohn M D. Uncertainty estimation improves energy measurement and verification procedures [J]. Applied Energy, 2014, 130: 230-236.
[4] Yan D, O'Brien W, Hong T, et al. Occupant behavior modeling for building performance simulation: Current state and future challenges [J]. Energy and Buildings, 2015, 107: 264-278.
[5] Hochreiter S, Schmidhuber J. Long short-term memory [J]. Neural Computation, 1997, 9(8): 1735-1780.
[6] Vermaak J, Botha E C. Recurrent neural networks for short-term load forecasting [J]. IEEE Trans on Power Systems, 1998, 13(1): 126-132.
[7] Zhang Tingfei, Luo Heng, Liu Hang. Building energy consumption prediction method based on LSTM network [J]. Journal of Suzhou University of Science and Technology (Natural Science Edition), 2020, 37(4): 78-84.
[8] Marino D L, Amarasinghe K, Manic M. Building energy load forecasting using deep neural networks [C]// Proc of the 42nd Annual Conference of the IEEE Industrial Electronics Society. Piscataway, NJ: IEEE Press, 2016: 7046-7051.
[9] Mocanu E, Nguyen P H, Gibescu M, et al. Deep learning for estimating building energy consumption [J]. Sustainable Energy, Grids and Networks, 2016, 6: 91-99.
[10] Makridakis S, Spiliotis E, Assimakopoulos V. Statistical and Machine Learning forecasting methods: Concerns and ways forward [J]. PLoS ONE, 2018, 13(3): e0194889.
[11] Cai M, Pipattanasomporn M, Rahman S. Day-ahead building-level load forecasts using deep learning vs. traditional time-series techniques [J]. Applied Energy, 2019, 236: 1078-1088.
[12] Zhang J, Wei Y M, Li D, et al. Short term electricity load forecasting using a hybrid model [J]. Energy, 2018, 158: 774-781.
[13] Jain R K, Smith K M, Culligan P J, et al. Forecasting energy consumption of multi-family residential buildings using support vector regression: Investigating the impact of temporal and spatial monitoring granularity on performance accuracy [J]. Applied Energy, 2014, 123: 168-178.
[14] Amber K P, Aslam M W, Hussain S K. Electricity consumption forecasting models for administration buildings of the UK higher education sector [J]. Energy and Buildings, 2015, 90: 127-136.
[15] Grolinger K, L'Heureux A, Capretz M A M, et al. Energy forecasting for event venues: Big data and prediction accuracy [J]. Energy and Buildings, 2016, 112: 222-233.
[16] Wu K, Wu J, Feng L, et al. An attention-based CNN-LSTM-BiLSTM model for short-term electric load forecasting in integrated energy system [J]. International Transactions on Electrical Energy Systems, 2021, 31(1): e12637.
[17] Zhao Bing, Wang Zengping, Ji Weijia, et al. A short-term power load forecasting method based on the attention mechanism of CNN-GRU [J]. Power System Technology, 2019, 43(12): 4370-4376.
[18] Graves A, Jaitly N, Mohamed A. Hybrid speech recognition with deep bidirectional LSTM [C]// Proc of IEEE Workshop on Automatic Speech Recognition and Understanding. Piscataway, NJ: IEEE Press, 2013: 273-278.
[19] Graves A, Schmidhuber J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures [J]. Neural Networks, 2005, 18(5-6): 602-610.
[20] Wang Y, Huang M, Zhu X, et al. Attention-based LSTM for aspect-level sentiment classification [C]// Proc of EMNLP. Stroudsburg: ACL Press, 2016: 606-615.
[21] Mirjalili S, Lewis A. The whale optimization algorithm [J]. Advances in Engineering Software, 2016, 95: 51-67.
[22] Schuster M, Paliwal K K. Bidirectional recurrent neural networks [J]. IEEE Trans on Signal Processing, 1997, 45(11): 2673-2681.
Disclosure of Invention
The invention aims to provide a short-term power load prediction method based on a Bi-LSTM-Attention model, addressing the problems that power loads have high volatility and uncertainty and that traditional load prediction methods have limitations when processing nonlinear time-series data.
In order to realize the task, the invention adopts the following technical solution:
a short-term power load prediction method based on a Bi-LSTM-Attention model is characterized in that time sequence historical load data and weather information data are used as input, bidirectional circulation training is carried out by using the Bi-LSTM neural network model, the positive and reverse laws of the load data are learned, an Attention mechanism is introduced on the basis of the model, and importance degrees of different features to the prediction model are highlighted by distributing weights for the features; meanwhile, aiming at the Bi-LSTM-Attention model, optimized selection of model hyper-parameters is achieved through an improved whale optimization algorithm, the performance of the prediction model is further improved, and in addition, the local optimization capability of the algorithm is improved through a self-adaptive weight method.
According to the invention, the Bi-LSTM neural network model comprises an input layer, an embedding layer, a forward LSTM hidden layer, a backward LSTM hidden layer, an attention mechanism layer, a fully connected layer, and an output layer. After the Bi-LSTM neural network model receives the input, the time-series data are passed into the forward and backward LSTM hidden layers, whose outputs are combined into the processed vectors. The attention mechanism layer takes the data processed by the bidirectional LSTM as input, calculates the attention weights, applies normalization, and finally combines the weight vector with the corresponding features at the current moment to obtain the feature-attention output.
Compared with other models, the short-term power load prediction method based on the Bi-LSTM-Attention model has higher prediction accuracy and brings the following technical innovations:
1) Before verifying the effect of the model, periodicity analysis and bidirectional information-flow verification are performed on the load data, leading to the conclusions that using LSTM is reasonable and that the data at the current moment are influenced by both past and future data. The data are then standardized, and evaluation indexes for assessing the models are established.
2) After constructing the Bi-LSTM model and introducing Attention, the experimental results verify that the bidirectional network and the Attention mechanism both have a positive influence on the accuracy of power load prediction.
3) In the WOAWC-Bi-LSTM-Attention model, to address the difficulty of selecting the network hyper-parameters, an improved whale optimization algorithm finds a set of hyper-parameters that minimizes the mean square error of the Bi-LSTM-Attention model. The experimental results show that the error indexes of the optimized WOAWC-Bi-LSTM-Attention model are all reduced compared with the previous models, and its coefficient of determination is closest to 1.
Drawings
FIG. 1 is a diagram of an LSTM network architecture;
FIG. 2 is a diagram of a Bi-LSTM neural network architecture;
FIG. 3 is a schematic view of the attention mechanism;
FIG. 4 is a diagram of the structure of the Bi-LSTM-Attention model;
FIG. 5 is a flow chart of the WOAWC-optimized Bi-LSTM-Attention;
FIG. 6 is a graph of the training process for each model, wherein graph (a) is the LSTM loss function; graph (b) is the Bi-LSTM loss function; graph (c) is the Bi-LSTM-AT loss function; graph (d) is the WOAWC-Bi-LSTM-AT loss function;
FIG. 7 is a one week load trend graph;
FIG. 8 is the autocorrelation coefficients of the forward and reverse sequences;
FIG. 9 is a fitness graph;
FIG. 10 is a different hyper-parameter optimization process;
FIG. 11 is a comparison of prediction results;
FIG. 12 is a comparison of the results of the models optimized by the original WOA algorithm and by WOAWC.
The present invention will be described in further detail with reference to the following drawings and examples.
Detailed Description
The embodiment provides a short-term power load prediction method based on a Bi-LSTM-Attention model, which mainly takes historical load data as input and considers the influence of outdoor temperature, relative humidity, and time information. The Bi-LSTM neural network learns the variation law of the time-series data, and the attention mechanism highlights the influence of key features by assigning attention weights, deeply mining the regularities of the load data. Meanwhile, for the Bi-LSTM-Attention model, the improved whale optimization algorithm optimizes the model's hyper-parameters, further improving prediction performance. The experimental results show that, compared with the LSTM, Bi-LSTM, and Bi-LSTM-Attention models, this model has higher prediction accuracy, and the error indexes MAPE, RMSE, and MAE are all significantly reduced.
The specific implementation is as follows.
1. Bi-LSTM-Attention prediction model
1.1 LSTM neural network
LSTM is a highly efficient RNN structure proposed by Hochreiter and Schmidhuber in 1997 [5]. As in FIG. 1, the top row of lines is the state cell, which serves as internal memory. The lines across the bottom are the hidden-layer states, and the gating units f, i, o, and g are designed to solve the gradient-vanishing problem. During network training, each gate learns its weights and biases separately. The forget gate helps the LSTM decide which information to discard from the state cell, in an amount regulated by the previous hidden-layer state. The input gate determines how much new information to store in the state cell, and the output gate regulates the amount of hidden-layer state passed to the next step in the sequence. The corresponding parameters of the LSTM network are calculated as follows:
f_t = σ(W_fx · x_t + W_fh · h_{t-1} + b_f)  (1)
i_t = σ(W_ix · x_t + W_ih · h_{t-1} + b_i)  (2)
g_t = σ(W_gx · x_t + W_gh · h_{t-1} + b_g)  (3)
o_t = σ(W_ox · x_t + W_oh · h_{t-1} + b_o)  (4)
c_t = g_t ⊙ i_t + c_{t-1} ⊙ f_t  (5)
h_t = φ(c_t) ⊙ o_t  (6)
where f_t, i_t, o_t, and c_t are the states of the forget gate, input gate, output gate, and state cell at the current time t; x_t is the input at time t; h_{t-1} is the hidden-layer state at the previous time; g_t is the internal hidden-layer state, computed from x_t and h_{t-1}; W_fx, W_fh, W_ix, W_ih, W_gx, W_gh, W_ox, W_oh and b_f, b_i, b_g, b_o are the corresponding weight matrices and bias terms; σ(·) and φ(·) denote the Sigmoid and tanh activation functions, respectively; ⊙ denotes the Hadamard product.
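As a concrete illustration of Eqs. (1)-(6), the following is a minimal scalar sketch of one LSTM step. This is not the patent's implementation; the weight and bias names are hypothetical, and the candidate state g follows the σ written in Eq. (3), though tanh is the more common choice in practice.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step for scalar input and state, following Eqs. (1)-(6).
    W holds the eight weights W_fx..W_oh; b holds the four biases."""
    f = sigmoid(W["fx"] * x_t + W["fh"] * h_prev + b["f"])    # forget gate, Eq. (1)
    i = sigmoid(W["ix"] * x_t + W["ih"] * h_prev + b["i"])    # input gate, Eq. (2)
    g = sigmoid(W["gx"] * x_t + W["gh"] * h_prev + b["g"])    # candidate state, Eq. (3)
    o = sigmoid(W["ox"] * x_t + W["oh"] * h_prev + b["o"])    # output gate, Eq. (4)
    c = g * i + c_prev * f                                    # state update, Eq. (5)
    h = math.tanh(c) * o                                      # hidden state, Eq. (6)
    return h, c
```

In a real network each scalar here is a matrix-vector product, and the weights are learned by backpropagation through time.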
1.2 Bi-LSTM neural network
A standard LSTM propagates in only one direction, so it can only fit time-dependent data from one side when processing a sequence. Graves [19] proposed a bidirectional LSTM on the basis of LSTM. Unlike the unidirectional LSTM, the Bi-LSTM neural network adds a layer of backward LSTM, which processes the time-series data in reverse; the hidden layer fuses the forward and backward information, so the network can effectively learn more information from the time series. The structure of the Bi-LSTM neural network is shown in FIG. 2.
The backward LSTM is computed in the same manner as the forward LSTM, except that it obtains the information of the subsequent time steps in the reverse direction. The Bi-LSTM network is calculated as follows:
h_f = f(W_f1 · x_t + W_f2 · h_{t-1})  (7)
h_b = f(W_b1 · x_t + W_b2 · h_{t+1})  (8)

where h_f is the output of the forward LSTM network and h_b is the output of the backward LSTM network; the final output of the hidden layer is:

y_t = g(W_o1 ⊙ h_f + W_o2 ⊙ h_b)  (9)
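The forward/backward passes of Eqs. (7)-(9) can be sketched as follows. The recurrent cell here is a simplified stand-in (a full LSTM cell would be used in practice), and all weights are hypothetical unit/0.5 values chosen only for illustration.

```python
import math

def simple_cell(x_t, h_prev, w=0.5):
    # stand-in recurrent cell; in the method each direction is a full LSTM
    return math.tanh(w * x_t + w * h_prev)

def bilstm_outputs(seq):
    """One forward and one backward pass (Eqs. (7)-(8)), merged per Eq. (9)."""
    h = 0.0
    fwd = []
    for x in seq:                               # forward LSTM: past -> future
        h = simple_cell(x, h)
        fwd.append(h)
    h = 0.0
    bwd = [0.0] * len(seq)
    for t in range(len(seq) - 1, -1, -1):       # backward LSTM: future -> past
        h = simple_cell(seq[t], h)
        bwd[t] = h
    # merge: y_t = g(W_o1 * h_f + W_o2 * h_b), with g = tanh and unit weights
    return [math.tanh(f + b) for f, b in zip(fwd, bwd)]
```

Note that the output at each time step mixes information from both directions, which is what lets the network exploit the "future" half of the sequence.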
1.3 Mechanism of Attention
The Attention mechanism is a probabilistic weighting mechanism that mimics the attention of the human brain [20]. When the human brain observes something, it focuses on a specific part and ignores the rest; likewise, the attention mechanism highlights the more important features by assigning different probability weights to the inputs, thereby improving the accuracy of the model. Therefore, a Bi-LSTM neural network combined with the Attention mechanism can predict the load while avoiding interference from complex features in the data; its structure is shown in FIG. 3.
In the figure, the values of the input sequence are x_1 to x_n, the hidden-layer states are h_1 to h_n, and α denotes the attention weight of the hidden layer for the current input. The calculation formulas are:

α_t = exp(e_t) / Σ_{j=1}^{n} exp(e_j)  (10)
e_t = u · tanh(w · h_t + b)  (11)
c_t = Σ_{t=1}^{n} α_t · h_t  (12)

where e_t is the attention probability assignment determined by the LSTM-layer output h_t at time t; u and w are weight coefficients; b is a bias term; c_t is the output of the Attention layer at time t.
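A minimal sketch of the feature-attention computation of Eqs. (10)-(12), with scalar hidden states and hypothetical coefficients u, w, b:

```python
import math

def attention(h, u=1.0, w=1.0, b=0.0):
    """Attention over hidden states h_1..h_n, following Eqs. (10)-(12)."""
    e = [u * math.tanh(w * h_t + b) for h_t in h]       # scores, Eq. (11)
    z = sum(math.exp(v) for v in e)
    alpha = [math.exp(v) / z for v in e]                # softmax weights, Eq. (10)
    c = sum(a * h_t for a, h_t in zip(alpha, h))        # weighted output, Eq. (12)
    return alpha, c
```

Because the weights are a softmax, they sum to 1, and the attention output c is a convex combination of the hidden states, biased toward the more informative ones.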
1.4 Bi-LSTM-Attention model
The Bi-LSTM-Attention model comprises an input layer, an embedding layer, a forward LSTM hidden layer, a reverse LSTM hidden layer, an Attention mechanism layer, a full connection layer and an output layer, and the structure of the Bi-LSTM-Attention model is shown in FIG. 4.
After the Bi-LSTM-Attention model receives the input, the time-series data are passed into the forward and backward LSTM hidden layers, whose outputs are combined into the processed vectors. The attention mechanism layer takes the data processed by the bidirectional LSTM as input, calculates the attention weights, and applies normalization. Finally, the weight vector is combined with the corresponding features at the current moment to obtain the feature-attention output.
1.5 WOAWC optimized Bi-LSTM-Attention model
The Whale Optimization Algorithm (WOA) is a novel swarm-intelligence optimization algorithm proposed by the Australian scholars Mirjalili et al. in 2016 [21]; it is a meta-heuristic algorithm that simulates the predation behavior of whale pods in nature. It has a simple principle, few parameters, and strong global search capability, and has been shown to outperform the PSO algorithm in solution accuracy and convergence speed for continuous-function optimization; however, the WOA algorithm still tends to fall into local optima and has low convergence accuracy. In this embodiment, whale positions are mutated through an improved Whale Optimization Algorithm (WOAWC) to improve the global search capability, and in addition, an adaptive weight method improves the local search capability of the algorithm.
Further, the WOAWC principle is as follows:
(1) When the whale optimization algorithm performs a global search of the population, one whale must be randomly selected as a reference so that the other whales move toward it. In the original WOA algorithm the reference whale is chosen at random, which affects the algorithm's search for the global optimal solution. In this embodiment, whale individuals are mutated using the Cauchy inverse cumulative distribution function, so that they mutate over a wider range.
The Cauchy inverse cumulative distribution function is:

F^{-1}(p) = x_0 + γ · tan(π(p − 1/2))  (13)
when the whale carries out Cauchi adverse cumulative distribution variation, local optimization can be carried out by adopting a spiral wandering mode, so that the blind variation of the whale is avoided, and the formula is as follows:
Figure BDA0003694446940000102
Figure BDA0003694446940000103
in the formula (I), the compound is shown in the specification,
Figure BDA0003694446940000104
is a vector of coefficients that is a function of,
Figure BDA0003694446940000105
is a vector of the position of the object,
Figure BDA0003694446940000106
is a whale individual position vector randomly selected from the current population.
Rewriting formulas (14) and (15) with the Cauchy mutation:

x_ij(t+1) = x_ij + x_ij · F^{-1}(r)  (16)
in the formula, F -1 Is the inverse cumulative distribution function of the Cauchy distribution, x ij Is j position points of the ith whale before mutation, and r belongs to [0,1 ]]Is uniformly distributed.
(2) During local search, the whale individual closest to the prey in the WOA encircling stage corresponds to the current local optimal solution, and the formulas by which the other individuals approach this optimum are:

D = |C ⊙ X*(t) − X(t)|  (17)
X(t+1) = X*(t) − A ⊙ D  (18)
the invention provides a method for changing the position of an optimal whale individual at the moment and improving the local optimizing capability of the whale by adopting a self-adaptive weight method, wherein the self-adaptive weight formula is as follows:
Figure BDA0003694446940000114
where t is the current iteration number and t_max is the maximum number of iterations. Introducing the adaptive weight into formula (18) gives:

X(t+1) = ω(t) ⊙ X*(t) − A ⊙ D  (20)
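The encircling update with the adaptive weight (Eqs. (17), (18), and (20)) can be sketched as below. The weight schedule ω(t) here is an assumption (a simple linear decay), since the exact form of Eq. (19) is not recoverable from the source, and A and C are treated as given scalars:

```python
def encircle_update(x, x_best, A=0.5, C=1.0, t=10, t_max=100):
    """Encircling-prey position update with adaptive weight, Eqs. (17)-(20).
    omega(t) is an assumed linear decay standing in for Eq. (19)."""
    omega = 1.0 - t / t_max                                   # assumed form of Eq. (19)
    D = [abs(C * xb - xi) for xb, xi in zip(x_best, x)]       # distance to optimum, Eq. (17)
    return [omega * xb - A * d for xb, d in zip(x_best, D)]   # weighted update, Eq. (20)
```

Early in the run ω is close to 1 and individuals track the current best closely; as ω shrinks, the pull toward the incumbent optimum weakens, refining the local search.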
in order to improve the performance of the model, the network structure and the optimization parameters need to be optimized, and reasonable parameters are set so that the model can be converged to the global minimum value quickly. In this embodiment, 6 parameters including a learning rate (L), a training number (N), a batch size (B), a first hidden layer node number (H1), a second hidden layer node number (H2), and a full connection layer node number (F) of the wlst-Attention network model are optimized through the WOAWC, and meanwhile, a parameter search range is set to be limited to L e [0.001,0.01], N e [10, 100], B e [16, 128], H1 e [1, 128], H2 e [1, 128], and F e [1, 100] to prevent the search space from being too large to affect the optimization efficiency, and finally, the optimized WOAWC-Bi-LSTM-Attention model is verified, and a flow chart thereof is shown in fig. 5, and includes the following steps:
First, the data are acquired and preprocessed, and the dataset is divided into a training set and a test set; the training set enters the WOAWC-Bi-LSTM-Attention model for training, and the test set enters the WOAWC-Bi-LSTM-Attention model for testing.
WOAWC encodes the initial values, calculates the fitness values and initializes the population, then performs the WOAWC population update, and updates the global optimal solution after recalculating the fitness values; if the stopping condition is met, the optimal network parameters are output, otherwise the procedure returns to the WOAWC population-update step.
After the fitness-calculation and population-initialization step, the input parameters enter the WOAWC-Bi-LSTM-Attention model; WOAWC decodes the input parameters to obtain the corresponding six hyper-parameter values, and the WOAWC-Bi-LSTM-Attention model returns the fitness value.
After the fitness-calculation and global-optimal-solution-update step, the input parameters enter the Bi-LSTM-Attention model, and the fitness value is returned by the WOAWC-Bi-LSTM-Attention model.
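The optimization flow above can be sketched as a simplified skeleton. The fitness function here is only a placeholder for the validation MSE of the trained Bi-LSTM-Attention model, and the position update drifts toward the best whale without the full WOA operators (encircling, spiral move, Cauchy mutation), which are omitted for brevity; all function names are illustrative.

```python
import random

# search ranges stated in the text for the six hyper-parameters
BOUNDS = {"L": (0.001, 0.01), "N": (10, 100), "B": (16, 128),
          "H1": (1, 128), "H2": (1, 128), "F": (1, 100)}

def decode(position):
    """Map a whale's 6-dimensional position in [0, 1] to the hyper-parameter ranges."""
    params = {}
    for p, (name, (lo, hi)) in zip(position, BOUNDS.items()):
        v = lo + p * (hi - lo)
        # only the learning rate stays real-valued; the rest are integer counts
        params[name] = v if name == "L" else int(round(v))
    return params

def fitness(params):
    # placeholder: in the method this would be the validation MSE of the
    # Bi-LSTM-Attention model trained with `params`
    return (params["L"] - 0.005) ** 2

def woawc_search(n_whales=5, iters=20, seed=42):
    """Simplified search skeleton: whales drift toward the best individual."""
    random.seed(seed)
    pop = [[random.random() for _ in range(6)] for _ in range(n_whales)]
    best = min(pop, key=lambda w: fitness(decode(w)))
    for _ in range(iters):
        for i, w in enumerate(pop):
            # move toward best with small noise, clipped back into [0, 1]
            pop[i] = [max(0.0, min(1.0, x + 0.5 * (bx - x) + 0.05 * (random.random() - 0.5)))
                      for x, bx in zip(w, best)]
        cand = min(pop, key=lambda w: fitness(decode(w)))
        if fitness(decode(cand)) < fitness(decode(best)):
            best = cand
    return decode(best)
```

The decode step is what lets a continuous optimizer drive discrete hyper-parameters: the whale lives in a unit hypercube, and each dimension is scaled and rounded into its stated range.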
1.6 Loss function
In the embodiment, the training process of the prediction model is optimized by using an Adam algorithm, and the loss function is a mean square error function, that is:
L = (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)²  (21)

where n is the number of samples, and y_i and ŷ_i are the true and predicted values at sample point i. In line with this study, load prediction is performed at 96 moments, so n = 96. The loss-function curve of each model's training process is shown in FIG. 6.
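A direct transcription of the loss of Eq. (21):

```python
def mse(y_true, y_pred):
    """Mean-square-error loss of Eq. (21); in the text n = 96 daily points."""
    n = len(y_true)
    return sum((y - yp) ** 2 for y, yp in zip(y_true, y_pred)) / n
```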
2. Example analysis
2.1 data Source and Pre-processing
In modern power systems, meteorological factors have an increasingly significant influence on system load. Considering meteorological factors has therefore become one of the main means by which dispatch centers further improve load-prediction accuracy.
In this embodiment, the proposed prediction method is verified on the full-year 2014 short-term load data of a certain region, taken from a publicly available data set; the data comprise time information, weather information, and load values. Each day is divided into 96 time points (one sample every 15 min). Modeling uses a rolling sequence: all values from day 1 to day n are input and the 96 load values of day n+1 are output; then all values from day 2 to day n+1 are input and the 96 load values of day n+2 are output; and so on, yielding a multi-input multi-output load prediction.
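The rolling-sequence construction can be sketched as follows. For simplicity this uses only the load values; in the actual data set each time point would also carry the weather features described above.

```python
import numpy as np

def make_rolling_windows(series, n_days, points_per_day=96):
    """Build (input, output) pairs: days 1..n -> day n+1, days 2..n+1 -> day n+2, ...
    `series` is a 1-D array of load values sampled every 15 min (96 per day)."""
    days = np.asarray(series, dtype=float).reshape(-1, points_per_day)
    X, y = [], []
    for start in range(len(days) - n_days):
        X.append(days[start:start + n_days].ravel())  # n consecutive days as input
        y.append(days[start + n_days])                # the following day as target
    return np.array(X), np.array(y)

# Toy series: 5 days of 96 samples each, window of 3 days.
series = np.arange(5 * 96, dtype=float)
X, y = make_rolling_windows(series, n_days=3)
```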
Different quantities generally have different dimensions and units, which can distort the results of data analysis. To eliminate this difference and put all quantities on the same order of magnitude before training and verification, the data are normalized using the Min-Max method.
$$x^{*} = \frac{2\,(x - x_{\min})}{x_{\max} - x_{\min}} - 1$$

where x is the raw data, x* the normalized data, and x_min and x_max are the minimum and maximum of the data, respectively. The normalized data are mapped into the interval [-1, 1].
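A minimal numpy sketch of this normalization and its inverse (the inverse is needed to report predictions back in original load units):

```python
import numpy as np

def minmax_scale(x):
    """Min-Max normalization mapping raw data into [-1, 1], as in the text."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return 2 * (x - x_min) / (x_max - x_min) - 1

def minmax_inverse(x_scaled, x_min, x_max):
    """Undo the scaling to recover values in the original units."""
    return (np.asarray(x_scaled, dtype=float) + 1) / 2 * (x_max - x_min) + x_min

s = minmax_scale([10.0, 20.0, 30.0])
```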
2.2 data validation
For experimental rigor, the prediction method is verified against the power-load data. Fig. 7 shows the load trend over one week, with each day divided into 96 time points. As the figure shows, the load data fluctuate with a certain frequency and the series as a whole is periodic, so choosing the LSTM method is reasonable.
Compared with a traditional LSTM neural network, a Bi-LSTM neural network considers the internal regularities of forward and backward data simultaneously, performing prediction from both the historical and the future direction [22]. Load prediction therefore accounts for the influence of both historical and future loads on prediction accuracy.
To verify that the load data carry a bidirectional information flow, one month of load data is selected from the data set and split about its center into a forward and a reversed load sequence, and the autocorrelation coefficients of the two sequences are computed separately. As fig. 8 shows, the load time series exhibits clear forward and reverse regularities.
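This check can be sketched with a plain sample-autocorrelation function. The toy daily-periodic load below is an assumption standing in for the real month of data; with real data both halves would likewise show a strong peak at the daily lag of 96.

```python
import numpy as np

def autocorr(x, max_lag):
    """Sample autocorrelation coefficients r_1 .. r_max_lag of a 1-D series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.sum(x * x)
    return np.array([np.sum(x[:-k] * x[k:]) / denom for k in range(1, max_lag + 1)])

# Toy load: 30 days of 96 points/day with a daily sinusoidal pattern.
t = np.arange(30 * 96)
load = 100 + 20 * np.sin(2 * np.pi * t / 96)

mid = len(load) // 2
forward = autocorr(load[:mid], 96)          # first half, forward in time
reverse = autocorr(load[mid:][::-1], 96)    # second half, reversed in time
```

Both sequences show a strong autocorrelation at lag 96 (one day), which is the kind of forward/reverse regularity fig. 8 illustrates.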
2.3 evaluation index
To evaluate the performance of the prediction model, this embodiment uses the mean absolute percentage error (MAPE), mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination R², whose expressions are:

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$

where n is the total number of samples, and y_i and ŷ_i are the true and predicted values of the i-th sample point.
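The four indexes above translate directly into numpy:

```python
import numpy as np

def evaluate(y_true, y_pred):
    """MAPE (%), MAE, RMSE and R^2 as defined in the text."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mape = 100.0 * np.mean(np.abs(err / y_true))
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return {"MAPE": float(mape), "MAE": mae, "RMSE": rmse, "R2": float(r2)}

m = evaluate([100.0, 200.0, 300.0], [110.0, 190.0, 300.0])
```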
2.4 prediction results and comparative analysis
The experimental computer runs the Windows 10 operating system with an NVIDIA GeForce RTX 2070S 8 GB GPU. Programming was done in Python 3.7 with TensorFlow 2.6.0 and Keras 2.2.4.
The model training set is the input data of the first 364 days of 2014, and the test set is the load data of the last two days. As shown in fig. 9, the fitness curve of the WOAWC-Bi-LSTM-Attention model, optimized iteratively by the improved whale optimization algorithm (WOAWC), finally stabilizes at 0.0027. The iteration of the six hyper-parameters is shown in fig. 10, and their final stable values are listed in Table 1. In the comparative experiment of this embodiment, the BPNN, LSTM, BiLSTM, and BiLSTM-Attention models are also used for load prediction; the prediction results are shown in fig. 11. Fig. 12 compares the results of the network model optimized by the original WOA with those of the network model optimized by WOAWC.
The results in fig. 11 show that the load values produced after WOAWC optimizes the hyper-parameters of the BiLSTM-Attention network fit best, with predictions closest to the true values. The prediction-performance indexes of each model are listed in Table 2.
Table 1: result of parameter optimization
Parameter(s) Best results
Learning rate 0.00552
Number of training sessions 98
Batch size 40
Number of nodes of first hidden layer 100
Number of nodes of second hidden layer 74
Number of full connection layers 61
Table 2: comparison of prediction errors of different models
As Table 2 shows, the LSTM-based prediction models outperform the BP model on time-series prediction, and the bidirectional LSTM model outperforms the unidirectional LSTM model, indicating that BiLSTM better discovers features in the sequence. With Attention added, the Bi-LSTM model's MAPE, RMSE, and MAE errors drop by 2.73%, 5.7%, and 12.42%, respectively, compared with the plain Bi-LSTM model, showing that Attention's mining of the different feature contributions improves prediction. After the improved whale optimization algorithm optimizes the six hyper-parameters of the network model, prediction performance improves further, with the R² value exceeding 0.99.
3. Conclusion
To meet the ever-increasing accuracy requirements of short-term power load prediction, this embodiment proposes a short-term power load prediction model based on WOAWC-optimized Bi-LSTM-Attention. Experimental verification supports the following conclusions:
1) Before verifying the model, the load data were first subjected to periodicity analysis and bidirectional-information-flow verification, showing that the use of LSTM is reasonable and that the data at the current moment are influenced by both past and future data. The data were then normalized, and evaluation indexes for the model were defined.
2) After the Bi-LSTM model was constructed and Attention introduced, the experimental results verified that the bidirectional network and the attention mechanism both have a positive effect on power-load prediction accuracy.
3) In the WOAWC-Bi-LSTM-Attention model, to address the difficulty of selecting the network hyper-parameters, the improved whale optimization algorithm finds a set of hyper-parameters that minimizes the mean square error of the Bi-LSTM-Attention model. The experimental results show that every error index of the optimized model decreases relative to the unoptimized model, and its coefficient of determination is closest to 1.
Future research could consider the influence of more complex input features, such as date type and load characteristics, on the power load, and could study and improve different intelligent algorithms to analyze and compare model performance, further increasing the accuracy and generality of short-term power load prediction.

Claims (2)

1. A short-term power load prediction method based on a Bi-LSTM-Attention model, characterized in that time-series historical load data and weather information data are used as input; a Bi-LSTM neural network model performs bidirectional cyclic training to learn the forward and reverse regularities of the load data; an attention mechanism is introduced on top of this model, assigning weights to different features to highlight their importance to the prediction model; meanwhile, for the Bi-LSTM-Attention model, optimized selection of the model hyper-parameters is realized through an improved whale optimization algorithm, further improving the performance of the prediction model; in addition, the local optimization capability of the algorithm is improved through an adaptive weight method.
2. The method of claim 1, wherein the Bi-LSTM neural network model comprises an input layer, an embedding layer, a forward LSTM hidden layer, a backward LSTM hidden layer, an attention mechanism layer, a fully connected layer, and an output layer; after the Bi-LSTM neural network model receives input information, the time-series data are passed into the forward and backward LSTM hidden layers, whose outputs are combined into the processed vectors; the attention mechanism layer takes the data processed by the bidirectional LSTM as input, computes the attention weights of the data, applies normalization, and finally combines the weight vectors with the corresponding features at the current moment to obtain the feature-attention output.
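The attention step of claim 2 can be sketched in numpy. This is an illustration under assumptions: the additive (tanh) score function, the parameter shapes W and v, and the layer sizes below are not specified by the patent.

```python
import numpy as np

def attention_pool(H, W, v):
    """Attention over Bi-LSTM hidden states, as described in claim 2.
    H: (T, d) matrix of concatenated forward/backward hidden states.
    W: (d, d_a) and v: (d_a,) are learned parameters (random here).
    Returns the attention-weighted context vector and the weights."""
    scores = np.tanh(H @ W) @ v            # alignment score for each time step
    alpha = np.exp(scores - scores.max())  # softmax normalization of the weights
    alpha = alpha / alpha.sum()
    context = alpha @ H                    # weighted combination of the features
    return context, alpha

rng = np.random.default_rng(0)
T, d, d_a = 96, 128, 64                    # 96 time steps; d = 2 x 64 Bi-LSTM units (assumed)
H = rng.standard_normal((T, d))
context, alpha = attention_pool(H, 0.1 * rng.standard_normal((d, d_a)),
                                rng.standard_normal(d_a))
```

The context vector would then be passed to the fully connected layer to produce the 96 output load values.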
CN202210675542.3A 2022-06-15 2022-06-15 Bi-LSTM-Attention short-term power load prediction method Pending CN115238952A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210675542.3A CN115238952A (en) 2022-06-15 2022-06-15 Bi-LSTM-Attention short-term power load prediction method


Publications (1)

Publication Number Publication Date
CN115238952A true CN115238952A (en) 2022-10-25

Family

ID=83670270

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210675542.3A Pending CN115238952A (en) 2022-06-15 2022-06-15 Bi-LSTM-Attention short-term power load prediction method

Country Status (1)

Country Link
CN (1) CN115238952A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117350158A (en) * 2023-10-13 2024-01-05 湖北华中电力科技开发有限责任公司 Electric power short-term load prediction method by mixing RetNet and AM-BiLSTM algorithm



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination