CN117977568A - Power load prediction method based on nested LSTM and quantile calculation - Google Patents

Publication number
CN117977568A
CN117977568A CN202410049336.0A CN202410049336A CN117977568A CN 117977568 A CN117977568 A CN 117977568A CN 202410049336 A CN202410049336 A CN 202410049336A CN 117977568 A CN117977568 A CN 117977568A
Authority
CN
China
Prior art keywords
quantile
lstm
nested
model
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410049336.0A
Other languages
Chinese (zh)
Inventor
李丹
张远航
孙光帆
杨保华
王奇
缪书唯
李振兴
刘颂凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Three Gorges University CTGU
Original Assignee
China Three Gorges University CTGU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Three Gorges University CTGU
Priority to CN202410049336.0A
Publication of CN117977568A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • General Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Tourism & Hospitality (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Primary Health Care (AREA)
  • Development Economics (AREA)
  • Water Supply & Treatment (AREA)
  • Public Health (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a power load prediction method based on nested LSTM and quantile calculation. The method comprises: collecting load power and influencing-factor data for a plurality of sample days to form a data set; establishing a nested LSTM model and pre-training each quantile LSTM in the model to obtain a set of weight and bias parameters; training the nested LSTM model as a whole, fine-tuning the weights and biases during training to determine the optimal weights and biases of the model; inputting the validation set into the trained nested LSTM model and selecting the optimal hyperparameters of the model according to the validation error; and inputting the test samples into the nested LSTM model with the optimal hyperparameters and inverse-normalizing the predictions it outputs. By using the nested LSTM model for quantile regression prediction of the power load, the invention makes the predicted load probability distribution more reasonable and avoids crossing between quantile predictions.

Description

Power load prediction method based on nested LSTM and quantile calculation
Technical Field
The invention belongs to the field of power load prediction, and particularly relates to a power load prediction method based on nested LSTM and quantile calculation.
Background
Short-term power load prediction is the basis of safe and economic operation of a power system, and provides important information for power system planning and operation, energy transaction, unit start-stop, economic dispatch and the like. The improvement of the accuracy of load prediction is beneficial to improving the utilization rate of power equipment and reducing the energy waste to the greatest extent.
At present, load probability prediction methods mainly include interval estimation, kernel density estimation and quantile regression. The first two estimate the probability distribution mainly from parametric statistics of point-prediction errors, whereas quantile regression directly describes the relationship between the response variable and the explanatory variables at different quantiles, and has therefore become a focus of the load probability prediction literature in recent years. However, the quantile predictions produced by quantile regression can cross one another, which makes the results unreasonable.
Load probability prediction methods typically combine a machine learning algorithm with quantile regression to construct a quantile model. However, conventional machine learning algorithms often require feature engineering to process the data. Deep learning neural networks have proven more efficient than traditional machine learning methods for short-term load prediction on large data sets. In particular, long short-term memory (LSTM) neural networks, as shown in fig. 2, have been widely used because they adapt well to time-series data.
Therefore, a short-term power load probability prediction method based on quantile regression with a nested LSTM neural network is studied.
Disclosure of Invention
The technical problem addressed by the invention is that the quantile predictions of existing quantile regression methods for power load can cross one another, which makes the results unreasonable.
The invention aims to solve this problem by providing a power load prediction method based on nested LSTM and quantile calculation. The method combines the robustness and memory characteristics of LSTM with the probabilistic prediction capability of quantile regression, takes into account the inherent properties of the quantiles of the predicted load probability, adds a combination layer that accounts for the constraint relationship between quantile predictions, and constructs a nested LSTM, namely a constrained parallel long short-term memory network model (constrained parallel Long-Short Term Memory, CP-LSTM), so that the predicted load probability distribution is more reasonable and crossing between quantile predictions is avoided.
The technical solution of the invention is a power load prediction method based on nested LSTM and quantile calculation, which comprises the following steps:
Step 1: collecting load power and influencing-factor data for a plurality of sample days, forming a data set, and dividing it into a training set, a validation set and a test set;
Step 2: establishing a nested LSTM model and setting the model hyperparameters; pre-training the parallel LSTM at each quantile in the nested LSTM model with a parallel training method to obtain the global parameter set $\{W(\tau_i), b(\tau_i)\}_{opt}$;
Step 3: taking the obtained global parameter set $\{W(\tau_i), b(\tau_i)\}_{opt}$ as the initial parameters of the nested LSTM model, training the nested LSTM model as a whole, and fine-tuning the weights and biases during training to determine the optimal weights and biases of the nested LSTM model;
Step 4: inputting the validation set into the trained nested LSTM model, and selecting the optimal hyperparameters of the model according to the validation error;
Step 5: inputting the test samples into the nested LSTM model with the optimal hyperparameters, and inverse-normalizing the predictions output by the model to obtain a plurality of quantile predictions of the load at each moment of the forecast day;
Step 6: calculating the probability density curve of the predicted point from the plurality of predicted-load quantiles obtained in step 5.
Preferably, step 1 further comprises normalizing each type of data in the data set so that every variable lies in the interval [-1, 1].
Specifically, for each sample day, 96 load power points are collected at 15-minute intervals between 00:00 and 24:00. The 96-point load power of the day before the forecast day, together with the 24 hourly air temperatures and the regional rainfall of the forecast day, form the multidimensional feature input vector, and the 96-point load quantiles of the forecast day form the output vector. The input variable is $X_d=[T_d, R_d]$, with air temperature $T_d=[T_1, T_2, \dots, T_{24}]_d$, where $T_i$, $i\in\{1,2,\dots,24\}$, is the air temperature measured at hour $i$, and rainfall $R_d=[R_1, R_2, \dots, R_M]_d$, where $R_j$, $j\in\{1,2,\dots,M\}$, is the rainfall of the $j$-th sub-area of the prediction area; $d\in\{1,2,\dots,D\}$, $D$ is the total number of historical sample days, and $M$ is the number of sub-areas contained in the prediction area.
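As a rough illustration of how one day's input vector could be assembled, the sketch below concatenates the previous day's 96 load points with the 24 hourly temperatures and the M sub-area rainfall values. The function name `build_input`, the ordering of the three blocks and the sample numbers are assumptions made for illustration; the text does not specify a layout.

```python
def build_input(prev_day_load, temperature, rainfall):
    """Concatenate one day's features into a flat input vector.

    prev_day_load: 96 load values (15-minute resolution) of the previous day
    temperature:   24 hourly air temperatures T_1..T_24
    rainfall:      M rainfall values R_1..R_M, one per sub-area
    """
    if len(prev_day_load) != 96:
        raise ValueError("expected 96 load points")
    if len(temperature) != 24:
        raise ValueError("expected 24 hourly temperatures")
    # The ordering [load, T_d, R_d] is an assumption, not taken from the text.
    return list(prev_day_load) + list(temperature) + list(rainfall)


# Made-up numbers: flat load, flat temperature, M = 3 sub-areas.
x_d = build_input([300.0] * 96, [18.5] * 24, [0.0, 1.2, 0.4])
print(len(x_d))  # 96 + 24 + 3 features
```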
In step 2, the model hyperparameters include the number of neurons $m$, the sample time window length $l$, the number of computing nodes $n$, and the penalty parameters $\lambda_1$ and $\lambda_2$.
Preferably, the parallel training is implemented through GPU distributed computing. The training set is divided equally into a plurality of subsets that are distributed to the nodes of the computing system, and each computing node processes a different subset, which reduces the total training time of the neural network. The parameter sets obtained by training on each node are used to calculate a new global weight set with a gradient descent formula, and this set is then redistributed to every node of the computing system. The formula is as follows:
where $Z_\varphi=\{W, b\}^{(\varphi)}$ is the global parameter set obtained by the $\varphi$-th training iteration, $\Delta Z_{\varphi,j}$ is the parameter gradient of the $j$-th computing node obtained by the $\varphi$-th training iteration, $n$ is the total number of computing nodes, and the remaining coefficient in the formula is a scaling factor.
In step 3, the weights and biases are fine-tuned with a gradient descent algorithm according to the loss function.
Preferably, the probability density curve of the predicted point is calculated by Gaussian kernel density estimation.
Preferably, step 1 divides the data set into a training set, a validation set and a test set in a ratio of 8:1:1.
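A minimal sketch of the 8:1:1 split, assuming the sample days are kept in chronological order and cut with integer arithmetic. The text does not say whether the split is chronological or random, and `split_811` is an illustrative name.

```python
def split_811(samples):
    """Split a sequence into training, validation and test parts (8:1:1)."""
    n = len(samples)
    n_train = n * 8 // 10
    n_val = n // 10
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


# 547 sample days, the number used in the embodiment.
days = list(range(547))
train, val, test = split_811(days)
print(len(train), len(val), len(test))
```

With 547 days the three parts cannot be exactly proportional, so the leftover days are assigned to the test set here; that choice is also an assumption.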
Preferably, the prediction result in step 5 is evaluated with an index that accounts for the quantile constraint relationship, in order to assess quantile crossing. By the inherent property of quantiles, the quantile predictions at time $t$ should satisfy the condition
$$\hat{y}_t^{\tau_i} \le \hat{y}_t^{\tau_{i+1}} \quad \text{for } \tau_i < \tau_{i+1}$$
The index that accounts for the quantile constraint relationship is as follows:
$$X_{CS} = \sqrt{\frac{2\theta}{N}\sum_{t=1}^{N}\sum_{i} v_{t,i}^{2}}, \qquad v_{t,i} = \max\left(0,\ \hat{y}_t^{\tau_i} - \hat{y}_t^{\tau_{i+1}}\right)$$
where $X_{CS}$ is the evaluation index value of the quantile constraint relationship; $\hat{y}_t^{\tau_i}$ is the predicted value at time $t$ under quantile $\tau_i$; $N$ is the total number of test moments; $v_{t,i}$ is the constraint violation function; and $\theta=\tau_{i+1}-\tau_i$ is the step between quantiles, which is a constant. $v_{t,i}$ is 0 when adjacent quantiles satisfy the constraint, and equals the positive difference between adjacent quantiles when the constraint is violated, reflecting the degree of violation. The coefficient $2\theta/N$ normalizes the squared quantile constraint errors, so that $X_{CS}$ is the normalized root mean square of $v_{t,i}$ over all test samples and all adjacent quantile pairs. $X_{CS}$ thus quantifies the degree of quantile crossing.
The lower the probability prediction evaluation indexes $X_{QS}$ and $X_{CS}$ both are, the better the predicted quantiles perform; the two are combined into the comprehensive probability prediction evaluation index $X_{QCS}$:
Compared with the prior art, the invention has the beneficial effects that:
1) The invention uses the nested LSTM model for quantile regression prediction of the power load, which makes the predicted load probability distribution more reasonable and avoids crossing between quantile predictions.
2) Each quantile LSTM in the nested LSTM model is pre-trained with the parallel training method, and the resulting weight and bias parameter set serves as the initial parameters of the nested LSTM model. The model is then trained as a whole and its weights and biases are fine-tuned to obtain the optimal weights and biases, so that the model predicts more efficiently and yields accurate point prediction results.
3) The evaluation index considering the quantile constraint relation provided by the invention can be used for evaluating the crossing condition of quantiles.
Drawings
The invention is further described below with reference to the drawings and examples.
Fig. 1 is a flowchart of a power load probability prediction method according to an embodiment.
Fig. 2 is a schematic view of LSTM structure.
Fig. 3 is a schematic structural diagram of a nested LSTM model of an embodiment.
FIG. 4 is a schematic diagram of parallel training of an embodiment.
FIG. 5 is a schematic diagram of a training process for a parallel LSTM.
FIG. 6 compares the evaluation index $X_{CS}$ on each test-set sample day for the different prediction models in the embodiment.
Detailed Description
As shown in fig. 1, the power load prediction method based on nested LSTM and quantile calculation comprises the following steps.
Step 1: Load data, air temperature data and rainfall data at 15-minute intervals from January 1, 2016 to June 30, 2017 in an actual area are collected to form a data set, which is divided into a training set, a validation set and a test set in the ratio 8:1:1. The input variable is $X_d=[T_d, R_d]$, comprising the air temperature at the 24 hours of the day, $T_d=[T_1, T_2, \dots, T_{24}]_d$, and the rainfall of the $M$ sub-areas, $R_d=[R_1, R_2, \dots, R_M]_d$. Because the different data types differ widely in magnitude, each is normalized into [-1, 1]. Let $x_i'$ denote the input sample after normalization, $x_i$ the sample before normalization, $x_{max}$ and $x_{min}$ the maximum and minimum sample values, and $N$ the number of samples. The processing formula is as follows:
$$x_i' = \frac{2\,(x_i - x_{min})}{x_{max} - x_{min}} - 1, \qquad i = 1, 2, \dots, N$$
step 2: a nested LSTM model is built, as shown in FIG. 3, which includes an input layer, a hidden layer, an output layer, and a regression layer, the hidden layer including a plurality of quantile long and short term memory network models (Quantile Long-Short Term Memory, Q-LSTM).
Setting model super parameters, including the number m of neurons, the length l of a sample time window, the number n of calculation nodes and penalty parameters lambda 1、λ2; in the embodiment, the value of m is 200, the value of the time window length l is 6, the value of lambda 1 is 1, the value of lambda 2 is 20, and the total sample day is 547 days.
The parallel training method is adopted to pretrain parallel LSTM under each sub-point in the nested LSTM model, the training set is divided into n equal subsets, and the corresponding n computing nodes are utilized to train the network in parallel;
As shown in fig. 4, the data parallel training of the neural network is implemented through GPU distributed computing, the training set is equally divided into a plurality of subsets, and distributed to each node of the computing system, each computing node is responsible for processing a different subset of the data set, so as to reduce the total time of training the neural network, each node trains the data subset thereof to obtain a set of model parameters, the parameter set obtained by training each node calculates a new global weight set by using a gradient descent formula, and then distributed to each node of the computing system, and the formula is as follows:
Wherein Z φ={W, b}(φ) is a global parameter set obtained by phi-th iterative training, delta Z φ,j is a parameter gradient of the j-th computing node obtained by phi-th iterative training, n is the total number of computing nodes, For scaling factors, the learning rate is similar.
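A minimal sketch of one plausible form of this global update, synchronous gradient averaging: $Z_{\varphi+1} = Z_\varphi - \frac{\gamma}{n}\sum_{j=1}^{n}\Delta Z_{\varphi,j}$. The symbol $\gamma$ (`scale` in the code), the minus sign, the flat-list parameter layout and the function name are assumptions, not the patent's exact notation.

```python
def global_update(z, node_grads, scale):
    """Combine per-node gradients into the next global parameter set.

    z:          current global parameters Z_phi as a flat list
    node_grads: one gradient list per computing node (Delta Z_{phi,j})
    scale:      scaling factor, playing the role of a learning rate
    """
    n = len(node_grads)
    avg = [sum(g[k] for g in node_grads) / n for k in range(len(z))]
    return [z_k - scale * g_k for z_k, g_k in zip(z, avg)]


# Two computing nodes, two parameters, made-up gradients.
z = [0.5, -1.0]
grads = [[0.2, 0.0], [0.4, -0.2]]
z_next = global_update(z, grads, scale=0.1)
print(z_next)
```

After the update, `z_next` would be broadcast back to every node before the next iteration.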
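As shown in fig. 5, the Q-LSTM model trained separately on each node is trained as follows:
(1) Input the initial weights $W_0(\tau_i)$ and initial biases $b_0(\tau_i)$;
(2) Compute the current-iteration values $i_t$, $f_t$, $o_t$, $\tilde{C}_t$, $C_t$ and $h_t$ of the LSTM input gate, forget gate, output gate, candidate memory cell, new memory state and hidden-layer state. The calculation process is as follows:
Given the current input $x_t$ and the hidden-layer state $h_{t-1}$ and memory state $C_{t-1}$ of the previous time step, the detailed calculation is:
$$\begin{aligned}
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) \\
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) \\
\tilde{C}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \\
h_t &= o_t \odot \tanh(C_t)
\end{aligned}$$
where $W_i$, $W_f$, $W_o$, $W_c$ are the corresponding weight matrices and $b_i$, $b_f$, $b_o$, $b_c$ are the corresponding bias vectors; $\sigma(\cdot)$ and $\tanh(\cdot)$ are the sigmoid and hyperbolic tangent activation functions, respectively. The final output $\hat{y}_t$ of the output layer is calculated from the hidden-layer state $h_t$:
$$\hat{y}_t = W_S\, h_t + b_S$$
where $W_S$ is the hidden-layer-to-output-layer connection weight matrix and $b_S$ is the corresponding bias vector.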
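The forward pass of one Q-LSTM time step can be sketched in plain Python as follows. The sketch assumes the common formulation in which each gate applies its weight matrix to the concatenation $[h_{t-1}, x_t]$ and the output layer is linear; the dictionary keys, helper names and toy parameter values are illustrative.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def affine(W, v, b):
    """Return W v + b for a list-of-rows matrix W and list vectors v, b."""
    return [sum(w * x for w, x in zip(row, v)) + b_k for row, b_k in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step followed by the linear output layer."""
    z = h_prev + x_t  # list concatenation [h_{t-1}, x_t]
    i_t = [sigmoid(a) for a in affine(p["W_i"], z, p["b_i"])]        # input gate
    f_t = [sigmoid(a) for a in affine(p["W_f"], z, p["b_f"])]        # forget gate
    o_t = [sigmoid(a) for a in affine(p["W_o"], z, p["b_o"])]        # output gate
    c_cand = [math.tanh(a) for a in affine(p["W_c"], z, p["b_c"])]   # candidate cell
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_cand)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    y_t = affine(p["W_S"], h_t, p["b_S"])                            # output layer
    return y_t, h_t, c_t


# Toy parameters: one hidden unit, one input feature, all gate weights zero,
# so every gate evaluates to sigmoid(0) = 0.5 and the candidate cell to 0.
zero_w, zero_b = [[0.0, 0.0]], [0.0]
params = {"W_i": zero_w, "b_i": zero_b, "W_f": zero_w, "b_f": zero_b,
          "W_o": zero_w, "b_o": zero_b, "W_c": zero_w, "b_c": zero_b,
          "W_S": [[2.0]], "b_S": [0.1]}
y_t, h_t, c_t = lstm_step([0.3], [0.0], [1.0], params)
```

In practice a deep learning framework would supply the cell and its gradients; the point of the sketch is only to make the data flow between the six quantities explicit.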
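(3) Based on the loss function, the gradient descent method is used to compute the gradient of the loss with respect to the hidden-layer state and the gradient of the loss with respect to the memory state, from which the gradient of each weight and bias is calculated. The loss function is as follows:
where
$W(\tau_i)=\{W_f(\tau_i), W_i(\tau_i), W_c(\tau_i), W_o(\tau_i), W_S(\tau_i)\}$ and $b(\tau_i)=\{b_f(\tau_i), b_i(\tau_i), b_c(\tau_i), b_o(\tau_i), b_S(\tau_i)\}$
are, respectively, the set of all weight matrices and the set of all bias vectors of the LSTM neural network at quantile $\tau_i$; $\lambda_1$ is the regularization penalty parameter, which prevents the model from overfitting during training; and $\rho_{\tau_i}(a)$ is the check function, defined as:
$$\rho_{\tau_i}(a) = \begin{cases} \tau_i\, a, & a \ge 0 \\ (\tau_i - 1)\, a, & a < 0 \end{cases}$$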
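A short Python sketch of the check (pinball) function and of a per-quantile training loss built from it. The check function is the standard one from quantile regression; the L2 form of the $\lambda_1$ regularization term, the averaging over samples and the function names are assumptions made for illustration.

```python
def check(a, tau):
    """Check (pinball) function rho_tau(a) for a residual a = y - y_hat."""
    return tau * a if a >= 0 else (tau - 1.0) * a


def q_lstm_loss(y_true, y_pred, tau, weights, lam1):
    """Mean pinball loss at quantile tau plus an assumed L2 weight penalty."""
    pinball = sum(check(y - p, tau) for y, p in zip(y_true, y_pred)) / len(y_true)
    l2 = sum(w * w for w in weights)  # assumed form of the regularizer
    return pinball + lam1 * l2


# At a high quantile, under-prediction (positive residual) costs more than
# over-prediction of the same size.
under = check(2.0, 0.9)
over = check(-2.0, 0.9)
loss = q_lstm_loss([1.0, 3.0], [2.0, 2.0], 0.5, [1.0, -1.0], 0.1)
```

The asymmetry of `check` is what pulls the network output toward the $\tau$-quantile rather than the mean.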
Two gradient functions are defined as follows:
where the first is the derivative of the loss function with respect to the hidden-layer state $h_t$, and the second is the derivative of the loss function with respect to the memory state $C_t$.
1) The gradients of the hidden-layer-to-output-layer parameters are:
where the two terms are the derivatives with respect to the hidden-to-output connection weight matrix $W_S$ and with respect to the bias vector $b_S$, respectively.
2) From the two gradient functions, the gradients of the parameters of the forget gate, input gate, candidate memory cell and output gate are calculated respectively;
(4) Update the weights and biases as follows:
where $\eta$ is the learning rate, and $W_*$ and $b_*$ denote the corresponding weight matrix and bias vector, respectively.
Steps (2) to (4) are repeated until the convergence condition is reached, yielding the optimal model parameters $\{W(\tau_i), b(\tau_i)\}_{opt}$.
Step 3: the obtained weight and bias parameter set { W (tau i), b(τi)}opt is taken as the initial parameter of a nested LSTM model, the nested LSTM model is integrally trained, the { W (tau i),b(τi)}r is finely tuned, the optimal weight and bias parameter of the CP-LSTM short-term load probability prediction model are determined), in order to obtain the optimal parameter of the nested LSTM model, a gradient descent method is adopted to search the model parameter { W (tau i),b(τi)}opt; the training method of the nested LSTM model is consistent with the Q-LSTM training method, only the loss function and the gradient are different, and the loss function of the nested LSTM model is searched based on the training sample setThe following are provided:
Wherein the method comprises the steps of ,/>To violate the penalty parameters of the constraint, the corresponding gradient/>、/>And/>The phase change is as follows:
The elements in the vector u i are respectively:
the gradient calculation of the forgetting gate, the input gate, the storage unit, the candidate storage unit and the output gate parameters is the same as the calculation mode in the step 3.
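To illustrate how a constraint term can discourage quantile crossing during the joint training, the sketch below adds a squared-hinge penalty on adjacent quantile predictions, weighted by $\lambda_2$, to the summed pinball losses. The squared hinge, the averaging over time steps and all function names are assumptions; the exact CP-LSTM loss may differ.

```python
def check(a, tau):
    """Check (pinball) function for a residual a at quantile level tau."""
    return tau * a if a >= 0 else (tau - 1.0) * a


def crossing_penalty(preds):
    """Sum of squared crossings; preds[i][t] is the i-th quantile at time t."""
    total = 0.0
    for lower, upper in zip(preds, preds[1:]):
        for lo, up in zip(lower, upper):
            total += max(0.0, lo - up) ** 2  # positive only when quantiles cross
    return total


def cp_loss(y_true, preds, taus, lam2):
    """Pinball losses of all quantiles plus the weighted crossing penalty."""
    n = len(y_true)
    pinball = sum(check(y - p, tau)
                  for tau, row in zip(taus, preds)
                  for y, p in zip(y_true, row)) / n
    return pinball + lam2 * crossing_penalty(preds) / n


taus = [0.25, 0.5, 0.75]
y = [1.5, 1.5]
crossed = [[1.0, 2.0], [1.5, 1.5], [2.0, 1.0]]   # second time step is inverted
ordered = [[1.0, 1.0], [1.5, 1.5], [2.0, 2.0]]
print(crossing_penalty(crossed), crossing_penalty(ordered))
```

Because the penalty is zero whenever the quantiles are ordered, it leaves a well-behaved set of Q-LSTM outputs untouched and only adds gradient where crossings occur.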
Step 4: The validation set is input into the nested LSTM model trained in step 3, and the optimal hyperparameters are selected according to the validation error.
In the embodiment, 10% of the 547 days of sample data is used for validation, and the optimal hyperparameters are chosen according to the error between the final output and the true values.
Step 5: The test samples are input into the nested LSTM model with the optimal hyperparameters to obtain the output, which is restored to its original scale, i.e. inverse-normalized; finally, the predicted data are compared with the actual results. Because quantile predictions should satisfy the quantile constraint, the invention proposes, on the basis of the common probability prediction evaluation index Quantile Score (QS), an evaluation index Constraint Score (CS) that accounts for the quantile constraint relationship. By the inherent property of quantiles, the quantile predictions at time $t$ should satisfy $\hat{y}_t^{\tau_i} \le \hat{y}_t^{\tau_{i+1}}$ for $\tau_i < \tau_{i+1}$. Accordingly, the index that accounts for the quantile constraint relationship is as follows:
$$X_{CS} = \sqrt{\frac{2\theta}{N}\sum_{t=1}^{N}\sum_{i} v_{t,i}^{2}}, \qquad v_{t,i} = \max\left(0,\ \hat{y}_t^{\tau_i} - \hat{y}_t^{\tau_{i+1}}\right)$$
where $X_{CS}$ is the evaluation index value of the quantile constraint relationship; $\hat{y}_t^{\tau_i}$ is the predicted value at time $t$ under quantile $\tau_i$; $N$ is the total number of test moments; $v_{t,i}$ is the constraint violation function; and $\theta=\tau_{i+1}-\tau_i$ is the step between quantiles, which is a constant. $v_{t,i}$ is 0 when adjacent quantiles satisfy the constraint, and equals the positive difference between adjacent quantiles when the constraint is violated, reflecting the degree of violation. The coefficient $2\theta/N$ normalizes the squared quantile constraint errors, so that $X_{CS}$ is the normalized root mean square of $v_{t,i}$ over all test samples and all adjacent quantile pairs. $X_{CS}$ thus quantifies the degree of quantile crossing.
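A small Python sketch of the Constraint Score as described in the text: the root of $2\theta/N$ times the sum of squared violations $v_{t,i}$ over all test moments and adjacent quantile pairs. The data layout `preds[t][i]` and the toy numbers are assumptions for illustration.

```python
import math


def constraint_score(preds, theta):
    """X_CS for predictions preds[t][i], quantiles in ascending order of tau."""
    n = len(preds)  # number of test moments N
    sq_sum = 0.0
    for row in preds:
        for lower_q, upper_q in zip(row, row[1:]):
            v = max(0.0, lower_q - upper_q)  # v_{t,i}: zero unless crossed
            sq_sum += v * v
    return math.sqrt(2.0 * theta / n * sq_sum)


# Three quantiles with step theta = 0.25; the second moment has one crossing.
preds = [[1.0, 2.0, 3.0], [2.0, 1.0, 3.0]]
x_cs = constraint_score(preds, theta=0.25)
print(x_cs)
```

A score of zero means no adjacent quantiles cross anywhere in the test set.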
The lower $X_{QS}$ and $X_{CS}$ both are, the better the predicted quantiles perform; the two are combined into the comprehensive evaluation index $X_{QCS}$:
Furthermore, the reliability index of the prediction interval (PI), namely the PI coverage probability (PICP) with its deviation index, and the sharpness index, namely the PI normalized root-mean-square width (PINRW), are also important indexes for evaluating probability prediction results.
Common probability prediction evaluation index $X_{QS}$:
where $\rho_{\tau_i}(\cdot)$ is the pinball loss at quantile $\tau_i$, $y_t$ is the actual value of the power load at time $t$, $\hat{y}_t^{\tau_i}$ is the predicted value at time $t$ under quantile $\tau_i$, and $N$ is the total number of test moments.
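A minimal sketch of a Quantile Score, taken here as the pinball loss averaged over all test moments and all quantile levels. The averaging over quantiles is an assumption, since only the per-term definitions are given in the text; the names and numbers are illustrative.

```python
def check(a, tau):
    """Pinball loss of a residual a = y - y_hat at quantile level tau."""
    return tau * a if a >= 0 else (tau - 1.0) * a


def quantile_score(y_true, preds, taus):
    """Mean pinball loss; preds[t][i] is the prediction for taus[i] at time t."""
    total = 0.0
    for y, row in zip(y_true, preds):
        for tau, p in zip(taus, row):
            total += check(y - p, tau)
    return total / (len(y_true) * len(taus))


y_true = [10.0, 12.0]
taus = [0.25, 0.75]
preds = [[9.0, 11.0], [13.0, 12.0]]
x_qs = quantile_score(y_true, preds, taus)
print(x_qs)
```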
Reliability index $X_{PICP}$:
where $\varepsilon_\alpha$ is the number of test samples whose actual value falls within the prediction interval at confidence level $1-\alpha$.
Coverage probability deviation index $X_{Dev}$, the deviation of the actual PI coverage (PICP) from its nominal value (PI nominal confidence, PINC):
Sharpness index $X_{PINRW}$:
where $X_{PINRW}^{\alpha}$ is the normalized root-mean-square width of the prediction interval at confidence level $1-\alpha$, $U_t^{\alpha}$ and $L_t^{\alpha}$ are the upper and lower bounds of the prediction interval of the $t$-th test sample at confidence level $1-\alpha$, and $R$ is the difference between the maximum and minimum load in the test set.
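The three interval indexes can be sketched as follows, using their commonly used definitions: PICP as the fraction of actual values inside the interval, the deviation as PICP minus the nominal confidence, and PINRW as the root-mean-square interval width divided by the load range $R$. The signed (rather than absolute) deviation, the function names and the toy data are assumptions.

```python
import math


def picp(y_true, lower, upper):
    """Fraction of actual values that fall inside [lower_t, upper_t]."""
    hits = sum(1 for y, lo, up in zip(y_true, lower, upper) if lo <= y <= up)
    return hits / len(y_true)


def coverage_deviation(y_true, lower, upper, pinc):
    """Signed deviation of the actual coverage from the nominal confidence."""
    return picp(y_true, lower, upper) - pinc


def pinrw(y_true, lower, upper):
    """Root-mean-square interval width, normalized by the load range R."""
    n = len(y_true)
    r = max(y_true) - min(y_true)
    mean_sq = sum((up - lo) ** 2 for lo, up in zip(lower, upper)) / n
    return math.sqrt(mean_sq) / r


y_true = [1.0, 2.0, 3.0, 4.0]
lower = [0.5, 2.5, 2.0, 3.0]
upper = [1.5, 3.5, 4.0, 5.0]
cov = picp(y_true, lower, upper)
dev = coverage_deviation(y_true, lower, upper, pinc=0.5)
width = pinrw(y_true, lower, upper)
```

Reliability and sharpness pull in opposite directions: widening every interval raises PICP but also raises PINRW, which is why the two are reported together.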
Step 6: according to the multiple quantiles of the predicted load obtained in the step 5, a probability density curve of the predicted point is calculated by adopting a Gaussian kernel density estimation method, and the Gaussian kernel density estimation method is disclosed by a paper "Short-term power load probability density forecasting based on Yeo-Johnson transformation quantile regression and Gaussian kernel function" published in journal Energy 2018.
In the embodiment, a 15-minute load data set from January 1, 2016 to June 30, 2017 in an actual area is selected, and the day-ahead load probability is predicted with the method of the invention. To verify the prediction performance of the nested LSTM model, it is compared with the linear quantile regression model L-QR, the quantile neural network bQRNN, the QRNN with the parametric rectified linear activation function RCLU, and the Q-LSTM without the combination layer. The evaluation index statistics of the probability prediction results of each model are shown in Tables 1 and 2. Table 1 lists the training time $T_{train}$, the common probability prediction evaluation index $X_{QS}$, the index $X_{CS}$ that accounts for the quantile constraint relationship, the comprehensive evaluation index $X_{QCS}$, the sharpness index $X_{PINRW}$ at 50% confidence, and the proportion $f$ of samples that violate the adjacent-quantile constraint. Table 2 compares the reliability index $X_{PICP}$ and the deviation index $X_{Dev}$ at different confidence levels, where $X_{AD}$ and $X_{MD}$ are the mean and the maximum of $X_{Dev}$ over the confidence levels, respectively.
As can be seen from FIG. 6 and Table 1, the $X_{CS}$ index of the nested LSTM, i.e. the CP-LSTM, is significantly lower than that of the other methods on most sample days. Over the whole test set, the overall $X_{CS}$ of the CP-LSTM is only 27.28% of that of the Q-LSTM, and the proportion $f$ of test samples that violate the constraint is 16.3% lower than for the Q-LSTM, while the $X_{QS}$ index, which reflects prediction accuracy, does not change significantly. The CP-LSTM can therefore effectively avoid quantile crossing and improve the rationality of the predicted quantiles without reducing prediction accuracy.
Table 1 Comparison of the evaluation indexes of the models
Table 2 Comparison of $X_{PICP}$ and $X_{Dev}$ of the models
The scope of the present invention is not limited thereto, and although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that: any person skilled in the art may modify or easily conceive of the technical solution described in the foregoing embodiments, or perform equivalent substitution of some of the technical features, while remaining within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims (9)

1. The power load prediction method based on nested LSTM and quantile calculation is characterized by comprising the following steps of:
Step 1: collecting load power and influencing-factor data for a plurality of sample days, forming a data set, and dividing it into a training set, a validation set and a test set;
Step 2: establishing a nested LSTM model and setting the model hyperparameters; pre-training each quantile LSTM in the nested LSTM model with a parallel training method to obtain a weight and bias parameter set;
the quantile long short-term memory network model Q-LSTM comprises an input gate $i_t$, a forget gate $f_t$, an output gate $o_t$ and a candidate memory cell $\tilde{C}_t$;
Step 3: taking the obtained weight and bias parameter set as the initial parameters of the nested LSTM model, training the nested LSTM model as a whole, and fine-tuning the weights and biases during training to determine the optimal weights and biases of the nested LSTM model;
Step 4: inputting the validation set into the trained nested LSTM model, and selecting the optimal hyperparameters of the model according to the validation error;
Step 5: inputting the test samples into the nested LSTM model with the optimal hyperparameters, and inverse-normalizing the predictions output by the model to obtain a plurality of quantile predictions of the load at each moment of the forecast day.
2. The method for power load prediction based on nested LSTM and quantile calculation of claim 1, further comprising step 6: calculating the probability density curve of the predicted point from the plurality of predicted-load quantiles obtained in step 5.
3. The method for predicting the electric load based on the nested LSTM and quantile calculation according to claim 2, wherein step 1 collects 96-point load power data of 15 minutes from adjacent time points in 0 to 24 times for a sample day, selects 96-point load power of the day before prediction, 24-time air temperature and regional rainfall on the day before prediction to form a multidimensional feature input variable vector, takes 96-point load quantile on the day after prediction as an output variable vector, inputs variable X d=[Td, Rd and air temperature T d=[T1, T2,…, T24]d, wherein T i, i e {1,2, …,24} represents weather temperature measured when i, rainfall R d=[R1, R2,…, RM]d, wherein R j, j e {1,2, …, M } represents rainfall of a j-th sub-area of a prediction area, D e {1,2, …, D } is total number of days of historical samples, and M is the number of sub-areas included in the prediction area.
4. The power load prediction method based on nested LSTM and quantile calculation according to claim 3, wherein in step 2 the model hyperparameters include the number of neurons $m$, the sample time-window length $l$, the number of nodes $n$, and the penalty term parameters $\lambda_1$ and $\lambda_2$.
5. The power load prediction method based on nested LSTM and quantile calculation according to claim 4, wherein the parallel training is implemented by GPU distributed computing: the training set is divided equally into a plurality of subsets that are distributed to the nodes of the computing system, and each computing node processes a different subset of the data set, thereby reducing the total training time of the neural network; the parameter set obtained by training on each node is used to calculate a new global weight set with a gradient descent formula, which is then distributed to every node of the computing system, and the formula is as follows:
wherein $Z_\phi = \{W, b\}^{(\phi)}$ is the global parameter set obtained in the $\phi$-th training iteration, $\Delta Z_{\phi,j}$ is the parameter gradient of the $j$-th computing node obtained in the $\phi$-th training iteration, $n$ is the total number of computing nodes, and the coefficient applied to the gradient term is a scaling factor.
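The update formula of claim 5 did not survive in this text. Reading the symbol definitions, one plausible form is $Z_{\phi+1} = Z_\phi - \frac{\varepsilon}{n}\sum_{j=1}^{n} \Delta Z_{\phi,j}$, where the symbol $\varepsilon$ for the scaling factor and the averaging over nodes are assumptions. A minimal sketch of that assumed synchronous update, with parameters flattened to a list and a hypothetical function name:

```python
def aggregate_update(global_params, node_gradients, scale):
    """One global step: Z_next = Z - scale * mean over nodes of the node gradients."""
    n = len(node_gradients)
    new_params = []
    for k, z in enumerate(global_params):
        mean_grad = sum(g[k] for g in node_gradients) / n
        new_params.append(z - scale * mean_grad)
    return new_params


# Two computing nodes, two parameters (illustrative numbers only).
z = [1.0, -2.0]
grads = [[0.5, 1.0], [1.5, -1.0]]
z_next = aggregate_update(z, grads, scale=0.5)
```

After the update, `z_next` would be sent back to every node before the next iteration, as the claim describes.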
6. The power load prediction method based on nested LSTM and quantile calculation according to claim 5, wherein each node individually trains a quantile long short-term memory network model (Q-LSTM) as follows:
(1) inputting an initial weight $W_0(\tau_i)$ and an initial bias $b_0(\tau_i)$;
(2) calculating the current iteration values of the LSTM input gate, forget gate, output gate, candidate memory cell, new memory state, and hidden layer state, as follows:
given the current input $x_t$ and the hidden layer state $h_{t-1}$ and memory state $C_{t-1}$ at the previous time step, the detailed calculation process is as follows:
wherein $W_i$, $W_f$, $W_o$, $W_c$ are the corresponding weight matrices and $b_i$, $b_f$, $b_o$, $b_c$ are the corresponding bias vectors; $\sigma(\cdot)$ and $\tanh(\cdot)$ are the sigmoid and hyperbolic tangent activation functions, respectively; the final output of the output layer is calculated from the hidden layer state $h_t$:
wherein $W_S$ is the connection weight matrix between the hidden layer and the output layer, and $b_S$ is the corresponding bias vector;
(3) calculating, by the gradient descent method and based on the loss function, the intermediate gradients of the loss, and from these calculating the gradient of each weight and bias, the loss function being as follows:
wherein $W(\tau_i) = \{W_f(\tau_i), W_i(\tau_i), W_c(\tau_i), W_o(\tau_i), W_S(\tau_i)\}$ and $b(\tau_i) = \{b_f(\tau_i), b_i(\tau_i), b_c(\tau_i), b_o(\tau_i), b_S(\tau_i)\}$ are, respectively, the sets of all weight parameter matrices and bias vectors of the LSTM neural network at quantile $\tau_i$; $\lambda_1$ is a regularization penalty parameter that prevents overfitting during model training; and the function applied to the prediction residual is the check function;
the gradients of the hidden-layer-to-output-layer parameters are:
wherein the two derivative terms are the derivatives of the hidden layer state with respect to the connection weight matrix $W_S$ between the hidden layer and the output layer and with respect to the bias vector $b_S$, respectively;
according to these gradients, the gradients of the parameters of the forget gate, input gate, candidate memory cell, and output gate are calculated respectively;
(4) updating the weights and biases according to the following formula:
wherein $\eta$ is the learning rate, and $W_*$ and $b_*$ denote the corresponding weight matrix and bias vector, respectively;
steps (2) to (4) are repeated until the convergence condition is reached, yielding the optimal model parameters $\{W(\tau_i), b(\tau_i)\}_{opt}$.
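The formulas of claim 6 are not reproduced in this text, so the following is only a sketch built on the standard LSTM equations and the standard check (pinball) function, which match the symbols named in the claim. Several simplifications are assumptions: the gates act on the concatenated vector $[h_{t-1}, x_t]$, the output is a scalar, only $W_S$ and $b_S$ are updated (full backpropagation through the gates is omitted), and the $\lambda_1$ regularization term is left out. All names and numbers are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, b, v):
    """Return W v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, p):
    """Step (2): one LSTM time step on the concatenated vector [h_{t-1}, x_t]."""
    z = h_prev + x_t
    i_t = [sigmoid(a) for a in affine(p["W_i"], p["b_i"], z)]      # input gate
    f_t = [sigmoid(a) for a in affine(p["W_f"], p["b_f"], z)]      # forget gate
    o_t = [sigmoid(a) for a in affine(p["W_o"], p["b_o"], z)]      # output gate
    c_hat = [math.tanh(a) for a in affine(p["W_c"], p["b_c"], z)]  # candidate cell
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


def check_loss(u, tau):
    """Check (pinball) function rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def train_output_layer(xs, y, tau, p, eta=0.1, steps=50):
    """Steps (3)-(4) for W_S and b_S only; gate parameters stay fixed here."""
    hidden = len(p["b_i"])
    h, c = [0.0] * hidden, [0.0] * hidden
    for x_t in xs:  # forward pass over the time window
        h, c = lstm_step(x_t, h, c, p)
    W_S, b_S = list(p["W_S"]), p["b_S"]
    for _ in range(steps):
        y_hat = sum(w * hk for w, hk in zip(W_S, h)) + b_S
        u = y - y_hat
        # Subgradient of rho_tau(y - y_hat) with respect to y_hat.
        g = -tau if u > 0 else (1.0 - tau if u < 0 else 0.0)
        W_S = [w - eta * g * hk for w, hk in zip(W_S, h)]  # dL/dW_S = g * h_t
        b_S -= eta * g                                      # dL/db_S = g
    y_hat = sum(w * hk for w, hk in zip(W_S, h)) + b_S
    return W_S, b_S, check_loss(y - y_hat, tau)


# Hypothetical toy setup: hidden size 2, input size 1, mostly zero weights.
zeros = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
params = {
    "W_i": zeros, "b_i": [0.0, 0.0], "W_f": zeros, "b_f": [0.0, 0.0],
    "W_o": zeros, "b_o": [0.0, 0.0],
    "W_c": [[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]], "b_c": [0.0, 0.0],
    "W_S": [0.0, 0.0], "b_S": 0.0,
}
initial_loss = check_loss(1.0 - 0.0, 0.5)  # the prediction is 0 before training
W_S, b_S, final_loss = train_output_layer([[0.5], [1.0]], 1.0, 0.5, params)
```

Because the check function is piecewise linear, a fixed learning rate leaves the prediction oscillating within one step of the target; a decaying rate or a convergence threshold would be the usual remedy.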
7. The power load prediction method based on nested LSTM and quantile calculation according to claim 6, wherein in step 3, a gradient descent method is used to search for the model parameters $\{W(\tau_i), b(\tau_i)\}_{opt}$ in order to obtain the optimal parameters of the constrained parallel LSTM model; the training method of the constrained parallel LSTM model is the same as that of the quantile long short-term memory network model Q-LSTM except for the loss function and its gradient; the loss function of the constrained parallel LSTM model, evaluated on the training sample set, is as follows:
wherein the coefficient of the constraint term is the penalty parameter for violating the constraint conditions;
the gradients of the forget gate, input gate, memory cell, candidate memory cell, and output gate parameters are calculated in the same way as in step 2.
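The constrained loss of claim 7 is not reproduced in this text. A common way to discourage quantile crossing, shown here purely as an assumed form, is to add a squared hinge penalty on adjacent quantiles to the sum of check losses. The name `lam2` for the penalty parameter, the squared form of the penalty, and the omission of the regularization term are all assumptions.

```python
def pinball(y, q, tau):
    """Check function applied to the residual y - q."""
    u = y - q
    return u * (tau - (1.0 if u < 0 else 0.0))


def constrained_loss(y, quantile_preds, taus, lam2):
    """Sum of check losses plus a penalty on crossings between adjacent quantiles.

    quantile_preds must be ordered by ascending tau.
    """
    fit = sum(pinball(y, q, t) for q, t in zip(quantile_preds, taus))
    crossing = sum(
        max(0.0, lower - upper) ** 2
        for lower, upper in zip(quantile_preds, quantile_preds[1:])
    )
    return fit + lam2 * crossing


taus = [0.25, 0.5, 0.75]
ordered = constrained_loss(10.0, [8.0, 10.0, 12.0], taus, lam2=1.0)
crossed = constrained_loss(10.0, [12.0, 10.0, 8.0], taus, lam2=1.0)
```

Predictions that respect the ordering pay no penalty, so the extra term only changes the gradient where a lower quantile exceeds a higher one.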
8. The power load prediction method based on nested LSTM and quantile calculation according to claim 7, wherein the prediction results of step 5 are assessed for quantile crossing with an evaluation index that takes the quantile constraint relation into account, the index being as follows:
wherein the quantity on the left-hand side is the evaluation index value for the quantile constraint relation; the prediction term is the predicted value at time $t$ under a given quantile; $N$ is the total number of test time points; $v_{t,i}$ is the constraint violation degree function; and $\theta$ is the step size between adjacent quantiles; $v_{t,i}$ is 0 when adjacent quantiles satisfy the constraint relation and equals the positive difference between the adjacent quantiles when the constraint relation is violated, reflecting the degree of constraint violation; the coefficient $2\theta/N$ normalizes the squared quantile constraint error.
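The index formula of claim 8 is missing from this text. Based only on the description (a violation term $v_{t,i}$ that is zero for ordered quantiles, a squared error, and the coefficient $2\theta/N$), one assumed form is $\frac{2\theta}{N}\sum_t \sum_i v_{t,i}^2$. The sketch below computes that assumed form; whether the source also takes a square root or sums differently is not known.

```python
def crossing_index(preds, theta):
    """Assumed crossing index: (2 * theta / N) * sum of squared violations.

    preds[t][i] is the predicted value at time t for the i-th quantile,
    with quantile levels in ascending order and spacing theta.
    """
    n = len(preds)
    total = 0.0
    for row in preds:
        for lower, upper in zip(row, row[1:]):
            v = max(0.0, lower - upper)  # 0 when ordered, positive when crossed
            total += v * v
    return 2.0 * theta / n * total


# Two test time points, three quantiles; the second row crosses once.
index_value = crossing_index([[1.0, 2.0, 3.0], [2.0, 1.5, 3.0]], theta=0.25)
```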
9. The power load prediction method based on nested LSTM and quantile calculation according to claim 8, wherein in step 6 the probability density curve of each prediction point is calculated using Gaussian kernel density estimation.
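Gaussian kernel density estimation over the predicted quantiles of one time point can be sketched as follows. The fixed bandwidth and the sample quantile values are assumptions; the claim does not state how the bandwidth is chosen.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return f(x) = 1 / (n * h) * sum_k K((x - s_k) / h) with a Gaussian kernel K."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical predicted load quantiles (MW) for one time point.
quantiles = [400.0, 450.0, 500.0, 550.0, 600.0]
pdf = gaussian_kde(quantiles, bandwidth=40.0)

# Evaluate the density on a grid to obtain the probability density curve.
curve = [(x, pdf(float(x))) for x in range(300, 701, 50)]
```

A smaller bandwidth follows the individual quantiles more closely, while a larger one gives a smoother curve.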
CN202410049336.0A 2020-10-13 2020-10-13 Power load prediction method based on nested LSTM and quantile calculation Pending CN117977568A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410049336.0A CN117977568A (en) 2020-10-13 2020-10-13 Power load prediction method based on nested LSTM and quantile calculation

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011092704.8A CN112232561B (en) 2020-10-13 2020-10-13 Power load probability prediction method based on constrained parallel LSTM fractional regression
CN202410049336.0A CN117977568A (en) 2020-10-13 2020-10-13 Power load prediction method based on nested LSTM and quantile calculation

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202011092704.8A Division CN112232561B (en) 2020-10-13 2020-10-13 Power load probability prediction method based on constrained parallel LSTM fractional regression

Publications (1)

Publication Number Publication Date
CN117977568A true CN117977568A (en) 2024-05-03

Family

ID=74113480

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202011092704.8A Active CN112232561B (en) 2020-10-13 2020-10-13 Power load probability prediction method based on constrained parallel LSTM fractional regression
CN202410049336.0A Pending CN117977568A (en) 2020-10-13 2020-10-13 Power load prediction method based on nested LSTM and quantile calculation

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202011092704.8A Active CN112232561B (en) 2020-10-13 2020-10-13 Power load probability prediction method based on constrained parallel LSTM fractional regression

Country Status (1)

Country Link
CN (2) CN112232561B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112784435B (en) * 2021-02-03 2023-05-23 浙江工业大学 GPU real-time power modeling method based on performance event counting and temperature
CN113112092A (en) * 2021-05-07 2021-07-13 国网四川省电力公司经济技术研究院 Short-term probability density load prediction method, device, equipment and storage medium
CN113239029A (en) * 2021-05-18 2021-08-10 国网江苏省电力有限公司镇江供电分公司 Completion method for missing daily freezing data of electric energy meter
CN113449934B (en) * 2021-08-31 2021-11-30 国能日新科技股份有限公司 Wind power generation power prediction method and device based on data migration
CN113807432B (en) * 2021-09-16 2024-04-30 成都卡普数据服务有限责任公司 Air temperature forecast data correction method based on deep learning

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109978201A (en) * 2017-12-27 2019-07-05 深圳市景程信息科技有限公司 Probability load prediction system and method based on Gaussian process quantile estimate model
CN108846517B (en) * 2018-06-12 2021-03-16 清华大学 Integration method for predicating quantile probabilistic short-term power load
CN109214605A (en) * 2018-11-12 2019-01-15 国网山东省电力公司电力科学研究院 Power-system short-term Load Probability prediction technique, apparatus and system
CN109558975B (en) * 2018-11-21 2021-04-13 清华大学 Integration method for multiple prediction results of power load probability density
CN111612244B (en) * 2020-05-18 2022-08-05 南瑞集团有限公司 QRA-LSTM-based method for predicting nonparametric probability of photovoltaic power before day

Also Published As

Publication number Publication date
CN112232561A (en) 2021-01-15
CN112232561B (en) 2024-03-15

Similar Documents

Publication Publication Date Title
CN111738512B (en) Short-term power load prediction method based on CNN-IPSO-GRU hybrid model
Wang et al. Deep belief network based k-means cluster approach for short-term wind power forecasting
Cao et al. Hybrid ensemble deep learning for deterministic and probabilistic low-voltage load forecasting
CN112232561B (en) Power load probability prediction method based on constrained parallel LSTM fractional regression
CN113962364B (en) Multi-factor power load prediction method based on deep learning
Shang et al. Short-term load forecasting based on PSO-KFCM daily load curve clustering and CNN-LSTM model
CN109359786A (en) A kind of power station area short-term load forecasting method
CN111260136A (en) Building short-term load prediction method based on ARIMA-LSTM combined model
CN109063911A (en) A kind of Load aggregation body regrouping prediction method based on gating cycle unit networks
CN106951611A (en) A kind of severe cold area energy-saving design in construction optimization method based on user's behavior
CN105069525A (en) All-weather 96-point daily load curve prediction and optimization correction system
CN105701572B (en) Photovoltaic short-term output prediction method based on improved Gaussian process regression
CN109492748B (en) Method for establishing medium-and-long-term load prediction model of power system based on convolutional neural network
CN111160659B (en) Power load prediction method considering temperature fuzzification
CN113554466B (en) Short-term electricity consumption prediction model construction method, prediction method and device
CN113537582B (en) Photovoltaic power ultra-short-term prediction method based on short-wave radiation correction
CN115130741A (en) Multi-model fusion based multi-factor power demand medium and short term prediction method
CN115860177A (en) Photovoltaic power generation power prediction method based on combined machine learning model and application thereof
CN105005708A (en) Generalized load characteristic clustering method based on AP clustering algorithm
CN112418476A (en) Ultra-short-term power load prediction method
CN116187835A (en) Data-driven-based method and system for estimating theoretical line loss interval of transformer area
CN115115125A (en) Photovoltaic power interval probability prediction method based on deep learning fusion model
CN116526473A (en) Particle swarm optimization LSTM-based electrothermal load prediction method
CN114117852B (en) Regional heat load rolling prediction method based on finite difference working domain division
CN116169670A (en) Short-term non-resident load prediction method and system based on improved neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination