CN114529051A - Long-term power load prediction method based on hierarchical residual self-attention neural network - Google Patents

Long-term power load prediction method based on hierarchical residual self-attention neural network

Info

Publication number
CN114529051A
Authority
CN
China
Prior art keywords
sequence
data
neural network
load
term
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210048738.XA
Other languages
Chinese (zh)
Inventor
占翔昊
寇亮
张纪林
周丽
袁俊峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN202210048738.XA priority Critical patent/CN114529051A/en
Publication of CN114529051A publication Critical patent/CN114529051A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 10/00 Administration; Management
    • G06Q 10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/24 Querying
    • G06F 16/245 Query processing
    • G06F 16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F 16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 Complex mathematical operations
    • G06F 17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q 50/06 Energy or water supply
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J 3/00 Circuit arrangements for ac mains or ac distribution networks
    • H02J 3/003 Load forecast, e.g. methods or systems for forecasting future load demand
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S 10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S 10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications


Abstract

The invention discloses a long-term power load prediction method based on a hierarchical residual self-attention neural network. The invention comprises three parts. First, mixed feature data consisting of the trend term, period term, holiday term and weather term in the historical load data are adaptively extracted and fused with the historical load sequence data. Second, the fused sequence data are recursively decomposed into time components, and the time components are encoded with hierarchical residual self-attention network blocks. Third, the time components are reconstructed and generatively decoded to predict the power load fluctuation over a period of time in the future. By hierarchically decomposing, reconstructing and predicting the load sequence, the invention effectively captures both the long-term and short-term characteristics of the sequence and improves the prediction accuracy of the model in long-sequence load prediction scenarios.

Description

Long-term power load prediction method based on hierarchical residual self-attention neural network
Technical Field
The invention relates to the technical field of load prediction for electric power energy systems, and in particular to a long-term power load prediction method based on a hierarchical residual self-attention neural network.
Background
Power load prediction technology is an indispensable service in the composition of a smart grid system and is actively applied in many scenarios; how to effectively control the power load to balance supply and demand has become an important research direction in the operation and management of modern power systems. The core problem of load prediction is how to capture the historical variation pattern of the prediction target and its relation to various influencing factors; the prediction model is in essence a mathematical function expressing this variation pattern. The challenge of load prediction is that the load is affected by many external factors, including electricity trading market factors, national policy factors, weather factors and residential electricity consumption habits, all of which remain problems to be solved.
Models for load prediction are essentially mathematical models for time series prediction, and common methods can be divided into traditional statistical methods, machine learning based methods, deep learning based methods, and third-party tool prediction methods.
(1) Traditional statistical methods. Common time series models, including the Auto-Regression model (AR) and the Auto-Regression Moving Average model (ARMA), have simple principles and are suitable for analyzing stationary sequences and simple non-stationary sequences at low orders, but are not suitable for nonlinear prediction scenarios.
(2) Machine learning based methods. Machine learning is a very broad category containing many models suitable for nonlinear prediction; common models include Support Vector Machines (SVM), decision tree models and K-nearest-neighbor models, as well as ensemble learning models with stronger predictive capability (XGBoost, LightGBM). Machine learning models solve the nonlinearity problem well, but their feature mining capability is limited in large-scale, high-dimensional data prediction scenarios, so the data features often have to be engineered manually before building a machine learning prediction model.
(3) Deep learning based methods. Thanks to their strong fitting ability, deep learning models can adaptively mine and learn data features and are well suited to nonlinear prediction. Common methods include Convolutional Neural Networks (CNN), Long Short-Term Memory networks (LSTM) and Gated Recurrent Units (GRU). The recurrent neural networks represented by LSTM and GRU are widely used in sequence modeling and have good sequence modeling ability, but because of their serial learning they gradually lose the ability to learn long-distance historical features during training and suffer from error accumulation, so they are often combined with other deep learning models.
(4) Third-party tool prediction methods. In recent years some large domestic and foreign companies have also open-sourced their self-developed time series prediction methods. For example, Facebook released the Prophet model in 2017, which comprehensively considers the trend term, period term and holiday term of a time series, is simple to use and predicts stably; Amazon then released the DeepAR model in 2018, which uses a probability-based autoregressive inference mode to reduce uncertainty in the prediction process. The prediction accuracy of these tools is remarkable, but they only achieve short-term prediction and are not suitable for energy load scenarios demanding high real-time performance and strong stability.
Disclosure of Invention
The invention aims to combine and improve the prior art so as to optimize the modeling effect of a load prediction model in power load prediction scenarios. Specifically, the invention performs modeling with a neural network and provides a network structure based on a hierarchical residual self-attention mechanism for long-sequence prediction of stable, highly periodic power load data.
In order to achieve the purpose of the invention, the invention adopts the technical scheme that:
a long-term power load prediction method based on a hierarchical residual self-attention neural network comprises the following steps:
step 1, acquiring source data of unit load sequence and weather data monitored by a sensor from a time sequence database
Step 2, cleaning the source data, extracting features from the cleaned historical load data and weather data (the four features being the trend term, period term, holiday term and weather term of load fluctuation), fusing the historical load sequence data with the feature data to obtain a fusion vector, and using the fusion vector as input for the subsequent neural network modeling;
Step 3, encoding the input sequence with the hierarchical residual self-attention neural network provided by the invention, extracting and mining the important features in the input sequence, and performing model training;
Step 4, generatively decoding the features extracted from the source historical load data to be predicted, and predicting the load sequence in the next time step range.
The invention has the following beneficial effects: the proposed model is based on the Transformer neural network and uses a self-attention mechanism in its network structure, which, compared with traditional recurrent neural networks, gives it the ability to capture global features. Compared with traditional methods, it is more capable and flexible in feature mining and model generalization; realizing load prediction through this method can effectively reduce the error of medium- and long-term load prediction, provide feedback guidance for the operation and dispatch of power units, and ensure the stable operation of the power system.
Drawings
FIG. 1 is a schematic flowchart of the long-term power load prediction method based on a hierarchical residual self-attention neural network according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the overall framework of the hierarchical residual self-attention neural network prediction model according to an embodiment of the present invention;
FIG. 3 is a schematic framework diagram of the Transformer neural network model;
FIG. 4 is a schematic framework diagram of the residual neural network model;
FIG. 5 is a schematic framework diagram of each layer of the modified residual self-attention block according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of prediction using generative decoding according to an embodiment of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings; the flow chart of the implementation of the invention is shown in FIG. 1.
step 1. determining a start time TstartAnd an end time TendReading the load data X in the appointed time range from the database storing the unit load sequence by using middleware service or data analysis softwarerawSimilarly, the weather data X collected by the sensor is readweatherAnd skipping to the step 2.
Step 2, extracting sequence characteristic data from the historical load data and the weather data and performing data fusion with the source historical load data, wherein the method comprises the following steps:
and 2-1, extracting weather data characteristics.
The weather data collected by the sensors are encoded. The collected data include at least temperature data, weather state data and timestamp data; the data are analyzed and abnormal values with excessive deviation are eliminated. Max-min normalization is applied to the temperature data X_weather(T), where the normalization function is expressed as:

X_norm = (X_weather(T) - min(X_weather(T))) / (max(X_weather(T)) - min(X_weather(T)))

Similarly, feature normalization can be applied to the other numerical weather-related data, which effectively helps the subsequent feature fusion. The weather state data X_weather(S) is typically a categorical label of the form [sunny, cloudy, light rain, heavy rain, light snow, ...]. For such data a one-hot encoding method is adopted to convert it into numerical data; specifically, each label is encoded as a unique numerical value, expressed as follows:

State:          sunny   cloudy   light rain   heavy rain   light snow   ...
Encoded value:  0       1        2            3            4            ...

The feature processing of the weather data is realized in this way.
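A minimal Python sketch of this step, assuming hypothetical column names; it mirrors the max-min normalization formula and the label-to-integer table above:

```python
import pandas as pd

def min_max_normalize(s: pd.Series) -> pd.Series:
    # X_norm = (X - min(X)) / (max(X) - min(X))
    return (s - s.min()) / (s.max() - s.min())

# Label map taken from the table above; further states would extend it.
state_codes = {"sunny": 0, "cloudy": 1, "light rain": 2,
               "heavy rain": 3, "light snow": 4}

X_weather = pd.read_csv("weather.csv")          # as loaded in the step-1 sketch
X_weather["temperature_norm"] = min_max_normalize(X_weather["temperature"])
X_weather["state_code"] = X_weather["state"].map(state_codes)
```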
Step 2-2, extracting the trend term and period term features of the historical load sequence.
The historical load sequence features are the main factors influencing the trend of the future sequence. To address the nonlinear and time-varying characteristics of the time series, the sequence is feature-decomposed by a shallow neural network. Data cleaning is required before decomposition: the data are analyzed and values with excessive offset are eliminated. Specifically, a mixed sequence decomposition layer network is defined; with original input X_input, the trend term feature and the period term feature are generated as follows:

X_trend = MovingAvg(X_input)
X_period = X_input - X_trend

where MovingAvg is a moving average function, realized with the average pooling operation of a one-dimensional convolution; it yields the trend term of the overall sequence fluctuation, and the period term is then obtained by subtracting the trend term from the original sequence.
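A PyTorch sketch of this decomposition; the patent specifies only a moving average realized by one-dimensional average pooling, so the kernel size and the end-padding scheme here are assumptions:

```python
import torch
import torch.nn as nn

class SeriesDecomp(nn.Module):
    """Split a series into a trend term (moving average) and a period term."""
    def __init__(self, kernel_size: int = 25):   # window size is an assumption
        super().__init__()
        self.kernel_size = kernel_size
        self.avg = nn.AvgPool1d(kernel_size, stride=1, padding=0)

    def forward(self, x: torch.Tensor):
        # x: (batch, seq_len, features); replicate the ends so seq_len is kept
        front = x[:, :1, :].repeat(1, (self.kernel_size - 1) // 2, 1)
        back = x[:, -1:, :].repeat(1, self.kernel_size // 2, 1)
        padded = torch.cat([front, x, back], dim=1)
        trend = self.avg(padded.permute(0, 2, 1)).permute(0, 2, 1)   # X_trend
        period = x - trend                                           # X_period
        return trend, period
```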
Step 2-3, extracting the holiday term features of the historical load sequence.
In load prediction, important holidays also affect the trend of the load to some extent. Specifically, the timestamps X_timestamp of the original load data extracted in step 1 are analyzed with the pandas and numpy libraries of the Python language, and extended features are computed for the date of each timestamp, including the month X_month, day X_day, hour X_hour, minute X_minute, day of the week X_weekday, whether it is a workday X_iswork, whether it is a holiday X_isholiday, and whether it is a weekend X_isweekend. The time is parsed with the pandas DataFrame, and these finer-grained features are expressed as follows:

X_month, X_day, X_hour, X_minute, ... = Extend(X_timestamp)
X_timestamp = Linear(Extend(X_timestamp))

where Extend is the feature extension function; the extended multidimensional features are converted by a nonlinear conversion layer into a data form with the same dimension as the source sequence for the subsequent feature fusion.
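A possible pandas implementation of the Extend function; the holiday calendar is a hypothetical input and the workday rule (neither weekend nor holiday) is an assumption:

```python
import numpy as np
import pandas as pd

def extend_timestamp(ts: pd.DatetimeIndex, holidays: set) -> pd.DataFrame:
    """Expand each timestamp into the calendar features of step 2-3.
    `holidays` is a hypothetical set of datetime.date statutory holidays."""
    is_weekend = (ts.weekday >= 5).astype(int)
    is_holiday = np.array([int(t.date() in holidays) for t in ts])
    return pd.DataFrame({
        "month": ts.month, "day": ts.day, "hour": ts.hour, "minute": ts.minute,
        "weekday": ts.weekday,
        "is_weekend": is_weekend,
        "is_holiday": is_holiday,
        "is_workday": ((is_weekend == 0) & (is_holiday == 0)).astype(int),
    }, index=ts)
```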
Step 2-4, feature embedding and fusion.
Through steps 2-1, 2-2 and 2-3, the feature data set X_weather, X_trend, X_period, X_timestamp is obtained. These features are then fused, here using an additive model, as input for the subsequent hierarchical residual neural network; the fusion is expressed as:

X_fused = Dropout(ReLU(X_weather + X_trend + X_period + X_timestamp))

where Dropout is a neuron deactivation function commonly used in neural network modeling to prevent overfitting and ReLU is a common activation function; the fused features are finally obtained through the additive model.
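A minimal sketch of the additive fusion as reconstructed above, assuming all four feature streams have already been projected to a common model dimension:

```python
import torch
import torch.nn as nn

class FeatureFusion(nn.Module):
    """Additive fusion with ReLU and Dropout, following the step 2-4 description."""
    def __init__(self, dropout: float = 0.1):
        super().__init__()
        self.drop = nn.Dropout(dropout)

    def forward(self, x_weather, x_trend, x_period, x_timestamp):
        # X_fused = Dropout(ReLU(X_weather + X_trend + X_period + X_timestamp))
        return self.drop(torch.relu(x_weather + x_trend + x_period + x_timestamp))
```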
Step 3, the hierarchical residual self-attention neural network provided by the invention encodes the input sequence and extracts and mines its important features; the overall schematic diagram of the model is shown in FIG. 2. In this embodiment, step 3 specifically comprises the following sub-steps:
Step 3-1, decomposing the sequence features.
A major innovation of the invention is to replace the traditional linear modeling process with a hierarchical decomposition modeling process: the feature sequence is recursively decomposed layer by layer according to the preset number of layers, a residual self-attention network then models the decomposed features of each layer, and a better feature expression is finally trained at deeper layers. Specifically, the decomposition algorithms provided by the invention include odd-even (parity) decomposition and binary (dichotomy) decomposition, and the pseudocode of the algorithm is as follows:
[Algorithm 1: recursive sequence decomposition; shown as an image in the original publication]

where X_in is the mixed feature sequence of the source input, Level is the preset number of layers, and SplitSeries is the sequence decomposition function; the default algorithm provided by the invention adopts binary decomposition. The two decomposed feature components X_left and X_right are each input into the residual block to be updated; Algorithm 1 is then applied recursively until the layer limit is reached, and the Merge function finally returns the combined sequence.
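Since Algorithm 1 itself appears only as a figure, the control flow below is a reconstruction of the binary (dichotomy) strategy as described, with residual_block standing in for the ResidualAttentionBlock of step 3-2:

```python
import torch

def hierarchical_decompose(x: torch.Tensor, level: int, residual_block):
    """Recursively halve the sequence, update each half with the residual
    self-attention block, then merge by relative position (binary strategy)."""
    if level == 0 or x.size(1) < 2:
        return x
    mid = x.size(1) // 2
    x_left, x_right = x[:, :mid], x[:, mid:]        # SplitSeries (dichotomy)
    x_left = hierarchical_decompose(residual_block(x_left), level - 1, residual_block)
    x_right = hierarchical_decompose(residual_block(x_right), level - 1, residual_block)
    return torch.cat([x_left, x_right], dim=1)      # Merge
```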
Step 3-2, extracting information from the feature components with the hierarchical residual self-attention neural network.
The prototype of the hierarchical residual self-attention neural network provided by the invention is the Transformer network, whose architecture is shown in FIG. 3. Specifically, the invention uses a self-attention mechanism: compared with LSTM and GRU, the network has greater potential for mining dependencies between time steps, and the self-attention mechanism emphasizes the global state and better prevents information loss. Considering training time and prediction accuracy, the original Transformer is modified: the feed-forward network in the original Transformer encoder is replaced with a convolutional network with fewer parameters; for the hierarchical structure proposed in this design, more cross-layer residual connections are added to stabilize gradient changes during model training; and the Transformer decoder layer is simplified, its basic structure being replaced with a combination of a fully connected layer and a Gaussian error function. The overall modified framework is shown in FIG. 5.
At each layer, the feature component X_input is input into the model to obtain the time-sequence feature information X_dep with temporal dependencies, expressed as:

X_dep = ResidualAttentionBlock(X_input)
the step 3-2 specifically comprises the following steps:
step 3-2-1, dividing each layer into single time characteristic component XinputInputting the residual error into a multi-head residual error self-attention block to obtain a coded characteristic XemdedThe multi-headed residual self-attention mechanism is expressed as:
ResidualMultiHead(H)=Concat(head1,head2,...headn)Wo
wherein, ResidualMultiHead represents multi-head residual error self-attention layer, H represents the number of attention heads, WoRepresenting weight vectors, i.e. non-linearly transforming the fused feature vectors of the plurality of headers to map to a specified length, head1,head2,...headnRepresenting the output from the attention layer for each head, Concat is a tensor splicing function, and the computation for each head is expressed as follows:
head_i = Softmax(Q_i K_i^T / sqrt(d_k) + Prev_i) V_i

where Q_i, K_i and V_i are obtained by nonlinear conversion after encoding the input data in each head, and Prev_i is the score matrix computed by the multi-head self-attention layer of the previous layer and passed on to the current layer, which keeps performance stable and strong even under a deep network structure. The final fused feature X_attn is obtained from the multiple heads. These variables are expressed as follows:

Q_i = Linear_Q(X_input)
K_i = Linear_K(X_input)
V_i = Linear_V(X_input)
Prev'_i = Q_i K_i^T / sqrt(d_k) + Prev_i
X_attn = Concat(head_1, head_2, ..., head_n) W_o
step 3-2-2: inputting the output characteristics of the multi-head self-attention Layer into a first Layer regularization Layer1 to generate a characteristic vector Xnorm1And generates a copy X thereofnorm2Is mixing Xnorm1Inputting the code vector X into a second layer one-dimensional convolution network to obtain a code vector XconvIs mixing XconvAnd Xnorm2Connecting, generating a coded time characteristic component Z which is transmitted to the next Layer of the self-attention Layer through a second Layer regularization Layer2, and simultaneously calculating a probability matrix Prev in the step 3-2-1iAlso passed to the next layer, the relative expression is as follows:
Xnorm1=NormalizationLayer1(Xattn)
Xnorm2=Xnorm1
Xconv=Dropout(Relu(Conv1d(Xnorm1)))
Z=NormalizationLayer2(Xconv+Xnorm2)
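Steps 3-2-1 and 3-2-2 together form one encoder layer. The PyTorch sketch below is a reconstruction under the equations above (passing the score matrix Prev_i between layers in the style of residual attention); the head dimensions, dropout rate and convolution kernel size are illustrative assumptions:

```python
import math
import torch
import torch.nn as nn

class ResidualAttentionBlock(nn.Module):
    """One encoder layer: multi-head self-attention with the previous layer's
    score matrix added before the softmax, then a 1-D convolutional sublayer."""
    def __init__(self, d_model: int, n_heads: int, dropout: float = 0.1):
        super().__init__()
        assert d_model % n_heads == 0
        self.h, self.d_k = n_heads, d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.o_proj = nn.Linear(d_model, d_model)        # W_o
        self.norm1 = nn.LayerNorm(d_model)               # NormalizationLayer1
        self.norm2 = nn.LayerNorm(d_model)               # NormalizationLayer2
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=1)
        self.drop = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor, prev: torch.Tensor = None):
        b, n, _ = x.shape
        split = lambda t: t.view(b, n, self.h, self.d_k).transpose(1, 2)
        q, k, v = split(self.q_proj(x)), split(self.k_proj(x)), split(self.v_proj(x))
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_k)
        if prev is not None:
            scores = scores + prev                       # residual attention scores
        heads = torch.softmax(scores, dim=-1) @ v
        x_attn = self.o_proj(heads.transpose(1, 2).reshape(b, n, -1))
        x_norm1 = self.norm1(x_attn)                     # X_norm1 (X_norm2 is its copy)
        x_conv = self.drop(torch.relu(self.conv(x_norm1.transpose(1, 2)).transpose(1, 2)))
        z = self.norm2(x_conv + x_norm1)                 # Z, passed to the next layer
        return z, scores                                 # scores serve as Prev_i
```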
step 3-2-3. repeat steps 3-2-1 and 3-2-2, using the same operation in the residual attention unit of each layer stack in the encoder section in the hierarchical residual block.
Step 3-2-4: input the vector Z finally encoded by the encoder into the decoder for decoding. The decoder is adapted from the traditional Transformer structure and suitably simplified; it is expressed as:

Z = GELU(Linear(Dropout(Z)))

where Dropout is a hyperparameter denoting the neuron deactivation rate in the neural network and serves to prevent overfitting, Linear is a simple fully connected transformation, and GELU is the Gaussian Error Linear Unit, which performs well in sequence modeling and has the best overall performance in many scenarios; it is expressed as follows:

GELU(x) = x * Φ(x) ≈ 0.5 x (1 + tanh(sqrt(2/π) (x + 0.044715 x^3)))

After decoding by the decoder, the time component Z carries strong contextual expression capability and is transmitted to the next-layer residual self-attention block.
Step 3-2-5: cycle through steps 3-2-1 to 3-2-4 until the sequence can no longer be divided (i.e., the required number of layers is reached).
Step 3-2-6: time series reconstruction. Through steps 3-2-1 to 3-2-5, the original time component features have been segmented into several time component features of equal length; they are now restored according to the relative positions of the original features. The following are, respectively, the segmentation-and-reconstruction algorithm flows for the odd-even splitting strategy and the binary splitting strategy:
[Algorithms 2 and 3: time series reconstruction under the odd-even and binary splitting strategies; shown as images in the original publication]
compressing the reconstructed sequence in the above way, and taking the compressed sequence value and the real sequence value as the mean square errorA loss function is used to update parameters of the neural network, thereby training the network, setting the compression length as embed _ len, and using X as the compressed vectorembedTo express
Xembed=Embed(XT,embed_len)
Finally, updating model parameters by taking mean-square error (MSE) as a loss function
Figure BDA0003472894810000082
Wherein
Figure BDA0003472894810000083
Is a predicted value, X for the training phaseembedIs represented by YTTrue value, X for training phasetrueTo represent
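A minimal training-step sketch consistent with this description and with claim 1 (Adam optimizer, MSE loss); HierarchicalModel is a trivial stand-in for the full network and all tensor shapes are illustrative:

```python
import torch
import torch.nn as nn

class HierarchicalModel(nn.Module):
    """Placeholder for the hierarchical residual self-attention network."""
    def __init__(self, d_model: int = 64):
        super().__init__()
        self.body = nn.Linear(d_model, d_model)   # stands in for the real blocks

    def forward(self, x):
        return self.body(x)

model = HierarchicalModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.MSELoss()

x_fused = torch.randn(32, 96, 64)   # batch of fused feature sequences
y_true = torch.randn(32, 96, 64)    # aligned ground-truth load values (X_true)

x_embed = model(x_fused)            # compressed prediction (X_embed)
loss = criterion(x_embed, y_true)   # MSE between X_embed and X_true
optimizer.zero_grad()
loss.backward()
optimizer.step()
```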
Step 4, set the prediction step size, perform generative decoding, and predict the load sequence over the next time range. Specifically, assuming the reconstructed sequence feature data to be predicted has already been acquired at this step, a compression length embed_len must again be set, smaller than the length sequence_len of the reconstructed sequence; here the length of the reconstructed sequence defaults to 96 and the compression length to 48. In this way the reconstructed sequence is compressed to the trailing part of the specified length, as illustrated in FIG. 6. The whole process is expressed as follows:

X_embed = Embed(X_T, embed_len)

After obtaining the compressed sequence, the invention performs long-sequence prediction through the proposed generative decoding: setting the prediction length predict_len, an all-zero tensor X_zero with the same dimension as the prediction length is initialized, X_embed and X_zero are horizontally concatenated, and the result is compressed again to length predict_len, generating the load prediction X_pred for the historical sequence:

X_pred = Embed(Concat(X_embed, X_zero), predict_len)
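A sketch of the generative decoding flow with the default lengths above (sequence_len = 96, embed_len = 48, predict_len = 96); the Embed class is a hypothetical stand-in for the model's trained compression operator (which compresses toward the trailing part of the sequence), realized here only as a linear map over the time axis:

```python
import torch
import torch.nn as nn

class Embed(nn.Module):
    """Placeholder for the Embed compression: maps seq_len to a target length."""
    def __init__(self, in_len: int, out_len: int):
        super().__init__()
        self.proj = nn.Linear(in_len, out_len)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_len, features) -> (batch, out_len, features)
        return self.proj(x.transpose(1, 2)).transpose(1, 2)

def generative_decode(x_recon, embed_hist, embed_pred, predict_len=96):
    x_embed = embed_hist(x_recon)                      # compress to embed_len
    x_zero = torch.zeros(x_recon.size(0), predict_len, x_recon.size(2))
    x_cat = torch.cat([x_embed, x_zero], dim=1)        # horizontal splicing
    return embed_pred(x_cat)                           # X_pred

x_recon = torch.randn(8, 96, 64)       # reconstructed sequence features
embed_hist = Embed(96, 48)             # sequence_len 96 -> embed_len 48
embed_pred = Embed(48 + 96, 96)        # spliced length -> predict_len 96
x_pred = generative_decode(x_recon, embed_hist, embed_pred)
```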
The above is a preferred implementation of the present invention; all changes made according to the technique of the present invention whose functional effects do not exceed the scope of the technical solution of the present invention fall within the protection scope of the present invention.

Claims (5)

1. The long-term power load prediction method based on the hierarchical residual self-attention neural network is characterized by comprising the following steps:
step 1, acquiring source data of a unit load sequence and weather data monitored by a sensor from a time sequence database;
step 2, performing data cleaning on the source data, performing feature extraction from the cleaned historical load data and weather data, and respectively extracting four major features of a trend item, a period item, a holiday item and a weather item of load fluctuation;
performing data fusion on the historical load sequence data and the weather characteristic data to obtain a fusion vector for the input of the next neural network modeling;
step 3, encoding the input sequence by using a hierarchical residual self-attention neural network, extracting and mining important features in the input sequence, and performing model training;
step 4, performing generative decoding on the features extracted from the source historical load data to be predicted, and predicting the load sequence in the next time step range;
wherein the overall trend term and period term features are extracted from the original load sequence using a convolutional neural network; the holiday term and weather term features are extracted using a one-hot encoding mode; the source load sequence and all extracted feature data are horizontally spliced following the additive idea and converted through a fully connected layer to obtain the fused time series feature vector;
wherein step 3 adopts a recursive idea: the time series feature vector is hierarchically decomposed by feature downsampling; a residual self-attention network performs feature mining on the time components decomposed at each layer; upon reaching the decomposition depth, the mined features are recombined according to their original relative positions and converted into a prediction result through a one-dimensional convolutional layer; iterating continuously in this manner, model training is performed with the Adam algorithm as the optimization algorithm and the mean squared error between the predicted value and the true value as the loss function;
and wherein step 4 specifically comprises: performing feature conversion on the source load data to be predicted through steps 2 and 3, splicing the converted features with an all-zero vector initialized to the prediction length, generatively decoding the spliced vector through the model trained in step 3, and predicting the load sequence fluctuation over the whole future period.
2. The long-term power load prediction method based on hierarchical residual self-attention neural network according to claim 1, characterized in that: the period term is obtained by subtracting the trend term from the original load sequence.
3. The long-term power load prediction method based on the hierarchical residual self-attention neural network of claim 1, characterized in that: the prototype of the hierarchical residual self-attention neural network is a Transformer network; the feed-forward neural network of the encoder in the original Transformer network is replaced by a convolutional network, additional cross-layer residual connections are added to stabilize gradient changes during model training, and the decoder layer in the Transformer network is simplified and replaced by a combination of a fully connected layer and a Gaussian error function.
4. The long-term power load prediction method based on hierarchical residual self-attention neural network of claim 1, characterized in that: and 4, splicing in the step 4 adopts horizontal splicing and compression.
5. The long-term power load prediction method based on hierarchical residual self-attention neural network of claim 4, wherein: the compressed length is the predicted length.
CN202210048738.XA 2022-01-17 2022-01-17 Long-term power load prediction method based on hierarchical residual self-attention neural network Pending CN114529051A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210048738.XA CN114529051A (en) 2022-01-17 2022-01-17 Long-term power load prediction method based on hierarchical residual self-attention neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210048738.XA CN114529051A (en) 2022-01-17 2022-01-17 Long-term power load prediction method based on hierarchical residual self-attention neural network

Publications (1)

Publication Number Publication Date
CN114529051A true CN114529051A (en) 2022-05-24

Family

ID=81620165

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210048738.XA Pending CN114529051A (en) 2022-01-17 2022-01-17 Long-term power load prediction method based on hierarchical residual self-attention neural network

Country Status (1)

Country Link
CN (1) CN114529051A (en)


Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114707772A (en) * 2022-06-06 2022-07-05 山东大学 Power load prediction method and system based on multi-feature decomposition and fusion
CN114707772B (en) * 2022-06-06 2022-08-23 山东大学 Power load prediction method and system based on multi-feature decomposition and fusion
CN115204529A (en) * 2022-09-15 2022-10-18 之江实验室 Non-invasive load monitoring method and device based on time attention mechanism
CN115204529B (en) * 2022-09-15 2022-12-20 之江实验室 Non-invasive load monitoring method and device based on time attention mechanism
CN115440390A (en) * 2022-11-09 2022-12-06 山东大学 Method, system, equipment and storage medium for predicting number of cases of infectious diseases
CN115440390B (en) * 2022-11-09 2023-03-24 山东大学 Infectious disease case quantity prediction method, system, equipment and storage medium
CN116029201A (en) * 2022-12-23 2023-04-28 浙江苍南仪表集团股份有限公司 Gas flow prediction method and system based on clustering and cyclic neural network
CN116029201B (en) * 2022-12-23 2023-10-27 浙江苍南仪表集团股份有限公司 Gas flow prediction method and system based on clustering and cyclic neural network
CN116776228A (en) * 2023-08-17 2023-09-19 合肥工业大学 Power grid time sequence data decoupling self-supervision pre-training method and system
CN116776228B (en) * 2023-08-17 2023-10-20 合肥工业大学 Power grid time sequence data decoupling self-supervision pre-training method and system
CN117114056A (en) * 2023-10-25 2023-11-24 城云科技(中国)有限公司 Power load prediction model, construction method and device thereof and application
CN117114056B (en) * 2023-10-25 2024-01-09 城云科技(中国)有限公司 Power load prediction model, construction method and device thereof and application

Similar Documents

Publication Publication Date Title
CN114529051A (en) Long-term power load prediction method based on hierarchical residual self-attention neural network
CN113592185B (en) Power load prediction method based on Transformer
CN112364975B (en) Terminal running state prediction method and system based on graph neural network
CN111079989B (en) DWT-PCA-LSTM-based water supply amount prediction device for water supply company
CN115169703A (en) Short-term power load prediction method based on long-term and short-term memory network combination
CN114493014A (en) Multivariate time series prediction method, multivariate time series prediction system, computer product and storage medium
CN115587454A (en) Traffic flow long-term prediction method and system based on improved Transformer model
CN113128113A (en) Poor information building load prediction method based on deep learning and transfer learning
CN114817773A (en) Time sequence prediction system and method based on multi-stage decomposition and fusion
CN114519471A (en) Electric load prediction method based on time sequence data periodicity
CN116702831A (en) Hybrid short-term wind power prediction method considering massive loss of data
CN113360848A (en) Time sequence data prediction method and device
CN117494906B (en) Natural gas daily load prediction method based on multivariate time series
Liao et al. Scenario prediction for power loads using a pixel convolutional neural network and an optimization strategy
CN115713044B (en) Method and device for analyzing residual life of electromechanical equipment under multi-condition switching
WO2024012735A1 (en) Training of a machine learning model for predictive maintenance tasks
CN116911442A (en) Wind power generation amount prediction method based on improved transducer model
Rodriguez et al. Multi-step forecasting strategies for wind speed time series
CN116127325A (en) Method and system for detecting abnormal flow of graph neural network business based on multi-attribute graph
Wang et al. Grid load forecasting based on dual attention BiGRU and DILATE loss function
CN116128082A (en) Highway traffic flow prediction method and electronic equipment
Rathnayaka et al. Specialist vs generalist: A transformer architecture for global forecasting energy time series
CN113780377A (en) Rainfall level prediction method and system based on Internet of things data online learning
Chen et al. Multi-Objective Spiking Neural Network for Optimal Wind Power Prediction Interval
Han et al. Online aware synapse weighted autoencoder for recovering random missing data in wastewater treatment process

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination