CN114154700A - User power consumption prediction method based on transformer model - Google Patents

User power consumption prediction method based on transformer model

Info

Publication number
CN114154700A
CN114154700A
Authority
CN
China
Prior art keywords
variables
data
input
power consumption
attention
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111411790.9A
Other languages
Chinese (zh)
Other versions
CN114154700B (en)
Inventor
王鑫
宗珂
王霖
梁勇杰
闫昆鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT filed Critical Zhejiang University of Technology ZJUT
Priority to CN202111411790.9A priority Critical patent/CN114154700B/en
Publication of CN114154700A publication Critical patent/CN114154700A/en
Application granted granted Critical
Publication of CN114154700B publication Critical patent/CN114154700B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Human Resources & Organizations (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • Primary Health Care (AREA)
  • Water Supply & Treatment (AREA)
  • Public Health (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a power consumption prediction method based on a transformer model, which comprises the following steps: determining characteristic variables for predicting power consumption, the characteristic variables comprising static variables, past known dynamic time-varying variables, and future known dynamic time-invariant variables; adopting a gating mechanism to weight the multiple input variables according to their information contribution, so as to improve the utilization of useful variables; performing feature extraction on the input data with sparse attention; adopting a gated residual network to apply linear or nonlinear processing to the data as appropriate; and constructing a decoder with multi-head sparse attention to predict the power consumption data from the input features. With the transformer-based power consumption prediction method, unreliable data in the training data can be suppressed at the input end and useful information concentrated, the utilization of information can be dynamically adjusted during model training, the training effect of the model is improved, and a better power consumption prediction effect is achieved.

Description

User power consumption prediction method based on transformer model
Technical Field
The invention relates to the technical field of data management and control in the power metering industry, in particular to a user power consumption prediction method based on a transformer model.
Background
With the growing number of power users and the expansion of power grid services, the construction of smart power grids requires efficient and effective application of intelligent technology. Power consumption, as the main electricity-usage information of a user, is an important index in smart grid construction: mastering users' electricity consumption patterns and accurately predicting consumption data makes it possible to plan power construction and support smart-grid decision making.
Commonly used power consumption prediction methods, such as autoregressive or LSTM models, are suited to predicting short-term power consumption, and their practical value is limited. When these methods are used to predict long-term electricity consumption, they suffer from information loss on long time-series data.
The Transformer model is an encoder-decoder structure based on Self-Attention proposed by Google in 2017; it replaces the conventional RNN network structure and can achieve a better prediction effect on long-sequence data. However, power consumption data is characterized by long sequences, many dimensions and large volume, and a traditional Transformer model processing such data often suffers from heavy computation and a bottleneck in information-extraction effect due to the excessively high data dimensionality.
Disclosure of Invention
In order to overcome the defects in the prior art, a method for predicting the power consumption of a user based on a transformer model is provided.
The invention provides a transformer power consumption prediction model based on sparse attention and a gating mechanism. Sparse calculation is applied to the traditional multi-head self-attention: only the top-ranked attention scores are used as effective attention, changing the traditional mode of using global attention. The input layer adopts multiple types of power consumption time-series data, with a separate gating mechanism for each type of data, so that variables in the input data are given different calculation weights according to their contribution; the gating mechanism can also apply the necessary nonlinear processing to the data within the model, so as to make full use of the data information and accurately predict the user's power consumption.
The technical problem to be solved by the invention is realized by adopting the following technical scheme:
a user electricity consumption prediction method based on a transformer model comprises the following steps:
Step (1), input-layer multivariable input: a user's power consumption is often influenced by multiple factors, and using a suitable variety of variables as input and extracting temporal features can improve the accuracy of power consumption prediction. The method adopts multiple types of variables as input and divides the input power consumption time-series data into three types: static variables, past known inputs, and inputs that can be anticipated in the future. The static variables comprise region variables and industry variables, and this part of the data is independent of time; the past known time-series inputs are dynamic time-varying variables, including power usage, load and temperature; the future known time-series inputs are dynamic time-invariant variables, including variables such as weekends and holidays.
Step (2), weight calculation on the input variables with a GRN gating mechanism: the training data set for the power consumption prediction model uses many variables. In theory, richer data variables give the model more comprehensive feature information, but in practice some variables degrade the training effect because of the data quality of the training set. The invention uses a separate, independent gating mechanism for the static, past and future inputs, and computes the flattened vector of all historical input variables at time t:

ρ_t = [ξ_t^{(1)T}, …, ξ_t^{(m)T}]^T
A gated residual module GRN gives the variable data of the model input different weights according to contribution, with the weight calculation shown in the following formula:

v_t = Softmax(GRN_v(ρ_t)) (1)
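As a minimal sketch of this variable-weighting step (with the GRN replaced by a single dense layer for brevity; all names, shapes and the random data are illustrative assumptions, not from the patent):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
m, d = 3, 4                      # 3 input variables, each embedded in 4 dimensions
xi = rng.normal(size=(m, d))     # per-variable embeddings at time t
rho = xi.reshape(-1)             # flattened vector rho_t of all inputs

# Stand-in for GRN_v: a single dense layer mapping rho_t to one score per variable
W = rng.normal(size=(m * d, m))
v = softmax(rho @ W)             # variable selection weights v_t

assert v.shape == (m,)
assert np.isclose(v.sum(), 1.0)  # the weights form a distribution over variables
```

A real implementation would substitute a full gated residual network for the dense layer, but the softmax output would still sum to 1 across the variables.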
Step (3), feature extraction with sparse attention: in a traditional Transformer model, attention is computed in full, i.e., the attention for the current data must be calculated against all data in the input sequence; this traditional approach usually has a large computational dimension, and useless information also takes part in the calculation. The invention builds on the traditional scaled dot-product self-attention: a dot-product operation is performed on the triple input (query, key, value), generating the attention score shown in the following formula:

P(Q, K) = QK^T / √d (2)
based on the assumption that a higher score indicates a higher correlation, we evaluate the value of the score P. Supposing that k scores before ranking are selected as effective scores to obtain queryiAnd keyjThe set P is sorted, the score of k before ranking is preserved, otherwise the score is set to be infinitesimal, as shown below:
Figure BDA0003374398960000033
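The top-k score selection can be sketched as follows (a hedged NumPy illustration; the function name and the example scores are invented for demonstration):

```python
import numpy as np

def topk_mask(scores, k):
    """Keep the top-k scores per row; set the rest to -inf so that
    a subsequent softmax sends them to (approximately) zero."""
    masked = np.full_like(scores, -np.inf)
    # indices of the k largest entries in each row (unordered)
    idx = np.argpartition(scores, -k, axis=-1)[..., -k:]
    np.put_along_axis(masked, idx, np.take_along_axis(scores, idx, axis=-1), axis=-1)
    return masked

scores = np.array([[0.1, 2.0, -1.0, 0.5],
                   [1.5, 0.2,  0.3, 0.0]])
m = topk_mask(scores, k=2)
# row 0 keeps 2.0 and 0.5; row 1 keeps 1.5 and 0.3; everything else is -inf
```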
Step (4), dynamic processing of data information by the gated residual module: to extract the information of a variable it is generally necessary to process the variable nonlinearly, and controlling the degree of nonlinear processing changes the degree to which the variable's information is extracted. The invention constructs a gated residual module GRN from a gated linear unit GTU and layer normalization, and dynamically processes the data information according to the following formulas:

GRN_ω(x) = LayerNorm(x + GTU_ω(θ)) (4)
θ = ELU(xW′_ω + a) (5)
GTU_ω(θ) = tanh(W_{1,ω}θ + b_1) ⊙ σ(W_{2,ω}θ + b_2) (6)
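A toy NumPy rendering of equations (4)-(5) with an assumed GTU gate (the exact GTU parameterization is not spelled out in the text, so the tanh/sigmoid form below is an assumption based on the symbols the description mentions; all parameter shapes are illustrative):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def elu(x):
    return np.where(x > 0, x, np.exp(x) - 1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def grn(x, W, a, W1, b1, W2, b2):
    """Gated residual network: an ELU-transformed input passes through a
    tanh/sigmoid gate, then is added back to x via residual + LayerNorm."""
    theta = elu(x @ W + a)                                     # eq. (5)
    gtu = np.tanh(theta @ W1 + b1) * sigmoid(theta @ W2 + b2)  # assumed GTU gate
    return layer_norm(x + gtu)                                 # eq. (4)

rng = np.random.default_rng(1)
d = 8
x = rng.normal(size=d)
W, W1, W2 = [rng.normal(size=(d, d)) * 0.1 for _ in range(3)]
a = b1 = b2 = np.zeros(d)
y = grn(x, W, a, W1, b1, W2, b2)
assert y.shape == x.shape
```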
Step (5), constructing a three-layer decoder using the gated residual module and sparse attention: the decoder is responsible for calculating the power consumption output value from the extracted static-variable and time-variable features. The invention builds a three-layer decoder structure with a gated residual network and sparse attention: the middle layer uses sparse attention to compute attention over the temporal feature sequence data, while the upper and lower layers use the gated residual network; the upper layer mainly concentrates the information of static data, and the lower layer applies nonlinear processing to the output of the attention layer, simplifying to the model output Φ(t, n):

Φ(t, n) = GRN_φ(D(t, n)) (7)
ψ(t, n) = LayerNorm(Θ(t, n) + Φ(t, n)) (8)
the invention combines sparse attention and a gating mechanism to construct a transformer long-time sequence power consumption data prediction model, and further improves the long-sequence information extraction capability and the calculation speed of the transformer model.
The invention has the advantages and positive effects that:
according to the invention, static variables, past known variables and future known variables are used as model input variables, a gating mechanism is respectively adopted for each type of variable to give weight to the variable according to the variable information contribution degree, time characteristics are extracted through a transformer sparse attention encoder, sparse attention and the gating mechanism are combined to construct a three-layer decoder structure, time series data are decoded, accurate power consumption value is predicted, and the long sequence information extraction capability and the calculation speed of the transformer model are improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart illustrating an implementation of a method for predicting the power consumption of a user based on a transformer model according to an embodiment of the present invention;
FIG. 2 is a diagram of a model structure for predicting the power consumption of a user based on a transformer model according to an embodiment of the present invention;
FIG. 3 is a diagram of a component of a model for predicting the power consumption of a user based on a transformer model according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating user power consumption prediction based on a transformer model according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the disclosure, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments. It is to be understood that the described embodiments are merely exemplary of the invention, not restrictive of its full scope. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
Fig. 1 is a flowchart illustrating an implementation of a method for predicting the power consumption of a user based on a transformer model according to an embodiment of the present invention. The flowchart shows the steps of power usage prediction using the transformer power consumption prediction model. Using the disclosed prediction model, static variables, past known variables and future known variables are used as input; information selection by the gating mechanism and feature extraction by sparse attention improve information correlation, achieving accurate power consumption prediction, and fault data are detected and cleaned according to the predicted power consumption values.
The implementation flow of Fig. 1 comprises the following steps:
step 1: adopting static variables, past known inputs and future presumable inputs, wherein the static variables comprise regional variables and industry variables, and the part of data is independent of time; past known time series inputs, belonging to dynamic time-varying variables, including power usage, load and temperature; the future known time series input belongs to dynamic time-invariant variables, including variables such as weekends, holidays and the like. Multiple levels of input variables may help the model to fully capture temporal features.
Step 2: limited by data quality, not all variables positively influence model training. Each type of variable is sent to a corresponding gating mechanism, which calculates the variable's information contribution and assigns different weights, so that useful data information is retained and low-quality or even invalid data is prevented from entering the network.
Step 3: the information-screened power consumption time-series data enters the encoder for temporal feature extraction. The encoder adopts multi-head sparse attention: the attention at the current moment is calculated from the attention scores of the surrounding moments, and the top-k scored data are selected as effective associated data, so as to further concentrate useful information and reduce attention divergence.
Step 4: the gated residual network dynamically processes the data information. To extract the information of a variable it is generally necessary to process it nonlinearly, and controlling the degree of nonlinear processing changes the degree of information extraction. Building on the gated linear unit, the invention lets the model control the contribution of each input variable through the gated residual network.
Step 5: a three-layer decoder structure is constructed from the gated residual network and sparse attention. The middle layer applies sparse attention to compute attention over the temporal feature sequence data, and the upper and lower layers use the gated residual network: the upper layer mainly concentrates the information of static data, and the lower layer applies nonlinear processing to the output of the attention layer, simplifying the model output.
Fig. 2 is a diagram of the power consumption prediction model structure based on the transformer model according to an embodiment of the present invention; the specific process is as follows:
1. As shown in FIG. 3, static variables, past known time-series variables and future anticipatable variables are used as inputs. The static variables comprise regional variables and industry variables, and this part of the data is independent of time; the past known time-series inputs are dynamic time-varying variables, including power usage, load and temperature; the future known time-series inputs are dynamic time-invariant variables, including variables such as weekends and holidays. The input at time t is denoted

x_t = {x_t^{(1)}, …, x_t^{(m)}}

and the output is the corresponding prediction sequence

ŷ_t = {ŷ_t^{(1)}, …, ŷ_t^{(n)}}

where x_t^{(i)} represents the value of the i-th input variable at time t, and ŷ_t^{(i)} represents the i-th predicted value at time t.
2. All static, past and future inputs use separate, independent gating mechanisms. Let ξ_t^{(i)} denote the i-th input variable at time t, and let

ρ_t = [ξ_t^{(1)T}, …, ξ_t^{(m)T}]^T

be the flattened vector of all historical input variables at time t.
At time t, the data of each variable ξ_t^{(i)} is sent to its own GRN:

ξ̃_t^{(i)} = GRN_i(ξ_t^{(i)}) (9)

where ξ̃_t^{(i)} is the feature vector after variable i is processed; the weights are shared across all time steps. ρ_t is then sent to the GRN gated residual network, which generates the variable selection weights through a Softmax layer:

v_t = Softmax(GRN_v(ρ_t)) (10)

where v_t is the vector of variable selection weights.
The output variable after the gating mechanism at time t is then obtained as:

ξ̃_t = Σ_{i=1}^{m} v_t^{(i)} ξ̃_t^{(i)}
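The weighted combination of the per-variable GRN outputs might look like this (illustrative shapes and random data; the per-variable features and weights are generated here rather than computed by real GRNs):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(2)
m, d = 4, 6
xi_tilde = rng.normal(size=(m, d))   # stand-ins for per-variable GRN outputs
v = softmax(rng.normal(size=m))      # stand-in variable selection weights

# weighted sum over variables: one combined feature vector at time t
xi_combined = (v[:, None] * xi_tilde).sum(axis=0)
assert xi_combined.shape == (d,)
```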
3. Based on the conventional scaled dot-product self-attention, a dot-product operation is performed on the triple input (query, key, value), and the attention score is generated as follows:

P(Q, K) = QK^T / √d (2)

Based on the assumption that a higher score indicates a higher correlation, the value of the score P is evaluated: the top-k scores are selected as effective scores. For each query_i and key_j, the set P is sorted; scores ranked in the top k are preserved, and all other scores are set to negative infinity, as shown below:

P̃(Q, K)_{ij} = P_{ij} if P_{ij} is among the top-k scores of row i, and −∞ otherwise (3)

Since the scores ranked after k are set to negative infinity, the softmax function normalizes them to approximately 0; the normalized attention score is:

A = softmax(P̃(Q, K)) (11)

The output representation C of self-attention can then be calculated as:

C = AV (12)
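Putting the score, mask, softmax and output steps together, a single-head sparse self-attention pass could be sketched as follows (an assumed simplification of the multi-head version; dimensions and data are illustrative):

```python
import numpy as np

def softmax_rows(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sparse_self_attention(Q, K, V, k):
    d = Q.shape[-1]
    P = Q @ K.T / np.sqrt(d)                 # scaled dot-product scores
    masked = np.full_like(P, -np.inf)        # keep only top-k scores per query
    idx = np.argpartition(P, -k, axis=-1)[:, -k:]
    np.put_along_axis(masked, idx, np.take_along_axis(P, idx, axis=-1), axis=-1)
    A = softmax_rows(masked)                 # eq. (11): normalized weights
    return A @ V                             # eq. (12): output C = AV

rng = np.random.default_rng(3)
L, d = 6, 4
Q, K, V = (rng.normal(size=(L, d)) for _ in range(3))
C = sparse_self_attention(Q, K, V, k=2)
assert C.shape == (L, d)
```

Each output row is then a convex combination of only the k value vectors with the highest scores, which is the point of the sparsification.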
4. A gated residual network component is constructed from a gated linear unit GTU and layer normalization, performing nonlinear processing on the data as required to extract useful information and suppress invalid information:

GRN_ω(x) = LayerNorm(x + GTU_ω(θ)) (4)
θ = ELU(xW′_ω + a) (5)
GTU_ω(θ) = tanh(W_{1,ω}θ + b_1) ⊙ σ(W_{2,ω}θ + b_2) (6)

where x is the original input, ELU is the exponential linear unit activation function, θ ∈ R^{d_model} is the intermediate layer, LayerNorm is the layer normalization, and ω is an index denoting shared weights. GTU is the gated linear unit, tanh is the hyperbolic tangent function, σ(·) is the Sigmoid activation function, W_{1,ω}, W_{2,ω} and b_1, b_2 are the weights and biases, ⊙ is the element-wise product, and d_model is the hidden layer size.
5. A three-layer decoder structure is constructed from the gated residual network and sparse attention. The middle layer applies sparse attention to compute attention over the temporal feature sequence data, and the upper and lower layers use the gated residual network: the upper layer mainly concentrates the information of static data, and the lower layer applies nonlinear processing to the output of the attention layer, simplifying the model output.
Specifically, the upper static information processing layer first accepts the static variables and the time variables, where the time variables are the weighted normalization of the encoder output and the gated-selection output: the historical time-variable features are sent to the historical time-variable encoder, and the future time-variable features are sent to the future time-variable encoder. A set of uniform temporal features ζ(t, n) is then generated and used as the decoder input, and the upper static information processing layer is represented as Θ(t, n):

ζ̃(t, n) = LayerNorm(x̃_{t+n} + ζ(t, n)) (13)
Θ(t, n) = GRN_θ(ζ̃(t, n), s) (14)

where s is the static variable, ζ(t, n) represents the encoder variable, x̃_t represents the data after processing by the gating mechanism, and n represents the position.
Sparse attention calculation is then performed over all inputs of the upper static information processing layer:

ζ(t) = [Θ(t, −k), …, Θ(t, τ)]^T (15)

and the attention is calculated as:

D(t) = SparseMultiHead(ζ(t), ζ(t), ζ(t)) (16)
Finally, the GRN applies nonlinear processing to the output of the sparse attention layer, which is weight-normalized with the input of the temporal fusion decoder to obtain the predicted values Φ(t, n):

Φ(t, n) = GRN_φ(D(t, n)) (7)
ψ(t, n) = LayerNorm(Θ(t, n) + Φ(t, n)) (8)
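The three-layer decoder pipeline described above can be sketched end to end (a toy single-head version with the GRNs reduced to gated dense layers; every name, shape and parameter here is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(4)
L, d, k = 8, 4, 3   # sequence length, feature size, top-k

def softmax_rows(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def layer_norm(x, eps=1e-5):
    return (x - x.mean(-1, keepdims=True)) / np.sqrt(x.var(-1, keepdims=True) + eps)

def gated_dense(x, W1, W2):
    """Toy stand-in for a GRN: gated transform + residual + LayerNorm."""
    gate = 1.0 / (1.0 + np.exp(-x @ W2))
    return layer_norm(x + np.tanh(x @ W1) * gate)

def sparse_attn(X, k):
    """Single-head sparse self-attention over the enriched features."""
    P = X @ X.T / np.sqrt(X.shape[-1])
    masked = np.full_like(P, -np.inf)
    idx = np.argpartition(P, -k, axis=-1)[:, -k:]
    np.put_along_axis(masked, idx, np.take_along_axis(P, idx, axis=-1), axis=-1)
    return softmax_rows(masked) @ X

temporal = rng.normal(size=(L, d))           # encoder temporal features
static = rng.normal(size=d)                  # static variable embedding
W = [rng.normal(size=(d, d)) * 0.1 for _ in range(4)]

# Upper layer: enrich the temporal features with static information
enriched = gated_dense(temporal + static, W[0], W[1])
# Middle layer: sparse self-attention over the enriched sequence
attended = sparse_attn(enriched, k)
# Lower layer: GRN on the attention output, residual-normalized with its input
out = layer_norm(enriched + gated_dense(attended, W[2], W[3]))
assert out.shape == (L, d)
```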
Fig. 4 is a schematic diagram of the prediction results on test data after model training is completed in the embodiment of the present invention.
The power consumption model adopts millions of records of nearly 4 years of power consumption data, together with date, duration, location, industry, temperature and other data, as training data, and divides the data set in a 6:2:2 ratio: 60% as the training set, and 20% each as the validation set and the test set. To avoid uneven distribution of user data, the training, validation and test sets of the experiment all contain all users, divided into proportional intervals by the number of days from the starting date. The power consumption prediction model is trained for multiple rounds, and the loss error, the fitting curve of real versus predicted power consumption, the prediction accuracy, and a box plot are selected as evaluation criteria for the prediction effect. The loss error is the average difference between the real power consumption and the predicted value; the accuracy is the ratio of correct predictions to all predicted data; and the box plot shows the distribution of the error data by minimum, lower quartile, median, upper quartile and maximum. As can be seen from Fig. 4, the predicted and actual power consumption curves coincide closely, and the prediction accuracy is 91.1%, indicating high prediction accuracy; the error curve shows that the error fluctuates stably between 0 and 0.4 with a small average error, indicating that the model's prediction effect is stable and a good data-cleaning effect is achieved.
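The chronological 6:2:2 split by days from the start date might be implemented as follows (illustrative; the record fields `user` and `day` are assumptions, not from the patent):

```python
def split_by_days(records, start_day, end_day):
    """Split records chronologically 60/20/20 by day offset from the start
    date, so that every user appears in train, validation and test."""
    span = end_day - start_day
    t1 = start_day + int(span * 0.6)   # end of the training window
    t2 = start_day + int(span * 0.8)   # end of the validation window
    train = [r for r in records if r["day"] < t1]
    val   = [r for r in records if t1 <= r["day"] < t2]
    test  = [r for r in records if r["day"] >= t2]
    return train, val, test

# two users, ten days of records each
records = [{"user": u, "day": d} for u in ("a", "b") for d in range(10)]
train, val, test = split_by_days(records, 0, 10)
# -> 12 train records (days 0-5), 4 val (days 6-7), 4 test (days 8-9)
```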
The embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts among the embodiments are referred to each other. The device, the apparatus and the computer-readable storage medium disclosed in the embodiments correspond to the method disclosed in the embodiments, so that the description is simple, and the relevant points can be referred to the description of the method.
The principle and the implementation of the present invention are explained by applying specific examples, and the above description of the embodiments is only used to help understanding the technical solution and the core idea of the present invention. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.

Claims (1)

1. A user electricity consumption prediction method based on a transformer model is characterized by comprising the following steps:
Step (1), input-layer multivariable input, specifically comprising: adopting multiple types of variables as input at the input layer and dividing the input power consumption time-series data into three types: static variables, past known inputs, and inputs that can be anticipated in the future, wherein the static variables comprise region variables and industry variables and this part of the data is independent of time; the past known time-series inputs are dynamic time-varying variables, including power usage, load and temperature; and the future known time-series inputs are dynamic time-invariant variables, including variables such as weekends and holidays.
Step (2), weight calculation on the input variables with the GRN gating mechanism, specifically comprising: using separate, independent gating mechanisms for the static, past and future inputs, and computing the flattened vector of all historical input variables at time t:

ρ_t = [ξ_t^{(1)T}, …, ξ_t^{(m)T}]^T
a gated residual module GRN gives the variable data of the model input different weights according to contribution, with the weight calculation shown in the following formula:

v_t = Softmax(GRN_v(ρ_t)) (1)
Step (3), feature extraction with sparse attention, specifically comprising: based on the traditional scaled dot-product self-attention, performing the dot-product operation on the triple input (query, key, value) and generating the attention score as follows:

P(Q, K) = QK^T / √d (2)
based on the assumption that a higher score indicates a higher correlation, we evaluate the value of the score P. Supposing that k scores before ranking are selected as effective scores to obtain queryiAnd keyjThe set P is sorted, the score of k before ranking is preserved, otherwise the score is set to be infinitesimal, as shown below:
Figure FDA0003374398950000013
Step (4), dynamic processing of data information by the gated residual module, specifically comprising: in order to extract variable information, constructing a gated residual module GRN from a gated linear unit GTU and layer normalization, and dynamically processing the data information by the following formulas:

GRN_ω(x) = LayerNorm(x + GTU_ω(θ)) (4)
θ = ELU(xW′_ω + a) (5)
GTU_ω(θ) = tanh(W_{1,ω}θ + b_1) ⊙ σ(W_{2,ω}θ + b_2) (6)
step (5) constructing a three-layer decoder by using a gated residual module and sparse attention, and specifically comprising: the method comprises the following steps of constructing a three-layer decoder structure by using a gated residual error network and sparse attention, calculating time characteristic sequence data attention by using sparse attention in a middle layer, using the gated residual error network in an upper layer and a lower layer, carrying out information concentration on static data in the upper layer, carrying out nonlinear processing on the output of the attention layer by using the lower layer, and simplifying to obtain model outputs phi to (t, n):
Φ(t,n)=GRNφ(D(t,n)) (7)
D(t, n) = SparseAttention(Q(t, n), K(t, n), V(t, n))  (8)
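The three-layer decoder flow (GRN → sparse attention → GRN) can be sketched end to end; the parameter-free `grn_like` stand-in, the choice of k, and self-attention over a single sequence are simplifying assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    return (x - mu) / (x.std(axis=-1, keepdims=True) + eps)

def grn_like(x):
    """Parameter-free stand-in for a GRN layer: gated tanh + residual + LayerNorm."""
    gate = 1.0 / (1.0 + np.exp(-x))
    return layer_norm(x + np.tanh(x) * gate)

def sparse_attention(X, k):
    """Self-attention over X, keeping only the top-k scores per query."""
    P = X @ X.T / np.sqrt(X.shape[-1])
    masked = np.full_like(P, -np.inf)
    top = np.argsort(P, axis=-1)[:, -k:]
    rows = np.arange(P.shape[0])[:, None]
    masked[rows, top] = P[rows, top]
    return softmax(masked, axis=-1) @ X

def decoder(X, k=3):
    upper = grn_like(X)                   # upper layer: condense (static) information
    middle = sparse_attention(upper, k)   # middle layer: sparse attention over the time sequence
    return grn_like(middle)               # lower layer: nonlinear processing, as in Eq. (7)

rng = np.random.default_rng(4)
X = rng.normal(size=(12, 6))              # 12 time steps, feature dim 6
Phi = decoder(X)
```

Each layer preserves the sequence shape, so the decoder output Φ can be read off per time step for the final consumption prediction.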
CN202111411790.9A 2021-11-25 2021-11-25 User electricity consumption prediction method based on transformer model Active CN114154700B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111411790.9A CN114154700B (en) 2021-11-25 2021-11-25 User electricity consumption prediction method based on transformer model

Publications (2)

Publication Number Publication Date
CN114154700A true CN114154700A (en) 2022-03-08
CN114154700B CN114154700B (en) 2024-05-03

Family

ID=80457466

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111411790.9A Active CN114154700B (en) 2021-11-25 2021-11-25 User electricity consumption prediction method based on transformer model

Country Status (1)

Country Link
CN (1) CN114154700B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115049169A (en) * 2022-08-16 2022-09-13 国网湖北省电力有限公司信息通信公司 Regional power consumption prediction method, system and medium based on combination of frequency domain and spatial domain
CN115456144A (en) * 2022-08-25 2022-12-09 湖南大学 Prediction model training method and device
CN115860281A (en) * 2023-02-27 2023-03-28 之江实验室 Energy system multi-entity load prediction method and device based on cross-entity attention

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010279160A (en) * 2009-05-28 2010-12-09 Chugoku Electric Power Co Inc:The Power-load adjusting system, power-load adjusting device, watthour meter, and power-load adjusting method
CN110705692A (en) * 2019-09-25 2020-01-17 中南大学 Method for predicting product quality of industrial nonlinear dynamic process by long-short term memory network based on space and time attention
CN111598357A (en) * 2020-05-29 2020-08-28 江苏蔚能科技有限公司 Monthly power consumption prediction method based on capacity utilization hours and Gaussian distribution
US20210049228A1 (en) * 2014-09-22 2021-02-18 Sureshchandra B. Patel Methods of Patel Loadflow Computation for Electrical Power System



Similar Documents

Publication Publication Date Title
CN114154700A (en) User power consumption prediction method based on transformer model
CN108876054B (en) Short-term power load prediction method based on improved genetic algorithm optimization extreme learning machine
CN111861013B (en) Power load prediction method and device
CN117421687A (en) Method for monitoring running state of digital power ring main unit
CN114006370B (en) Power system transient stability analysis and evaluation method and system
CN115081717A (en) Rail transit passenger flow prediction method integrating attention mechanism and graph neural network
Zhou et al. An empirical analysis of carbon emission price in China
CN111191113B (en) Data resource demand prediction and adjustment method based on edge computing environment
CN115861671A (en) Double-layer self-adaptive clustering method considering load characteristics and adjustable potential
CN112016839A (en) Flood disaster prediction and early warning method based on QR-BC-ELM
CN115759458A (en) Load prediction method based on comprehensive energy data processing and multi-task deep learning
Samin-Al-Wasee et al. Time-series forecasting of ethereum price using long short-term memory (lstm) networks
Wu et al. A novel hybrid genetic algorithm and simulated annealing for feature selection and kernel optimization in support vector regression
CN114444811A (en) Aluminum electrolysis mixing data superheat degree prediction method based on attention mechanism
CN116599860B (en) Network traffic gray prediction method based on reinforcement learning
CN116937565A (en) Distributed photovoltaic power generation power prediction method, system, equipment and medium
CN116756575A (en) Non-invasive load decomposition method based on BGAIN-DD network
Cui et al. Short-time series load forecasting by seq2seq-lstm model
Xu et al. Short-term load forecasting of power system based on genetic algorithm improved BP neural network algorithm
Tan et al. Application of self-supervised learning in non-intrusive load monitoring
Huang et al. Application of Steady State Data Compressed Sensing Based on LSTM and RNN in Rural Power Grid
Li et al. Online Attention Enhanced Differential and Decomposed LSTM for Time Series Prediction
Chen et al. Short Term Power Load Combination Forecasting Method Based on Feature Extraction
Xu et al. Evaluation method of line loss in station area based on feature selection and GRU network
Chen et al. Endpoint Temperature Prediction of Molten Steel in VD Furnace Based on AdaBoost. RT-ELM

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant