CN113779879A - Medium-and-long-term electricity utilization abnormity detection method based on LSTM-seq2seq-attention model - Google Patents

Medium-and-long-term electricity utilization abnormity detection method based on LSTM-seq2seq-attention model

Info

Publication number
CN113779879A
Authority
CN
China
Prior art keywords
data
lstm
seq2seq
electricity
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111039397.1A
Other languages
Chinese (zh)
Inventor
丁转莲
朱一鸣
吴雨
胡炜鑫
孙登第
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University filed Critical Anhui University
Priority to CN202111039397.1A priority Critical patent/CN113779879A/en
Publication of CN113779879A publication Critical patent/CN113779879A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2113/00 Details relating to the application field
    • G06F2113/04 Power grid distribution networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00 Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/02 Reliability analysis or reliability optimisation; Failure analysis, e.g. worst case scenario performance, failure mode and effects analysis [FMEA]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for detecting medium-and-long-term electricity utilization anomalies based on an LSTM-seq2seq-attention model. The method comprises a data collection step, a data preprocessing step, a neural network model construction step, a neural network model training step, an economic data estimation step, a step of calculating a comprehensive electricity utilization anomaly index d, and an electricity utilization anomaly judgment step. Based on historical data, the electricity utilization behavior of different users can be analyzed in combination with influencing factors such as GDP, climate and holidays. Using the LSTM-seq2seq-attention neural network, user data can be analyzed quickly and effectively, suspicious users can be detected, and electricity theft can be prevented. The method has the advantages of speed, high accuracy and good robustness.

Description

Medium-and-long-term electricity utilization abnormity detection method based on LSTM-seq2seq-attention model
Technical Field
The invention relates to the technical field of distance measurement, in particular to a method for detecting medium-and-long-term electricity utilization abnormity based on an LSTM-seq2seq-attention model.
Background
With the development of society, electric energy has become an indispensable source of energy in production and daily life and a foundation of modern economic development. With the continuous development of the electric power industry in China, electricity supply has also increased substantially. However, electricity theft and leakage persist in everyday electricity use. They not only waste national resources but also create many safety hazards, threaten the personal safety of residents, and have become an important factor hindering social and economic development.
At present, electricity theft techniques keep emerging, the means of theft are becoming specialized and technologically sophisticated, and a complete industrial chain even exists, which makes electricity theft increasingly difficult to prevent. Current detection relies mainly on manual on-site inspection, hardware equipment that protects against electromagnetic interference, and real-time software monitoring systems. First, manual inspection consumes considerable manpower and material resources, involves high labor intensity and workload, and is prone to missed inspections. Second, most anti-theft equipment and devices on the market are costly and inconvenient to move. Third, software detection suffers from missed and false judgments. These detection methods and devices also greatly increase investment and operating costs, so their cost-effectiveness is low.
Current research focuses mainly on the performance of abnormal electricity data detection. With the rapid growth in the number of power consumers and electrical devices, the dimensionality and volume of user electricity data have also grown rapidly, which degrades the performance of conventional electricity data anomaly detection algorithms. Chinese invention patent application No. 201910389132.0 discloses a method in which, when training an electricity data anomaly detection model, an LSTM network is applied to historical electricity data to analyze its data association information and reduce its dimensionality. The detection model is then trained so that it suits the temporal correlation and high-dimensional characteristics of electricity data, and is applied to the electricity data to be detected to obtain a detection result. However, that method suffers from mutual influence between different kinds of data and low estimation accuracy.
Disclosure of Invention
The invention aims to solve the technical problem of providing a medium-and-long-term electricity utilization abnormity detection method based on an LSTM-seq2seq-attention model, which has good robustness and high estimation accuracy.
In order to solve the technical problems, the invention adopts the following technical scheme.
A medium-and-long-term electricity utilization anomaly detection method based on an LSTM-seq2seq-attention model comprises the following steps:
Step 1: a data collection step; collecting electricity utilization data for a preset time period;
Step 2: a data preprocessing step; performing data cleaning, missing value filling and normalization on the collected electricity utilization data;
Step 3: a neural network model construction step; constructing a multilayer LSTM-seq2seq-attention neural network with the LSTM neural network as the neuron;
Step 4: a neural network model training step; training the LSTM-seq2seq-attention neural network of step 3 with the electricity utilization data preprocessed in step 2 to obtain power data under normal electricity utilization;
Step 5: an economic data estimation step; taking the power data under normal electricity utilization from step 4 as input, and calculating an estimated value of the economic data under normal electricity utilization by principal component analysis;
Step 6: a step of calculating the comprehensive electricity utilization anomaly index d; calculating the index d from the estimated value of the economic data under normal electricity utilization from step 5;
Step 7: an electricity utilization anomaly judgment step; judging whether electricity utilization is abnormal by comparing the comprehensive electricity utilization anomaly index d with a preset threshold σ.
In step 1, the time period is the 48 months preceding the detection month.
The electricity utilization data comprises electricity load data, economic data (GDP) and meteorological data.
The meteorological data comprises rainfall, air temperature, humidity, wind speed and air pressure.
Step 1 further comprises collecting the electricity utilization data of the detection month.
In step 3, the multilayer LSTM-seq2seq-attention neural network comprises an encoder and a decoder, and an attention mechanism is introduced.
In the step 4, parameters of the model are optimized by adopting an Adam optimization algorithm in the training process.
In step 6, the comprehensive electricity utilization anomaly index d of the user is calculated using formula (11):
$d = \frac{|h - s|}{h} \times 100\%$ (11)
In formula (11), h is the estimated monthly average GDP and s is the monthly average value to be detected, namely the enterprise's monthly average economic data (GDP) for the detection month.
In step 7, by comparing the threshold σ with the comprehensive electricity utilization anomaly index d, it is judged whether the user shows no suspicion of electricity theft, is suspected of electricity theft, or is a suspected user for whom an alarm is required.
The invention has the beneficial effects that:
the invention discloses a method for detecting medium-and-long-term electricity utilization abnormity based on an LSTM-seq2seq-attention model, which comprises a data collection step, a data preprocessing step, a neural network model construction step, a neural network model training step, an economic data estimation step, an electricity utilization abnormity comprehensive index d calculation step and an electricity utilization abnormity judgment step. According to historical data, the electricity utilization behavior characteristics of different users can be analyzed by combining the influence factors including GDP, climate, holidays and the like. By using the LSTM-seq2seq-attention neural network, the user data can be quickly and effectively analyzed, the suspicious user can be detected, and electricity larceny prevention can be implemented.
In recent years, the electricity utilization information acquisition system is gradually applied, and power enterprises have abundant historical user data. According to historical data, the electricity utilization behavior characteristics of different users can be analyzed by combining influence factors including economic data GDP, meteorological data, holidays and the like. By using the LSTM-seq2seq-attention neural network, the user data can be quickly and effectively analyzed, the suspicious user can be detected, and electricity larceny prevention can be implemented.
The invention uses a seq2seq structure with the LSTM as the neural unit and adds an attention mechanism so that the network weights are better distributed. The Adam optimization algorithm is selected to optimize the model parameters, which improves computational efficiency, and the encoder of the seq2seq structure uses a multilayer LSTM to enhance the robustness and estimation accuracy of the model. In addition, principal component analysis is used to eliminate the mutual influence among evaluation indexes, which reduces the workload and the computational overhead of the algorithm.
Electricity utilization anomaly detection is performed with a dual model that combines seq2seq-attention and principal component analysis, which improves detection accuracy and robustness. Compared with the original method, the neural network model can judge electricity theft more quickly and accurately.
The method for detecting medium-and-long-term electricity utilization anomalies based on the LSTM-seq2seq-attention model has the advantages of speed, high accuracy and good robustness.
Drawings
FIG. 1 is a flow chart of a method for detecting medium-and long-term electricity utilization anomaly based on an LSTM-seq2seq-attention model.
FIG. 2 is an LSTM structure diagram of the method for detecting the medium-and long-term electricity consumption abnormality based on the LSTM-seq2seq-attention model.
FIG. 3 is a structural diagram of an LSTM-seq2seq-attention neural network of the method for detecting the medium-and long-term electricity utilization anomaly based on the LSTM-seq2seq-attention model.
FIG. 4 is the original power load data of the method for detecting the medium-and long-term power consumption abnormality based on the LSTM-seq2seq-attention model.
FIG. 5 is the power load data after the data preprocessing of the method for detecting the medium-and long-term electricity utilization anomaly based on the LSTM-seq2seq-attention model.
FIG. 6 shows the influence factors of the raw data of the method for detecting the medium-and long-term power consumption abnormality based on the LSTM-seq2seq-attention model.
FIG. 7 is influence factor data after data preprocessing of the method for detecting the medium-and long-term electricity utilization anomaly based on the LSTM-seq2seq-attention model.
FIG. 8 is the power load estimation value of the method for detecting the medium-and long-term power consumption abnormality based on the LSTM-seq2seq-attention model.
FIG. 9 is a comparison graph of the GDP to-be-detected value and the estimated value of the medium-and-long-term electricity utilization anomaly detection method based on the LSTM-seq2seq-attention model.
Detailed Description
The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings, so that the advantages and features of the invention can be more readily understood by those skilled in the art and the scope of protection of the invention is defined more clearly.
Referring to fig. 1-9, the method for detecting the medium-and long-term electricity consumption anomaly based on the LSTM-seq2seq-attention model of the present invention includes the following steps:
step 1: a data collection step; collecting power utilization data in a preset time period according to the preset time period;
To judge whether an enterprise's electricity utilization behavior is abnormal in the detection month, electricity utilization data for the 48 months preceding the detection month must be collected. The electricity utilization data comprises electricity load data, economic data (GDP) and meteorological data. The meteorological data comprises rainfall, air temperature, humidity, wind speed and air pressure. The electricity load data, economic data (GDP), rainfall, air temperature, humidity, wind speed, average air pressure and so on of the detection month are also collected.
Step 2: a data preprocessing step; carrying out data cleaning, missing value supplementing and normalization processing on the collected power utilization data;
The collected data may contain missing values or redundant, disordered records, either of which introduces errors into the final estimation and analysis results. The data therefore needs to be cleaned, missing values filled, dimensional differences eliminated through data formatting, and finally normalized. In the data preprocessing process, data cleaning is performed first, deleting duplicate and incomplete data from the data set. Missing values in the historical data are then filled using a sliding average window method. Finally, Min-max normalization is applied to the filled data set.
Step 3: a neural network model construction step; constructing a multilayer LSTM-seq2seq-attention neural network with the LSTM neural network as the neuron;
When constructing the seq2seq-attention neural network, the LSTM neural network is used as the neuron to build a multilayer seq2seq neural network (LSTM-seq2seq-attention), an attention mechanism is added, and the Mish activation function is used as the output-layer activation function of the whole network, which reduces gradient explosion and improves the stability of model training.
Step 4: a neural network model training step; training the LSTM-seq2seq-attention neural network of step 3 with the electricity utilization data preprocessed in step 2 to obtain power data under normal electricity utilization;
The LSTM-seq2seq-attention neural network is trained on the normalized training set, and the normalized set to be detected is then fed into the trained model for estimation, yielding power data under normal electricity utilization.
The electricity utilization data collected for the 48 months preceding the detection month is used as training data, the electricity utilization data collected for the detection month is used as the data to be detected, and both are normalized to the range [0, 1] using the Min-max method.
Step 5: an economic data estimation step; taking the power data under normal electricity utilization from step 4 as input, and calculating an estimated value of the economic data under normal electricity utilization by principal component analysis;
Step 6: a step of calculating the comprehensive electricity utilization anomaly index d; calculating the index d from the estimated value of the economic data under normal electricity utilization from step 5;
The power data under normal electricity utilization obtained in step 4 is taken as input, and the estimated value of the economic data under normal electricity utilization is calculated by principal component analysis.
Step 7: an electricity utilization anomaly judgment step; judging whether electricity utilization is abnormal by comparing the comprehensive electricity utilization anomaly index d with a preset threshold σ.
A threshold σ is set, the difference between the economic data to be detected and the estimated economic data is computed, and the threshold is compared with this difference to judge abnormal electricity utilization behavior. When abnormal behavior is detected, its degree is judged by comparing the set threshold with the comprehensive electricity utilization anomaly index d obtained from the final GDP estimate and the value to be detected.
In step 1, the time period is the 48 months preceding the detection month.
Step 1 further comprises collecting the electricity utilization data of the detection month.
In step 3, the multilayer LSTM-seq2seq-attention neural network comprises an encoder and a decoder, and an attention mechanism is introduced.
The LSTM-seq2seq-attention neural network consists mainly of an encoder and a decoder, with an attention mechanism introduced. The encoder consists of a multilayer LSTM (Long Short-Term Memory) network that encodes the input data and outputs the encoded state. The attention mechanism sits between the encoder and the decoder. The decoder consists of a single-layer LSTM; the output of the attention mechanism is concatenated, as a context vector, with the output of the encoder to form the input of the decoder, and the output value of each step serves as the input value of the next step.
In the step 4, parameters of the model are optimized by adopting an Adam optimization algorithm in the training process.
In step 6, the comprehensive electricity utilization anomaly index d of the user is calculated using formula (11):
$d = \frac{|h - s|}{h} \times 100\%$ (11)
In formula (11), h is the estimated monthly average GDP and s is the monthly average value to be detected, namely the enterprise's monthly average economic data (GDP) for the detection month.
In step 7, by comparing the threshold σ with the comprehensive electricity utilization anomaly index d, it is judged whether the user shows no suspicion of electricity theft, is suspected of electricity theft, or is a suspected user for whom an alarm is required.
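As an illustration, formula (11) and the comparison in step 7 can be sketched in Python as follows. The two numeric thresholds and the way they split the three states are assumptions made for this example only; the description states that d is compared with a threshold σ but gives no numeric values.

```python
def anomaly_index(h: float, s: float) -> float:
    """Comprehensive anomaly index d of formula (11), in percent.

    h: estimated monthly average GDP, s: monthly average value to be detected.
    """
    if h == 0:
        raise ValueError("estimated value h must be non-zero")
    return abs(h - s) / h * 100.0


def judge(d: float, sigma_suspect: float = 10.0, sigma_alarm: float = 30.0) -> str:
    """Map d to one of the three states; both thresholds are hypothetical."""
    if d < sigma_suspect:
        return "no suspicion of electricity theft"
    if d < sigma_alarm:
        return "suspected electricity theft"
    return "suspected user, alarm required"


# Example: the estimate is 100 and the value to be detected is 80.
d = anomaly_index(h=100.0, s=80.0)
state = judge(d)
```

With the hypothetical thresholds above, a 20% deviation falls into the middle state.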
The invention relates to a medium-and-long-term electricity utilization abnormity detection method based on an LSTM-seq2seq-attention model, which mainly comprises the following 7 steps.
Step 1: a data collection step; collecting power utilization data in a preset time period according to the preset time period;
To judge whether an enterprise exhibits abnormal electricity utilization behavior in a detection month, electricity utilization data for the 48 months preceding the detection month must be collected on a daily basis. The daily electricity utilization data comprises daily electricity load data, daily economic data (GDP) and daily meteorological data. The daily meteorological data comprises daily average rainfall, average air temperature, average humidity, average wind speed, average air pressure and the like.
The daily electricity load data of the enterprise is obtained from the regional power supply company serving the enterprise; the daily economic data (GDP) of the enterprise is obtained from the statistics bureau of the region where the enterprise is located; and the daily average rainfall, air temperature, humidity, wind speed and air pressure at the enterprise's location are obtained from the regional meteorological office. The daily electricity utilization data of the detection month must also be collected, comprising the daily electricity load data, economic data (GDP), average rainfall, average air temperature, average humidity, average wind speed and average air pressure.
Step 2: a data preprocessing step; carrying out data cleaning, missing value supplementing and normalization processing on the collected power utilization data;
First, data cleaning is performed, deleting duplicate and incomplete data from the data set. Missing values in the historical data are then filled using a sliding average window method: if the data at position i of a list a is missing, the average of the window data before and after it is used as the interpolated value.
For example: list a = [1, 2, 4, 2, None, 6, 3, 2, 1], where None is the missing data and the window is chosen to be 3;
the data for the None position is (2 + 4 + 2 + 6 + 3 + 2) / 6 ≈ 3.17. That is, the interpolated value is the average of the 6 data items consisting of the 3 preceding and the 3 following items.
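The sliding average window fill can be sketched in Python as below. The function name and the handling of edge cases (fewer than `window` neighbours, or other missing values inside the window) are assumptions of this sketch, since the description only covers the simple case shown in the example.

```python
from typing import List, Optional


def fill_missing(values: List[Optional[float]], window: int = 3) -> List[float]:
    """Replace each None with the mean of the known values found in the
    `window` positions before it and the `window` positions after it."""
    filled = []
    for i, v in enumerate(values):
        if v is not None:
            filled.append(float(v))
            continue
        before = [x for x in values[max(0, i - window):i] if x is not None]
        after = [x for x in values[i + 1:i + 1 + window] if x is not None]
        neighbours = before + after
        if not neighbours:
            raise ValueError(f"no known neighbours around position {i}")
        filled.append(sum(neighbours) / len(neighbours))
    return filled


a = [1, 2, 4, 2, None, 6, 3, 2, 1]
filled_a = fill_missing(a, window=3)
```

For the list in the example, the missing entry is replaced by the mean of 2, 4, 2, 6, 3 and 2, which is 19/6.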
The input data differ in dimension, with large differences in magnitude and unit of measurement. To eliminate the influence of these differences on model training and estimation, the input data is normalized using the Min-max method, which accelerates model convergence and improves model accuracy. The normalized range is [0, 1], and the expression is given by formula (1).
$y_i = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}}$ (1)
In formula (1), $x_{\min}$ is the minimum value in the x sequence, $x_{\max}$ is the maximum value in the x sequence, $x_i$ is a value in the x sequence $x_1, \dots, x_n$, and $y_i$ is the corresponding value in the newly generated sequence $y_1, \dots, y_n$.
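A minimal Python sketch of the Min-max normalization is given below. Returning zeros for a constant sequence is an assumption of this sketch, because the description does not say how that case is handled.

```python
from typing import List, Sequence


def min_max_normalize(xs: Sequence[float]) -> List[float]:
    """Scale a sequence linearly so that its minimum maps to 0 and its maximum to 1."""
    x_min, x_max = min(xs), max(xs)
    if x_max == x_min:
        # Assumption: a constant sequence carries no scale information.
        return [0.0 for _ in xs]
    return [(x - x_min) / (x_max - x_min) for x in xs]


normalized = min_max_normalize([2.0, 4.0, 6.0])
```

Each input series (load, GDP, temperature and so on) would be normalized separately, so that series with different units become comparable.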
And taking the acquired data of 48 months before the detection month as training data, and taking the collected data of the current month of the detection month as data to be detected.
The input sequence format is defined as $x_l = \{D_l, T_l, R_l, G_l, S_l, F_l, P_l\}$,
where $D_l$ is the daily electricity load data, $T_l$ is the daily average temperature, $R_l$ is the daily average rainfall, $G_l$ is the daily GDP output value of the company, $S_l$ is the daily average humidity, $F_l$ is the daily average wind speed, and $P_l$ is the daily average air pressure.
The electricity load data and its influencing factors are each preprocessed according to the above steps. The electricity load data before and after preprocessing is shown in FIG. 4 and FIG. 5, and the influencing factors before and after preprocessing are shown in FIG. 6 and FIG. 7. The comparison shows that preprocessing removes numerical outliers and improves the quality of the historical data, which prepares for a more accurate estimate.
Step 3: a neural network model construction step; constructing a multilayer LSTM-seq2seq-attention neural network with the LSTM neural network as the neuron;
For day 1, the GDP historical data $G_1$, electricity load historical data $D_1$, daily average temperature historical data $T_1$, daily average rainfall historical data $R_1$, daily average humidity historical data $S_1$, daily average wind speed historical data $F_1$ and daily average air pressure historical data $P_1$ form the input to layer 1 of the LSTM-seq2seq-attention neural network, $x_1 = \{D_1, T_1, R_1, G_1, S_1, F_1, P_1\}$. The hidden state at the current time is determined jointly by the hidden state of the previous time and the input of the current time, i.e. $h_1 = f(h_0, x_1)$. This continues up to day l, for which the GDP historical data $G_l$, electricity load historical data $D_l$, daily average temperature historical data $T_l$, daily average rainfall historical data $R_l$, daily average humidity historical data $S_l$, daily average wind speed historical data $F_l$ and daily average air pressure historical data $P_l$ form the input $x_l = \{D_l, T_l, R_l, G_l, S_l, F_l, P_l\}$ to the l-th layer of the network, and the hidden state is $h_l = f(h_{l-1}, x_l)$. Here h is the hidden state, a vector representing the hidden-layer state in the LSTM. The function f determines how much of the previous time's state is carried over, for example 100% or 50%, and can be chosen as needed.
The multilayer LSTM-seq2seq-attention neural network comprises an encoder part and a decoder part.
The encoder part obtains the output of each hidden layer and summarizes them to generate a semantic vector C: $C = q(h_1, h_2, \dots, h_l)$, where q is a constant that controls the scale of the summarized hidden-layer outputs. Its value can be chosen as needed so that the calculation is convenient and meets the order-of-magnitude requirements.
Given the semantic vector C and the output sequence $\{y_1, \dots, y_{t-1}\}$ generated so far, the decoder part estimates the next output $y_t$, namely $y_t = g(\{y_1, \dots, y_{t-1}\}, C)$, where g() represents a non-linear activation function and $y_t$ represents the output corresponding to the input x.
The multi-layer LSTM-seq2seq-attention neural network also introduces a mechanism of attention.
The encoder part obtains the hidden vectors $h_1, h_2, \dots, h_l$ and sums them by weight to generate the semantic vector $c_i$ for the i-th output, where $c_i = \sum_{j=1}^{l} \alpha_{ij} h_j$ and $\alpha_{ij}$ is a weight value.
In the decoder part, the conditional probability is defined, and the weight value $\alpha_{ij}$ is determined jointly by the (i-1)-th output hidden state $s_{i-1}$ and each hidden state in the input, i.e.
Figure BDA0003248519110000086
$e_{ij} = a(s_{i-1}, h_j)$, where $e_{ij}$ is the degree of influence of the encoder hidden-layer state $h_j$ at time j on the decoder hidden-layer state $s_i$ at time i. The softmax function $\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{l} \exp(e_{ik})}$ normalizes the degree of influence $e_{ij}$ into the probability $\alpha_{ij}$. The higher the weight value $\alpha_{ij}$, the more attention the i-th output allocates to the j-th input, and the more the j-th input influences the generation of the i-th output. From this, the hidden state of the next layer of the decoder is calculated
Figure BDA0003248519110000088
(hidden state at decoder i), and output of that location
Figure BDA0003248519110000089
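The attention step described above (scores, softmax weights, weighted sum of encoder states) can be sketched in plain Python as follows. The description does not specify the alignment function a, so a dot product is used here purely as an assumption; the function names are likewise hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Normalize scores into weights that sum to 1 (shifted for numerical stability)."""
    m = max(scores)
    exps = [math.exp(e - m) for e in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(
    s_prev: Sequence[float], encoder_states: Sequence[Sequence[float]]
) -> Tuple[List[float], List[float]]:
    """Return the attention weights and the context vector for one decoder step.

    s_prev is the previous decoder hidden state; encoder_states holds h_1..h_l.
    """
    # Assumption: the alignment score is a dot product between s_prev and h_j.
    scores = [sum(a * b for a, b in zip(s_prev, h_j)) for h_j in encoder_states]
    alphas = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(alpha * h_j[k] for alpha, h_j in zip(alphas, encoder_states))
        for k in range(dim)
    ]
    return alphas, context


alphas, context = attention_context([1.0, 0.0], [[1.0, 2.0], [1.0, 2.0]])
```

When all encoder states are identical, as in this toy call, the weights are uniform and the context vector equals that shared state.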
The invention also adopts a principal component analysis method to process data.
Step 4: a neural network model training step; training the LSTM-seq2seq-attention neural network of step 3 with the electricity utilization data preprocessed in step 2 to obtain power data under normal electricity utilization;
The model is trained on the electricity load data of a company in California from January 2016 to June 30, 2020, together with the data of the factors influencing the electricity load.
A seq2seq model based on the LSTM is selected, the seq2seq structure consisting of an encoder and a decoder. The encoder can receive data flexibly, and in the decoder the output of the previous step can be fed in as the input of the next step. These properties allow the temporal relationships in the data to be learned better. An attention mechanism is added to optimize the weight distribution, and the strong data mining capability of the LSTM is used to address the detection of abnormal electricity utilization behavior. The network structure of the LSTM-seq2seq-attention neural network is shown in FIG. 2, and the relevant parameters of the network are expressed as follows.
Output $h_t$: $h_t = o_t * \tanh(c_t)$;
Candidate state $\tilde{c}_t$: $\tilde{c}_t = \tanh(W_c * h_{t-1} + W_c * x_t + b_c)$;
Input gate $i_t$: $i_t = \sigma(W_i * c_{t-1} + W_i * h_{t-1} + W_i * x_t + b_i)$;
Forget gate $f_t$: $f_t = \sigma(W_f * c_{t-1} + W_f * h_{t-1} + W_f * x_t + b_f)$;
Cell state $c_t$: $c_t = f_t * c_{t-1} + i_t * \tilde{c}_t$;
Output gate $o_t$: $o_t = \sigma(W_o * c_{t-1} + W_o * h_{t-1} + W_o * x_t + b_o)$.
The inputs to the LSTM-seq2seq-attention neural network are $c_{t-1}$, $x_t$ and $h_{t-1}$, and its outputs are $c_t$ and $h_t$.
Forget door ft: and selectively forgetting information in the cell state of the last step, and realizing forgetting gate through a sigmoid layer. H of the above stept-1And x of this steptAs input, then ct-1Each digit in the array outputs a value between 0 and 1, denoted as ftIt indicates how much information is retained, 1 indicates complete retention, and 0 indicates complete discard.
Input gate $i_t$: determines what new information is selectively recorded in the cell state. The sigmoid layer (the input gate layer) decides which values are to be updated; this value is denoted $i_t$. The tanh layer creates a vector of candidate values $\tilde{c}_t$ that will be added to the cell state. Here $W$ denotes a weight matrix.
Output gate $o_t$: a sigmoid layer (the output gate layer) determines which parts of the cell state $c_t$ are output. The cell state is then passed through a tanh layer (giving values between -1 and 1) and multiplied by the output of the sigmoid layer to obtain the final output $h_t$.
Here $b_o$, $b_c$, $b_i$, and $b_f$ are the bias parameters of the corresponding gates. $\sigma$ is the sigmoid function, which can be understood as a soft threshold.
Sigmoid function expression:
$\sigma(x) = \frac{1}{1 + e^{-x}}$
tanh function expression:
$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$
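The gate computations described above can be sketched for a single time step as follows. This is a minimal NumPy illustration rather than the implementation of the invention: the function names `lstm_step` and `init_params`, the parameter dictionary layout, and the use of one separate weight matrix per input term ($c_{t-1}$, $h_{t-1}$, $x_t$) are assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    """Logistic function 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + np.exp(-x))

def init_params(n_in, n_hidden, seed=0):
    """Hypothetical parameter layout: one weight matrix per gate and input term."""
    rng = np.random.default_rng(seed)
    p = {}
    for g in ("i", "f", "o"):
        p["Wc_" + g] = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
        p["Wh_" + g] = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
        p["Wx_" + g] = rng.normal(0.0, 0.1, (n_hidden, n_in))
        p["b_" + g] = np.zeros(n_hidden)
    p["Wh_c"] = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
    p["Wx_c"] = rng.normal(0.0, 0.1, (n_hidden, n_in))
    p["b_c"] = np.zeros(n_hidden)
    return p

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step; each gate sees c_{t-1}, h_{t-1}, and x_t."""
    def gate(name):
        return sigmoid(p["Wc_" + name] @ c_prev
                       + p["Wh_" + name] @ h_prev
                       + p["Wx_" + name] @ x_t
                       + p["b_" + name])

    i_t = gate("i")                       # input gate
    f_t = gate("f")                       # forget gate
    o_t = gate("o")                       # output gate
    c_tilde = np.tanh(p["Wh_c"] @ h_prev + p["Wx_c"] @ x_t + p["b_c"])
    c_t = f_t * c_prev + i_t * c_tilde    # new cell state
    h_t = o_t * np.tanh(c_t)              # new hidden output
    return h_t, c_t
```

With six input features (load plus five meteorological factors, for instance) and a hidden size of four, `lstm_step(np.ones(6), np.zeros(4), np.zeros(4), init_params(6, 4))` returns the pair $(h_t, c_t)$, each of length four.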
Finally, the Mish activation function is used in the output layer of the model. Because this activation function is very smooth, it further helps to avoid vanishing and exploding gradients.
The Mish function is given by formula (4).
$\mathrm{Mish}(x) = x \cdot \tanh(\ln(1 + e^{x}))$ (4)
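Formula (4) can be evaluated directly, as in the short NumPy sketch below. The function name `mish` is chosen for the example, and `np.logaddexp` is used only as a numerically stable way of computing $\ln(1 + e^{x})$ for large inputs.

```python
import numpy as np

def mish(x):
    """Mish(x) = x * tanh(ln(1 + e^x))."""
    x = np.asarray(x, dtype=float)
    softplus = np.logaddexp(0.0, x)  # ln(1 + e^x) without overflow
    return x * np.tanh(softplus)
```

For large positive inputs the function approaches the identity, and for large negative inputs it approaches zero, which is the smooth behavior referred to above.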
The objective function to be minimized is:
Figure BDA0003248519110000103
The model parameters are optimized with the Adam optimization algorithm, as expressed in formulas (5) to (10).
$g_t = \nabla_{\theta} J(\theta_{t-1})$ (5)
$m_t = \beta_1 \cdot m_{t-1} + (1 - \beta_1) \cdot g_t$ (6)
$v_t = \beta_2 \cdot v_{t-1} + (1 - \beta_2) \cdot g_t^2$ (7)
$\hat{m}_t = \frac{m_t}{1 - \beta_1^t}$ (8)
$\hat{v}_t = \frac{v_t}{1 - \beta_2^t}$ (9)
$\theta_t = \theta_{t-1} - \delta \cdot \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}$ (10)
Here $J$ is the objective function and $g_t$ is its gradient, $\delta$ is the learning rate (step size), $\alpha_{ij}$ is a weight value, $\beta_1$ is the exponential decay rate of the first-moment estimate, $\beta_2$ is the exponential decay rate of the second-moment estimate, $m_t$ is the biased first-moment estimate, $v_t$ is the biased second-moment estimate, $\hat{m}_t$ is the bias-corrected first-moment estimate, $\hat{v}_t$ is the bias-corrected second-moment estimate, $\epsilon$ is a small constant that prevents division by zero, $\theta_t$ is the parameter being updated, and $t$ is the time step.
Formulas (6) and (7) are the first-moment and second-moment estimates of the gradient, respectively, and can be regarded as estimates of the expectations $E[g_t]$ and $E[g_t^2]$. Formulas (8) and (9) are bias corrections of the first- and second-moment estimates, which makes them approximately unbiased estimates of these expectations.
The Adam optimization algorithm is computationally efficient, has low memory requirements, and is well suited to large-scale data and parameter optimization. Compared with classical stochastic gradient descent, it updates the network weights more effectively.
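A single Adam update can be sketched as follows. This is an illustrative NumPy version under stated assumptions: the function name `adam_step`, the default hyperparameter values, the stabilizing constant `eps`, and the toy objective $\theta^2$ are not taken from the description; the argument `lr` plays the role of the learning rate $\delta$.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step counter."""
    m = beta1 * m + (1.0 - beta1) * grad        # biased first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad ** 2   # biased second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1.0 - beta2 ** t)              # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage (assumed objective): minimize theta^2, whose gradient is 2 * theta.
theta = np.array([1.0])
m = np.zeros(1)
v = np.zeros(1)
for t in range(1, 501):
    theta, m, v = adam_step(theta, 2.0 * theta, m, v, t, lr=0.05)
```

Because of the bias correction, the first update moves each parameter by approximately the learning rate regardless of the gradient scale, which is one reason the method is less sensitive to gradient magnitude than plain stochastic gradient descent.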
The structure of the LSTM-seq2seq-attention neural network is shown in FIG. 3.
As shown in FIG. 3, the LSTM-seq2seq-attention neural network consists of an encoder and a decoder and introduces an attention mechanism.
The encoder consists of multiple LSTM layers; it encodes the input data and outputs the encoded state. The attention mechanism sits between the encoder and the decoder.
The decoder consists of a single LSTM layer. The output of the attention mechanism is used as a context vector and concatenated with the output of the encoder to form the input of the decoder, and the output value of each step is used as the input value of the next step.
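How a context vector may be formed between encoder and decoder can be sketched as below. The description does not state the attention scoring function, so dot-product scoring is an assumption of this example, as are the names `softmax` and `attention_context` and the way the decoder input is concatenated in the usage lines.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def attention_context(enc_states, dec_state):
    """Return (context, weights) for encoder states of shape (T, H).

    Scoring is a plain dot product (assumed, not specified in the source).
    """
    scores = enc_states @ dec_state      # one score per encoder step, shape (T,)
    weights = softmax(scores)            # attention weights, sum to 1
    context = weights @ enc_states       # weighted sum of states, shape (H,)
    return context, weights

# Usage: three encoder steps with hidden size 3 (toy values).
enc_states = np.eye(3)
dec_state = np.array([10.0, 0.0, 0.0])
context, weights = attention_context(enc_states, dec_state)
# One reading of the text: concatenate the context with the encoder output.
decoder_input = np.concatenate([context, enc_states[-1]])
```

The weights form a distribution over the encoder time steps, so the decoder input at each step emphasizes the parts of the input sequence most relevant to the current prediction.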
The LSTM-seq2seq-attention neural network is trained on the normalized training set, and the normalized data to be detected are then fed into the trained LSTM-seq2seq-attention neural network model for estimation, yielding power data under normal electricity-use conditions.
Step 5: economic data estimation. Taking the power data under normal electricity-use conditions from step 4 as input, an estimated value of the economic data under normal electricity-use conditions is calculated by principal component analysis.
The GDP data $G_0$, daily average temperature data $T_0$, daily average rainfall data $R_0$, daily average humidity data $S_0$, daily average wind speed data $F_0$, and daily average air pressure data $P_0$ are acquired for each day of the month to be detected. The data to be detected are fed into the trained LSTM-seq2seq-attention neural network to obtain the power load estimation data of the energy storage system for the current month
Figure BDA0003248519110000111
The test-month data are those of a company in California from July 1, 2020 to July 31, 2020. The estimated power load values are shown in FIG. 8.
The obtained power estimation data for the current month
Figure BDA0003248519110000112
are combined with the contemporaneous air temperature, rainfall, humidity, wind speed, and air pressure data, and principal component analysis is performed to obtain the GDP value for the same period.
Six indices affecting GDP are determined: electricity load data, air temperature, rainfall, humidity, wind speed, and air pressure. The index values are collected for $l$ months; with the 6 indices of each month denoted $a_1, a_2, a_3, a_4, a_5, a_6$, a matrix of order $l \times 6$ is obtained. The original variable indices are denoted $a_1, a_2, a_3, a_4, a_5, a_6$ and their composite indices (new variable indices) are $b_1, b_2, \ldots, b_l$; each new index is a linear combination of the original indices $a_1, a_2, a_3, a_4, a_5, a_6$.
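The observed sample matrix is
$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{16} \\ a_{21} & a_{22} & \cdots & a_{26} \\ \vdots & \vdots & & \vdots \\ a_{l1} & a_{l2} & \cdots & a_{l6} \end{pmatrix}$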
After standardization, the sample matrix A is expressed as formula (2) below. Each index mentioned above, for example the power load data $a_1$, appears in the two-dimensional array as $a_{11}$ to $a_{l1}$ (that is, the first column of A).
Figure BDA0003248519110000122
The correlation coefficient matrix R of the samples is calculated as R = corrcoef(x), where the correlation coefficient matrix R is
Figure BDA0003248519110000123
$r_{ij}$ ($i, j = 1, 2, \ldots, 6$) is the correlation coefficient of the original variables $a_i$ and $a_j$, calculated by formula (3).
$r_{ij} = \frac{\sum_{k=1}^{l} (a_{ki} - \bar{a}_i)(a_{kj} - \bar{a}_j)}{\sqrt{\sum_{k=1}^{l} (a_{ki} - \bar{a}_i)^2 \sum_{k=1}^{l} (a_{kj} - \bar{a}_j)^2}}$ (3)
In formula (3), $\bar{a}_i$ and $\bar{a}_j$ are the averages of the i-th and j-th indices.
For the correlation coefficient matrix R, the characteristic equation is solved by the Jacobi method to obtain 6 non-negative eigenvalues $\lambda_1$ to $\lambda_6$, ordered as $\lambda_1 > \lambda_2 > \lambda_3 > \lambda_4 > \lambda_5 > \lambda_6 > 0$.
Three principal components are selected: if the ratio of the variance of the first 3 principal components to the total variance is close to 1, the first 3 factors are taken as the 1st, 2nd, and 3rd principal components. The number of factors is thereby reduced from 6 to 3, which serves to screen the factors. Components are selected such that the cumulative contribution rate V exceeds 85%, where
$V = \frac{\sum_{k=1}^{3} \lambda_k}{\sum_{k=1}^{6} \lambda_k}$
The load estimation error with principal component analysis is generally smaller than the error without it. Reducing the number of influencing factors from 6 to 3 lowers the number of factors to be computed while retaining the original information, which reduces the amount of calculation and improves estimation accuracy.
Principal component analysis is carried out, and the 3 components with the highest degree of influence, namely air temperature, rainfall, and wind speed, are selected for the calculation to obtain the contemporaneous GDP values. The estimated GDP values are then compared graphically with the detected values. FIG. 9 compares the GDP values to be detected with the estimated values.
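The principal component selection described above can be sketched as follows. This is an illustration under assumptions, not the procedure of the invention: the function name `pca_from_correlation` is hypothetical, z-score standardization is assumed for formula (2), and `numpy.linalg.eigh` is used in place of the Jacobi method named in the text (both return the eigenvalues of a symmetric matrix).

```python
import numpy as np

def pca_from_correlation(A, threshold=0.85):
    """PCA on an (l x 6) sample matrix via its correlation matrix.

    Returns eigenvalues (descending), cumulative contribution rates,
    the number k of retained components, and the component scores.
    """
    A = np.asarray(A, dtype=float)
    # Standardize each index (column); z-score is an assumption here.
    X = (A - A.mean(axis=0)) / A.std(axis=0, ddof=1)
    R = np.corrcoef(X, rowvar=False)          # correlation matrix of the indices
    eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()
    # Smallest k whose cumulative contribution reaches the threshold.
    k = int(np.argmax(ratio >= threshold)) + 1
    scores = X @ eigvecs[:, :k]               # composite indices per sample
    return eigvals, ratio, k, scores
```

Calling the function on a matrix whose columns are load, air temperature, rainfall, humidity, wind speed, and air pressure returns the number of components needed to exceed the 85% contribution rate; a downstream regression from the retained scores to GDP is not shown because the description does not specify it.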
Step 6: calculating a comprehensive index d of the power utilization abnormality; calculating a comprehensive index d of abnormal power utilization according to the estimated value of the economic data under the condition of normal power utilization in the step 5;
Step 7: abnormal electricity-use judgment. Whether the electricity use is abnormal is judged by comparing a preset threshold σ with the comprehensive index d of electricity-use abnormality.
To detect abnormal electricity use, the user's current and historical data must be acquired. Historical economic data for a time period s, starting from the current monitoring time, are obtained from the statistical office of the selected region.
The comprehensive detection value d of the user's abnormal electricity use is calculated using formula (11).
d=|h-s|/h*100% (11)
In formula (11), h is the estimated monthly average GDP and s is the monthly average value to be detected, namely the enterprise's monthly average economic data (GDP) for the current month.
A threshold σ is set, and the specific decision rules are as follows.
If d < σ, the user is not suspected of electricity theft, and the electricity, GDP, air temperature, rainfall, humidity, wind speed, and air pressure data of the currently monitored month are added to the historical data, overwriting the earlier data for the same period so that the historical data remain accurate.
If σ < d < 2σ, the user is suspected of electricity theft, or the deviation is caused by a sudden incident such as a power equipment failure; the suspicious event is output and a preliminary report is made.
If d > 2σ, the user is output as a suspicious user and an alarm is raised.
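Formula (11) and the three decision rules can be sketched as below. The function names, the returned labels, and the numeric values in the usage lines are hypothetical. The first case is read here as "no suspicion", since its data are merged into the historical record, and the handling of the boundary cases d = σ and d = 2σ is an assumption because the rules leave them unspecified.

```python
def anomaly_index(h, s):
    """Comprehensive index d = |h - s| / h * 100 (percent), per formula (11)."""
    if h <= 0:
        raise ValueError("estimated monthly average GDP h must be positive")
    return abs(h - s) / h * 100.0

def classify(d, sigma):
    """Map d to one of three outcomes using the threshold sigma."""
    if d < sigma:
        return "normal"        # no suspicion; month is merged into history
    if d < 2 * sigma:
        return "suspicious"    # output the event and report preliminarily
    return "alarm"             # suspicious user, raise an alarm

# Usage with made-up numbers: estimate h = 100, detected s = 97, sigma = 5 (%).
d = anomaly_index(100.0, 97.0)
label = classify(d, 5.0)
```

Keeping the index computation separate from the classification makes it straightforward to tune σ on historical months without touching the estimation pipeline.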
In most cases, the model can serve as a method for judging whether electricity is being stolen, but it cannot serve as the basis for a definitive finding. Its main purpose is to help power supply enterprises monitor electricity theft by enterprises, improving monitoring efficiency, reducing workload and detection cost, and making theft detection more rigorous. When a suspicious user appears, follow-up investigation is needed to determine whether special circumstances exist, for example special holidays or external events beyond the user's control.
In the medium-and-long-term electricity-use abnormality detection method based on the LSTM-seq2seq-attention model, a prediction model is built on the seq2seq network structure, and LSTM neural network units form the encoder and decoder of the seq2seq structure, which strengthens the model's ability to learn the temporal ordering of the data. An attention mechanism is added to the model to optimize the network weight configuration. To improve prediction accuracy and better complete the prediction task, several external influencing factors are taken into account, which enhances the stability and accuracy of the prediction model.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although this description is organized by embodiments, not every embodiment contains only a single independent technical solution; the description is written this way for clarity only. Those skilled in the art should treat the description as a whole, and the technical solutions in the embodiments may be combined as appropriate to form other embodiments understandable to those skilled in the art.

Claims (9)

1. A middle-long term electricity utilization abnormity detection method based on an LSTM-seq2seq-attention model is characterized by comprising the following steps:
step 1: a data collection step; collecting power utilization data in a preset time period according to the preset time period;
step 2: a data preprocessing step; carrying out data cleaning, missing value supplementing and normalization processing on the collected power utilization data;
step 3: a neural network model construction step; constructing a multilayer LSTM-seq2seq-attention neural network by taking the LSTM neural network as a neuron;
step 4: a neural network model training step; training the LSTM-seq2seq-attention neural network in the step 3 by using the power utilization data preprocessed in the step 2 to obtain power data under the normal power utilization condition;
step 5: an economic data estimation step; taking the power data under the normal power utilization condition in the step 4 as input, and calculating and obtaining an estimated value of economic data under the normal power utilization condition through a principal component analysis method;
step 6: calculating a comprehensive index d of the power utilization abnormality; calculating a comprehensive index d of abnormal power utilization according to the estimated value of the economic data under the condition of normal power utilization in the step 5;
step 7: an abnormal electricity utilization judgment step; judging whether the electricity utilization is abnormal by comparing a preset threshold σ with the comprehensive index d of the electricity utilization abnormality.
2. The LSTM-seq2seq-attention model-based medium-and-long-term electricity utilization abnormity detection method according to claim 1, wherein in the step 1, the time period is the 48 months preceding the month to be detected.
3. The LSTM-seq2seq-attention model-based medium-and-long term electricity consumption anomaly detection method according to claim 1, wherein the electricity consumption data comprises electricity consumption load data, economic data GDP and meteorological data.
4. The LSTM-seq2seq-attention model-based medium and long term electricity anomaly detection method according to claim 3, wherein said meteorological data comprises rainfall, air temperature, humidity data, wind speed, air pressure.
5. The LSTM-seq2seq-attention model-based method for detecting the electricity consumption abnormality in the medium and long term according to claim 1, wherein the step 1 further comprises collecting electricity consumption data for detecting the current month.
6. The LSTM-seq2seq-attention model-based medium-and-long-term electricity utilization abnormity detection method according to claim 1, wherein in the step 3, the multilayer LSTM-seq2seq-attention neural network comprises an encoder and a decoder, and an attention mechanism is introduced.
7. The LSTM-seq2seq-attention model-based medium-and-long-term electricity utilization abnormity detection method according to claim 1, wherein in the step 4, the parameters of the model are optimized by using an Adam optimization algorithm in the training process.
8. The LSTM-seq2seq-attention model-based medium-and-long term electricity consumption anomaly detection method according to claim 1, wherein in the step 6, a user abnormal electricity consumption anomaly comprehensive detection value d is calculated by adopting a formula (11);
d=|h-s|/h*100% (11)
in the formula (11), h is the estimated value of the monthly average GDP, s is the monthly average value to be detected, and the value to be detected is the monthly average economic data GDP of the enterprise in the current month.
9. The LSTM-seq2seq-attention model-based medium-and-long-term electricity consumption abnormality detection method according to claim 1, wherein in the step 7, by comparing a threshold value σ with the electricity consumption abnormality comprehensive index d, it is judged whether the user is not suspected of electricity stealing, is suspected of electricity stealing, or is a suspicious user for whom an alarm is required.
CN202111039397.1A 2021-09-06 2021-09-06 Medium-and-long-term electricity utilization abnormity detection method based on LSTM-seq2seq-attention model Pending CN113779879A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111039397.1A CN113779879A (en) 2021-09-06 2021-09-06 Medium-and-long-term electricity utilization abnormity detection method based on LSTM-seq2seq-attention model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111039397.1A CN113779879A (en) 2021-09-06 2021-09-06 Medium-and-long-term electricity utilization abnormity detection method based on LSTM-seq2seq-attention model

Publications (1)

Publication Number Publication Date
CN113779879A true CN113779879A (en) 2021-12-10

Family

ID=78841082

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111039397.1A Pending CN113779879A (en) 2021-09-06 2021-09-06 Medium-and-long-term electricity utilization abnormity detection method based on LSTM-seq2seq-attention model

Country Status (1)

Country Link
CN (1) CN113779879A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114401135A (en) * 2022-01-14 2022-04-26 国网河北省电力有限公司电力科学研究院 Internal threat detection method based on LSTM-Attention user and entity behavior analysis technology
CN114418071A (en) * 2022-01-24 2022-04-29 中国光大银行股份有限公司 Cyclic neural network training method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190081476A1 (en) * 2017-09-12 2019-03-14 Sas Institute Inc. Electric power grid supply and load prediction
CN112163689A (en) * 2020-08-18 2021-01-01 国网浙江省电力有限公司绍兴供电公司 Short-term load quantile probability prediction method based on depth Attention-LSTM
CN112288137A (en) * 2020-10-09 2021-01-29 国网电力科学研究院有限公司 LSTM short-term load prediction method and device considering electricity price and Attention mechanism
CN112308402A (en) * 2020-10-29 2021-02-02 复旦大学 Power time series data abnormity detection method based on long and short term memory network
CN113139605A (en) * 2021-04-27 2021-07-20 武汉理工大学 Power load prediction method based on principal component analysis and LSTM neural network
CN117689229A (en) * 2023-12-14 2024-03-12 国网北京市电力公司 GDP data prediction method and prediction device

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
XIN WU,ET AL: "Electricity Consumption and Weather Reflect Macro-Economic Status", 《2019 IEEE PES INNOVATIVE SMART GRID TECHNOLOGIES ASIA》 *
ZHIFENG LIN,ET AL: "Electricity Consumption Prediction Based on LSTM with Attention Mechanism", 《IEEJ TRANSACTIONS ON ELECTRICAL AND ELECTRONIC ENGINEERING》, vol. 15, no. 4 *
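丁柏宏: "Research on short-term load forecasting in a smart grid environment", 《China Master's Theses Full-text Database (Engineering Science and Technology II)》, no. 07 *
周克男;刘进波;: "Power load prediction using a BP neural network based on principal component analysis", Mathematics Learning and Research, no. 23 *
陈素玲;姚建刚;龚磊;: "Medium- and long-term load forecasting based on weighted partial least squares regression", Power Demand Side Management, no. 01 *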

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination