CN112733417B - Abnormal load data detection and correction method and system based on model optimization - Google Patents


Info

Publication number
CN112733417B
Authority
CN
China
Prior art keywords
data
abnormal
load data
model
load
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011278587.4A
Other languages
Chinese (zh)
Other versions
CN112733417A (en
Inventor
邓松
蔡清媛
岳东
李前亮
袁玲玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202011278587.4A priority Critical patent/CN112733417B/en
Publication of CN112733417A publication Critical patent/CN112733417A/en
Application granted granted Critical
Publication of CN112733417B publication Critical patent/CN112733417B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/215Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06Electricity, gas or water supply
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02EREDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E40/00Technologies for an efficient electrical power generation, transmission or distribution
    • Y02E40/70Smart grids as climate change mitigation technology in the energy generation sector
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Abstract

The invention relates to a model-optimization-based method and system for detecting and correcting abnormal load data. The system comprises a load data preprocessor, an abnormal load data detector and an abnormal load data corrector; the load data preprocessor is connected to the abnormal load data detector, which in turn is connected to the abnormal load data corrector. The method is used to process abnormal power grid load data and can accurately detect abnormal data in the power load, which supports accurate load prediction, planned power consumption management and reasonable power supply construction planning, and helps improve the economic and social benefits of the power system.

Description

Abnormal load data detection and correction method and system based on model optimization
Technical Field
The invention belongs to the technical field of power system data mining, and in particular relates to an abnormal load data detection and correction method based on an improved support vector data description (SVDD) and a deep long short-term memory (LSTM) network, mainly used for detecting and correcting abnormal load data in the power field.
Background
To meet ever-increasing energy demand, building safe, reliable, environmentally friendly and efficient power networks has become a current research hotspot. The smart grid concept offers a good solution for building new power grids. At the same time, smart grid development has driven the construction of automated grid information platforms: the volume of data of all types transmitted and collected by power system equipment is growing exponentially, and the scale, type and structure of load data have changed greatly. During actual grid operation, random factors such as system faults, faulty measuring devices, data transmission errors, sudden weather changes, line maintenance and emergencies inevitably mix hard-to-detect abnormal data into the collected load data. The quality of power load data has a decisive influence on load prediction accuracy and grid operating stability. Abnormal data seriously degrade both the construction of load prediction models and their accuracy, so that the predicted load pattern loses its value as guidance for power production and dispatch, and may even affect the safe and stable operation of the grid.
Detecting and correcting abnormal data in power load data with an effective and accurate method ensures the accuracy and integrity of the load data and hence the accuracy of load prediction. This makes it possible to estimate future energy trends and power usage accurately and to predict future load fluctuations, which greatly helps power system management departments manage power usage scientifically and effectively, reduce resource waste, lower generation costs, optimize the allocation of power resources in the grid, and draw up economical and reasonable generation plans.
Traditional abnormal load detection methods mainly include expert experience methods, state estimation methods and curve similarity detection. With the development of data mining technology, intelligent algorithms such as neural networks, density analysis and cluster analysis have been applied to abnormal power load detection, but these methods suffer from difficult initial parameter selection and low detection accuracy.
Against this background, the invention uses historical power load data as training samples, optimizes the parameters of the SVDD algorithm with gene expression programming (GEP), detects abnormal load data with the resulting SVDD model, then predicts the load with a deep long short-term memory (LSTM) network and uses the predicted load value as the replacement for the abnormal data. The invention studies in depth the detection and correction of abnormal load data in power systems and improves the efficiency and accuracy of abnormal load detection, which helps provide power users with stable and reliable power while improving the economic benefits of power enterprises.
The abnormal load data detection and correction method and system based on model optimization mainly need to address two problems: (1) how to detect abnormal load data with the improved SVDD algorithm; (2) how to predict the load with a deep LSTM network and replace abnormal load data with the predicted values.
Disclosure of Invention
The invention aims to provide an abnormal load data detection and correction method based on model optimization to specifically solve the problem of abnormal load data of a power grid.
In order to achieve the purpose, the invention is realized by the following technical scheme:
The invention relates to an abnormal load data detection and correction method and system based on model optimization, wherein the detection and correction method comprises the following steps:
Step one: import all load data of the electricity users and preprocess it: records with few missing values are filled by mean filling, and records with a large amount of missing data are deleted directly; go to step two;
Step two: apply max-min normalization to the historical load data, divide the training set, test set and validation set for the abnormal load data detection model and the abnormal data correction model respectively, and go to step three;
Step three: initialize the population and go to step four;
Step four: compute C and σ, train the SVDD model with them, perform k-fold cross-validation, and go to step five;
Step five: compute the fitness of each individual and go to step six;
Step six: retain the best individual and go to step seven;
Step seven: check whether the termination condition is met; if so, go to step nine, otherwise go to step eight;
Step eight: perform genetic operations such as replication, selection and mutation to generate the next-generation population, and return to step four to continue the loop;
Step nine: output the optimal parameter combination of the SVDD and go to step ten;
Step ten: establish the SVDD abnormal data detection model using the optimal parameter combination of C and σ, and go to step eleven;
Step eleven: compute the distance r from the sample under test to the center of the hypersphere, and go to step twelve;
Step twelve: if the distance r is larger than the radius R of the hypersphere, the sample is abnormal load data; go to step thirteen;
Step thirteen: process the preprocessed time series data with a sliding window to obtain m load sample sets of length l, of which 70% of the historical load data form the training set, 20% the validation set and 10% the test set; go to step fourteen;
Step fourteen: iteratively train the LSTM load prediction model; in each training period, select training data from the training data set and input it into the deep LSTM network for training; go to step fifteen;
Step fifteen: adjust the model parameters: evaluate the prediction error of the load prediction model on the test set and, if the accuracy requirement is not met, adjust the model parameters using the validation set; go to step sixteen;
Step sixteen: input the relevant data preceding the abnormal data into the trained LSTM model for prediction, output the predicted value of the load, and go to step seventeen;
Step seventeen: in the abnormal data correction part, replace the abnormal load value with the predicted load value, and end.
The invention relates to an abnormal load data detection and correction method and system based on model optimization, wherein the system comprises:
a load data preprocessor: treating all load data as a whole, it first fills or deletes missing data, then normalizes the load data, and finally divides the sample set into a training set, a validation set and a test set;
the specific method is as follows: (1) Data cleaning and normalization: for data cleaning, records with few missing values are filled with the mean of the adjacent points before and after the gap, and records with a large amount of missing data are deleted directly. To avoid the influence of large differences in magnitude or inconsistent variable dimensions within the data set, the data must be normalized. Max-min normalization maps the raw data to the interval [0, 1] with the transfer function
$$X^{*} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$$
where $X^{*}$ is the normalized data, $X$ is the original data, and $X_{\min}$ and $X_{\max}$ are the minimum and maximum values of the original data set.
(2) Dividing the sample set: before the abnormal load data detection model and the abnormal data correction model are constructed, the sample data set is divided into a training set, a validation set and a test set in the ratio 7:2:1. The training set is used to train the model parameters, the validation set to tune them, and the test set to evaluate the model.
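The preprocessing described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function names, the `max_missing_ratio` threshold of 0.2 that decides when a record counts as having "a large amount" of missing data, and the choice to split in time order are all assumptions made for the example.

```python
from typing import List, Optional, Tuple


def fill_missing(series: List[Optional[float]],
                 max_missing_ratio: float = 0.2) -> Optional[List[float]]:
    """Fill gaps with the mean of the nearest valid neighbours.

    Returns None when the share of missing points exceeds max_missing_ratio
    (an assumed threshold), which stands for "delete this record".
    """
    missing = sum(v is None for v in series)
    if not series or missing / len(series) > max_missing_ratio:
        return None
    filled = list(series)
    for i, v in enumerate(series):
        if v is None:
            prev = next((series[j] for j in range(i - 1, -1, -1)
                         if series[j] is not None), None)
            nxt = next((series[j] for j in range(i + 1, len(series))
                        if series[j] is not None), None)
            neighbours = [p for p in (prev, nxt) if p is not None]
            filled[i] = sum(neighbours) / len(neighbours)
    return filled


def min_max_normalize(values: List[float]) -> List[float]:
    """Map values to [0, 1] with X* = (X - Xmin) / (Xmax - Xmin)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant series: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def split_7_2_1(samples: List[float]) -> Tuple[List[float], List[float], List[float]]:
    """Split in order into training, validation and test sets (7:2:1)."""
    n = len(samples)
    n_train = n * 7 // 10
    n_val = n * 2 // 10
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])
```

Integer arithmetic is used for the split sizes so that rounding never drops a sample; any remainder falls into the test set.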
Abnormal load data detector: gene expression programming (GEP) is first used to optimize the penalty parameter C and the Gaussian kernel parameter σ of the support vector data description (SVDD); once the optimal C and σ are determined, the SVDD model is established, and the distance from each sample under test to the center of the model's hypersphere is computed and compared with a threshold to judge whether the sample is abnormal load data;
Traditional abnormal load detection methods mainly include expert experience methods, state estimation methods and curve similarity detection. With the development of data mining technology, intelligent algorithms such as neural networks, density analysis and cluster analysis have been applied to abnormal power load detection, but these methods suffer from difficult initial parameter selection and low detection accuracy. Against this background, a support vector data description parameter optimization method based on gene expression programming (GEP-SVDD) is applied to abnormal data detection. Support vector data description (SVDD) is a one-class classification method whose core idea is as follows: given a sample set $X = \{x_1, x_2, x_3, \ldots, x_n\}$, the samples are mapped into a high-dimensional feature space $F$ by a nonlinear mapping $\psi$, and a hypersphere $\Omega = (o, R)$ of minimum volume that contains all, or as many as possible, of the samples is constructed in $F$, where $o$ is the center and $R$ is the radius of the hypersphere. Constructing the hypersphere is the optimization problem of formula (1):
$$\min_{R,\,o,\,\xi}\; R^{2} + C\sum_{i=1}^{n}\xi_i \quad \text{s.t.}\quad \lVert \psi(x_i) - o \rVert^{2} \le R^{2} + \xi_i,\quad \xi_i \ge 0,\quad i = 1, \ldots, n \tag{1}$$
where $C$ is the penalty factor and $\xi_i$ are the slack variables. Formula (1) is solved with the Lagrange multiplier method; introducing Lagrange multipliers $\alpha_i \ge 0$ and $\gamma_i \ge 0$ gives formula (2):
$$L(R, o, \xi, \alpha, \gamma) = R^{2} + C\sum_{i=1}^{n}\xi_i - \sum_{i=1}^{n}\alpha_i\left(R^{2} + \xi_i - \lVert \psi(x_i) - o \rVert^{2}\right) - \sum_{i=1}^{n}\gamma_i\xi_i \tag{2}$$
Setting the partial derivatives of $L$ with respect to $R$, $o$ and $\xi_i$ to zero gives $\sum_i \alpha_i = 1$, $o = \sum_i \alpha_i\psi(x_i)$ and $C - \alpha_i - \gamma_i = 0$. Substituting these back and replacing the inner product $\langle \psi(x_i), \psi(x_j) \rangle$ with the Gaussian kernel function $K(x, y) = \exp\left(-\lVert x - y \rVert^{2} / (2\sigma^{2})\right)$, the original optimization problem becomes formula (3):
$$\max_{\alpha}\; \sum_{i=1}^{n}\alpha_i K(x_i, x_i) - \sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j K(x_i, x_j) \quad \text{s.t.}\quad \sum_{i=1}^{n}\alpha_i = 1,\quad 0 \le \alpha_i \le C \tag{3}$$
The square of the distance from any test sample $x$ to the center $o$ of the hypersphere is:
$$r^{2} = K(x, x) - 2\sum_{i=1}^{n}\alpha_i K(x_i, x) + \sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j K(x_i, x_j) \tag{4}$$
The radius $R$ of the hypersphere is given by:
$$R^{2} = K(x_k, x_k) - 2\sum_{i=1}^{n}\alpha_i K(x_i, x_k) + \sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j K(x_i, x_j) \tag{5}$$
where $x_k$ is any support vector on the boundary of the hypersphere, that is, a sample with $0 < \alpha_k < C$.
When $r^{2} \le R^{2}$, the sample is normal data; when $r^{2} > R^{2}$, it is detected as abnormal data.
It can be deduced from the above theory that two model parameters play important roles in the process of establishing the SVDD model: the penalty parameter C and the kernel parameter σ in the gaussian kernel function need to be optimized to improve the performance of SVDD.
The specific steps of detecting abnormal load data based on the SVDD model are as follows:
1) Establish the SVDD abnormal data detection model using the optimal parameter combination of C and σ.
2) Compute the distance r from the sample under test to the center of the hypersphere.
3) If the distance r is larger than the radius R of the hypersphere, the sample is abnormal load data.
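The three detection steps can be illustrated with a small, dependency-free Python sketch. It assumes the Gaussian kernel in the form exp(-||x - y||^2 / (2σ^2)) and solves the kernel dual of the SVDD problem with a Frank-Wolfe iteration over the set {Σα = 1, 0 ≤ α ≤ C}. That solver is chosen here only to keep the example short; the patent does not specify a solver, and a production system would more likely use a QP or SMO routine. The class and method names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def gaussian_kernel(x: Vector, y: Vector, sigma: float) -> float:
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))


def capped_simplex_vertex(grad: List[float], C: float) -> List[float]:
    """Minimise <grad, s> subject to sum(s) = 1 and 0 <= s_i <= C."""
    s = [0.0] * len(grad)
    remaining = 1.0
    for i in sorted(range(len(grad)), key=lambda k: grad[k]):
        s[i] = min(C, remaining)
        remaining -= s[i]
        if remaining <= 0.0:
            break
    return s


class SVDD:
    """Illustrative SVDD detector (hypothetical API, not from the patent)."""

    def __init__(self, C: float, sigma: float, iters: int = 500, tol: float = 1e-9):
        self.C, self.sigma, self.iters, self.tol = C, sigma, iters, tol

    def fit(self, X: Sequence[Vector]) -> "SVDD":
        n = len(X)
        if self.C * n < 1.0:
            raise ValueError("C must be at least 1/n for a feasible dual")
        self.X = [list(x) for x in X]
        K = [[gaussian_kernel(a, b, self.sigma) for b in X] for a in X]
        diag = [K[i][i] for i in range(n)]
        alpha = capped_simplex_vertex([-d for d in diag], self.C)
        for _ in range(self.iters):
            # Gradient of alpha^T K alpha - sum_i alpha_i K_ii (negated dual).
            Ka = [sum(K[i][j] * alpha[j] for j in range(n)) for i in range(n)]
            grad = [2.0 * Ka[i] - diag[i] for i in range(n)]
            s = capped_simplex_vertex(grad, self.C)
            d = [s[i] - alpha[i] for i in range(n)]
            slope = sum(grad[i] * d[i] for i in range(n))
            if slope > -self.tol:  # Frank-Wolfe gap is small: converged
                break
            Kd = [sum(K[i][j] * d[j] for j in range(n)) for i in range(n)]
            curv = sum(d[i] * Kd[i] for i in range(n))
            step = 1.0 if curv <= 0.0 else min(1.0, -slope / (2.0 * curv))
            alpha = [alpha[i] + step * d[i] for i in range(n)]
        self.alpha = alpha
        self.const = sum(alpha[i] * alpha[j] * K[i][j]
                         for i in range(n) for j in range(n))
        eps = 1e-6
        boundary = [i for i in range(n) if eps < alpha[i] < self.C - eps]
        if not boundary:  # fall back to all support vectors
            boundary = [i for i in range(n) if alpha[i] > eps]
        # R^2: squared distance of boundary support vectors, averaged.
        self.R2 = sum(self.distance_sq(self.X[i]) for i in boundary) / len(boundary)
        return self

    def distance_sq(self, x: Vector) -> float:
        """Squared kernel distance r^2 from x to the hypersphere center."""
        cross = sum(a * gaussian_kernel(xi, x, self.sigma)
                    for a, xi in zip(self.alpha, self.X))
        return gaussian_kernel(x, x, self.sigma) - 2.0 * cross + self.const

    def is_abnormal(self, x: Vector) -> bool:
        return self.distance_sq(x) > self.R2
```

A typical call would be `SVDD(C=1.0, sigma=1.0).fit(normal_samples).is_abnormal(sample)`, where the values of C and σ come from the parameter optimization stage.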
(1) SVDD algorithm parameter optimization based on Gene Expression Programming (GEP)
Gene expression programming (GEP) is an evolutionary algorithm proposed by Candida Ferreira on the basis of the genetic algorithm (GA) and genetic programming (GP). It combines the simple, linear, fixed-length individual encoding of GA with the flexible, variable tree structures of GP: individuals are encoded as linear fixed-length strings that are expressed as tree structures, which improves running efficiency. GEP has a strong heuristic random search capability and performs well on optimization problems. No prior literature has been found that applies GEP to parameter selection for support vector data description; this patent applies gene expression programming to optimize the parameters C and σ of the SVDD algorithm, forming a support vector data description parameter optimization method based on gene expression programming (GEP-SVDD). An SVDD model is established for a chosen parameter combination (C, σ) and trained on the training sample set; the test samples are then classified to obtain an accuracy measure, and the fitness function of the parameter combination is defined from that measure. The fitness is computed by k-fold cross-validation: SVDD training and prediction are repeated several times for each parameter combination, and the resulting $F_1$ score determines the fitness value of the individual, where
$$F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$
is an index measuring the accuracy of a binary classification model, precision is
$$\text{precision} = \frac{TP}{TP + FP}$$
and recall is
$$\text{recall} = \frac{TP}{TP + FN}$$
TP is the number of positive samples predicted as positive, FN is the number of positive samples predicted as negative, FP is the number of negative samples predicted as positive, and TN is the number of negative samples predicted as negative. When the fitness meets the termination requirement, the optimal parameter combination of the SVDD model is output.
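As an illustration of this fitness computation, the following Python sketch evaluates F1 from the confusion counts and averages it over k folds. Several things are assumptions for the example: label 1 (abnormal) is treated as the positive class, folds are formed by striding through the data, and `train_and_predict` is a hypothetical callback that trains a detector on the training fold and returns 0/1 predictions for the held-out fold.

```python
from typing import Callable, List, Sequence, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 from TP, FP and FN, with label 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:  # precision or recall is zero (or undefined)
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return (train_indices, test_indices) pairs for k strided folds."""
    folds = [list(range(i, n, k)) for i in range(k)]
    return [([j for f in folds[:i] + folds[i + 1:] for j in f], folds[i])
            for i in range(k)]


def cross_validated_f1(samples: Sequence[float],
                       labels: Sequence[int],
                       k: int,
                       train_and_predict: Callable[[List[float], List[float]], List[int]]
                       ) -> float:
    """Mean F1 over k folds; usable as the fitness of one (C, sigma) pair."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(samples), k):
        preds = train_and_predict([samples[i] for i in train_idx],
                                  [samples[i] for i in test_idx])
        scores.append(f1_score([labels[i] for i in test_idx], preds))
    return sum(scores) / len(scores)
```

In the setting of the text, `train_and_predict` would wrap SVDD training with a given (C, σ) and the r > R decision rule.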
The method for optimizing the Support Vector Data Description (SVDD) parameters based on gene expression programming comprises the following specific steps:
1) Initialize the population.
2) Compute C and σ, train the SVDD model with them, and perform k-fold cross-validation.
3) Compute the fitness of each individual.
4) Retain the best individual.
5) Check whether the termination condition is met; if so, go to step 7); otherwise continue with the next step.
6) Perform genetic operations such as replication, selection and mutation to generate the next-generation population, then return to step 2) and continue the loop.
7) Output the optimal parameter combination of the SVDD.
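The control flow of steps 1) to 7) can be sketched as a generic evolutionary loop. Note that this is deliberately not a full GEP implementation: real GEP encodes each individual as a fixed-length chromosome that is expressed as an expression tree, whereas this simplified sketch represents an individual directly as a pair (C, σ). The search ranges, population size, mutation scheme and the `fitness` callback are all assumptions for illustration.

```python
import random
from typing import Callable, List, Tuple

Params = Tuple[float, float]  # (C, sigma)


def clamp(value: float, bounds: Tuple[float, float]) -> float:
    return min(bounds[1], max(bounds[0], value))


def evolve_parameters(fitness: Callable[[Params], float],
                      pop_size: int = 20,
                      generations: int = 30,
                      target: float = 0.99,
                      c_range: Tuple[float, float] = (0.01, 1.0),
                      sigma_range: Tuple[float, float] = (0.1, 10.0),
                      mutation_rate: float = 0.3,
                      seed: int = 0) -> Tuple[Params, float, List[float]]:
    """Return (best parameters, best fitness, best-fitness history)."""
    rng = random.Random(seed)

    def mutate(ind: Params) -> Params:
        c, s = ind
        if rng.random() < mutation_rate:
            c = clamp(c * rng.uniform(0.5, 1.5), c_range)
        if rng.random() < mutation_rate:
            s = clamp(s * rng.uniform(0.5, 1.5), sigma_range)
        return (c, s)

    # 1) Initialize the population.
    population = [(rng.uniform(*c_range), rng.uniform(*sigma_range))
                  for _ in range(pop_size)]
    best, best_fit = population[0], float("-inf")
    history: List[float] = []
    for _ in range(generations):
        # 2)-3) Evaluate each individual (e.g. cross-validated F1 of SVDD).
        scored = [(fitness(ind), ind) for ind in population]
        gen_fit, gen_best = max(scored, key=lambda t: t[0])
        # 4) Retain the best individual seen so far.
        if gen_fit > best_fit:
            best, best_fit = gen_best, gen_fit
        history.append(best_fit)
        # 5) Termination condition: fitness target reached.
        if best_fit >= target:
            break
        # 6) Replication of the elite, tournament selection, mutation.
        next_pop = [best]
        while len(next_pop) < pop_size:
            a, b = rng.sample(scored, 2)
            parent = a[1] if a[0] >= b[0] else b[1]
            next_pop.append(mutate(parent))
        population = next_pop
    # 7) Output the best parameter combination.
    return best, best_fit, history
```

Because the elite individual is copied unchanged into each new generation, the best fitness recorded in `history` never decreases.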
Abnormal load data corrector: load prediction is performed with a deep long short-term memory (LSTM) network model. Historical load data are used as the training set to train the model; after training, the relevant data preceding the abnormal data are input into the model to obtain a predicted value, which replaces the abnormal load data.
After abnormal load data have been detected, they must be corrected to ensure the accuracy and integrity of the load data, and a deep long short-term memory (LSTM) network is used as the basic model. The feature extraction capability and temporal correlation learning capability of the deep LSTM network make it possible to mine the regularities in the load data. As a deep neural network, the LSTM consists mainly of a forget gate, an input gate and an output gate, and its variables are computed as follows:
$$f_t = \sigma(W_a[h_{t-1}, x_t] + b_a) \tag{6}$$
$$i_t = \sigma(W_m[h_{t-1}, x_t] + b_m) \tag{7}$$
$$c_t = \tanh(W_c[h_{t-1}, x_t] + b_c) \tag{8}$$
$$s_t = s_{t-1} \odot f_t + i_t \odot c_t \tag{9}$$
$$o_t = \sigma(W_d[h_{t-1}, x_t] + b_d) \tag{10}$$
$$g_t = o_t \odot \tanh(s_t) \tag{11}$$
$f_t$ is the forget gate value, $i_t$ is the input gate value, $c_t$ is the candidate memory state, $s_t$ is the memory state at the current time, and $o_t$ is the output gate value; $x_t$ is the input value and $g_t$ is the output value; $h_{t-1}$ is the output passed on from the previous time step (the quantity written $g_{t-1}$ in the notation of formula (11)); $W_a$, $W_m$, $W_c$, $W_d$ are the weight matrices applied to the input variables, and $b_a$, $b_m$, $b_c$, $b_d$ are the bias terms of the corresponding gates; $\odot$ denotes element-wise multiplication of vectors and $\sigma$ denotes the sigmoid activation function; the subscript $t$ denotes the current time and $t-1$ the previous time.
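Formulas (6) to (11) translate directly into a single forward step of an LSTM cell. The sketch below uses plain Python lists to stay dependency-free; the function names and the dictionary keys `"a"`, `"m"`, `"c"`, `"d"` (mirroring the subscripts of the weight matrices) are choices made for this example, and the previous output is passed in as `h_prev`. A real model would use a deep learning framework and learn the weights by training.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def affine(W: Matrix, v: Vector, b: Vector) -> Vector:
    """Compute W v + b."""
    return [sum(w * x for w, x in zip(row, v)) + bias
            for row, bias in zip(W, b)]


def lstm_step(x_t: Vector, h_prev: Vector, s_prev: Vector,
              W: Dict[str, Matrix], b: Dict[str, Vector]) -> Tuple[Vector, Vector]:
    """One LSTM time step; returns (g_t, s_t)."""
    z = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(v) for v in affine(W["a"], z, b["a"])]    # (6) forget gate
    i_t = [sigmoid(v) for v in affine(W["m"], z, b["m"])]    # (7) input gate
    c_t = [math.tanh(v) for v in affine(W["c"], z, b["c"])]  # (8) candidate state
    s_t = [sp * f + i * c                                    # (9) memory state
           for sp, f, i, c in zip(s_prev, f_t, i_t, c_t)]
    o_t = [sigmoid(v) for v in affine(W["d"], z, b["d"])]    # (10) output gate
    g_t = [o * math.tanh(s) for o, s in zip(o_t, s_t)]       # (11) output
    return g_t, s_t
```

To process a load sequence, the step is applied once per time point, feeding the returned `g_t` and `s_t` back in as `h_prev` and `s_prev`.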
The abnormal load data correction model training process based on the depth long-short time memory network specifically comprises the following steps:
1) Process the preprocessed time series data with a sliding window to obtain m load sample sets of length l, of which 70% of the historical load data form the training set, 20% the validation set and 10% the test set.
2) Iteratively train the LSTM load prediction model. In each training period, select training data from the training data set and input it into the deep LSTM network for training.
3) Adjust the model parameters. Evaluate the prediction error of the load prediction model on the test set and, if the accuracy requirement is not met, adjust the model parameters using the validation set.
4) Input the relevant data preceding the abnormal data into the trained LSTM model for prediction, and output the predicted load value.
5) In the abnormal data correction part, replace the abnormal load value with the predicted load value, and end.
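The windowing in step 1) and the replacement in steps 4) and 5) can be sketched as follows. Two points are assumptions: each window of length l is paired with the value that immediately follows it as the prediction target, and `mean_predictor` is only a stand-in for the trained LSTM so that the example runs without a deep learning framework.

```python
from typing import Callable, List, Sequence, Tuple


def sliding_windows(series: Sequence[float],
                    length: int) -> List[Tuple[List[float], float]]:
    """Return (window, next value) pairs for a window of the given length."""
    return [(list(series[i:i + length]), series[i + length])
            for i in range(len(series) - length)]


def mean_predictor(window: Sequence[float]) -> float:
    """Placeholder for the trained LSTM: predicts the window mean."""
    return sum(window) / len(window)


def correct_anomalies(series: Sequence[float],
                      abnormal_idx: Sequence[int],
                      length: int,
                      predict: Callable[[Sequence[float]], float]) -> List[float]:
    """Replace each abnormal point with a prediction from the preceding window.

    Points are processed in time order, so a corrected value is used as
    history for later predictions. Points with fewer than `length`
    preceding values are left unchanged.
    """
    corrected = list(series)
    for t in sorted(abnormal_idx):
        if t < length:
            continue
        corrected[t] = predict(corrected[t - length:t])
    return corrected
```

In the full method, `predict` would be the forward pass of the trained LSTM load prediction model, and `abnormal_idx` would come from the SVDD detector.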
The beneficial effects of the invention are as follows: the method is a strategy-based approach in which the user load data are preprocessed, an abnormal data detector is established with the SVDD algorithm improved by gene expression programming, and a load prediction model is established with a deep long short-term memory network; a reasonable load value is predicted from the historical load data preceding the abnormal data, and the predicted value replaces the abnormal load data, thereby correcting it.
The method is mainly used to process abnormal power grid load data and can accurately detect abnormal data in the power load, which supports accurate load prediction, planned power consumption management and reasonable power supply construction planning, and helps improve the economic and social benefits of the power system.
Drawings
FIG. 1 is a block diagram of the system for abnormal load data detection and correction according to the present invention.
FIG. 2 is a flow chart of the abnormal load data detection and correction method according to the present invention.
Detailed Description
In the following description, for purposes of explanation, numerous implementation details are set forth in order to provide a thorough understanding of the embodiments of the invention. It should be understood, however, that these implementation details are not to be interpreted as limiting the invention. That is, in some embodiments of the invention, such implementation details are not necessary. In addition, some conventional structures and components are shown in simplified schematic form in the drawings.
The development of the smart grid has driven the construction of automated grid information platforms, and the volume of data of all types transmitted and collected by power system equipment is growing exponentially. During actual grid operation, random factors such as system faults, faulty measuring devices and data transmission errors inevitably mix hard-to-detect abnormal data into the collected load data. The quality of power load data has a decisive influence on load prediction accuracy and grid operating stability, so building a model with an effective and accurate method to detect and correct abnormal data in power load data, and thereby ensure the accuracy and integrity of the data, has become an important research direction.
The invention relates to an abnormal load data detection and correction method and system based on model optimization, wherein the detection and correction method comprises the following steps:
Step one: import all load data of the electricity users and preprocess it: records with few missing values are filled by mean filling, and records with a large amount of missing data are deleted directly; go to step two;
Step two: apply max-min normalization to the historical load data, divide the training set, test set and validation set for the abnormal load data detection model and the abnormal data correction model respectively, and go to step three;
Step three: initialize the population and go to step four;
Step four: compute C and σ, train the SVDD model with them, perform k-fold cross-validation, and go to step five;
Step five: compute the fitness of each individual and go to step six;
Step six: retain the best individual and go to step seven;
Step seven: check whether the termination condition is met; if so, go to step nine, otherwise go to step eight;
Step eight: perform genetic operations such as replication, selection and mutation to generate the next-generation population, and return to step four to continue the loop;
Step nine: output the optimal parameter combination of the SVDD and go to step ten;
Step ten: establish the SVDD abnormal data detection model using the optimal parameter combination of C and σ, and go to step eleven;
Step eleven: compute the distance r from the sample under test to the center of the hypersphere, and go to step twelve;
Step twelve: if the distance r is larger than the radius R of the hypersphere, the sample is abnormal load data; go to step thirteen;
Step thirteen: process the preprocessed time series data with a sliding window to obtain m load sample sets of length l, of which 70% of the historical load data form the training set, 20% the validation set and 10% the test set; go to step fourteen;
Step fourteen: iteratively train the LSTM load prediction model; in each training period, select training data from the training data set and input it into the deep LSTM network for training; go to step fifteen;
Step fifteen: adjust the model parameters: evaluate the prediction error of the load prediction model on the test set and, if the accuracy requirement is not met, adjust the model parameters using the validation set; go to step sixteen;
Step sixteen: input the relevant data preceding the abnormal data into the trained LSTM model for prediction, output the predicted value of the load, and go to step seventeen;
Step seventeen: in the abnormal data correction part, replace the abnormal load value with the predicted load value, and end.
The invention relates to a system for detecting and correcting abnormal load data, which comprises:
a load data preprocessor: taking all load data as a whole, first filling in or deleting missing data, then normalizing the load data, and finally dividing the sample set into a training set, a verification set and a test set;
an abnormal load data detector: first using gene expression programming (GEP) to optimize the penalty parameter C and the Gaussian kernel parameter sigma of the support vector data description (SVDD) and determine the optimal C and sigma, then establishing the SVDD model, calculating the distance from the sample to be tested to the center of the SVDD hypersphere, and comparing this distance with a threshold to judge whether the sample is abnormal load data;
an abnormal load data corrector: performing load prediction based on a deep long short-term memory (LSTM) network model, wherein historical load data is used as the training set to train the model, the relevant data preceding the abnormal data is input into the trained model to obtain a predicted value, and the predicted value replaces the abnormal load data.
The load data preprocessor is connected to the abnormal load data detector, and the abnormal load data detector is connected to the abnormal load data corrector.
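As an illustration of the preprocessor stage, the following Python sketch applies mean filling to records with few missing values, drops records with many, and then applies max-min normalization. The function names and the 20% missing-value threshold are assumptions made for the example; the patent does not state a threshold.

```python
from typing import List, Optional, Sequence


def fill_or_drop(records: Sequence[Sequence[Optional[float]]],
                 max_missing_ratio: float = 0.2) -> List[List[float]]:
    """Mean-fill records with few missing values; drop records with many.

    `max_missing_ratio` is a hypothetical threshold chosen for illustration.
    """
    cleaned = []
    for rec in records:
        present = [v for v in rec if v is not None]
        missing = len(rec) - len(present)
        if not present or missing / len(rec) > max_missing_ratio:
            continue  # too many gaps: delete the record
        mean = sum(present) / len(present)
        cleaned.append([mean if v is None else v for v in rec])
    return cleaned


def min_max_normalize(records: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale all values into [0, 1] using the global minimum and maximum."""
    flat = [v for rec in records for v in rec]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for constant data
    return [[(v - lo) / span for v in rec] for rec in records]


# Three toy daily records; None marks a missing reading.
days = [
    [2.0, None, 4.0, 6.0, 8.0],     # one gap: filled with the record mean
    [None, None, None, 1.0, 2.0],   # mostly missing: dropped
    [0.0, 10.0, 5.0, 5.0, 5.0],
]
cleaned = fill_or_drop(days)
normalized = min_max_normalize(cleaned)
```

The normalized records would then be divided into the training, verification and test sets before being passed to the detector.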
The first embodiment is as follows:
Suppose accurate historical load data is available for a residential area over a period of time, and the data of a certain day is to be checked for abnormal load data. First, the historical power load data is taken as training sample data, and gene expression programming (GEP) is used to optimize the parameters of the SVDD algorithm. The established SVDD model then detects abnormal load data for the day under test. If abnormal load data exists, a deep long short-term memory (LSTM) network performs load prediction, and the predicted load value is used as the replacement for the abnormal data.
The specific implementation scheme is as follows:
(1) Preprocess the historical load data: apply mean filling to data with few missing values, and directly delete data with a large number of missing values. Then normalize all load data.
(2) Divide the data into a training set, a test set and a verification set for the abnormal load data detection model and for the abnormal data correction model, respectively.
(3) Apply gene expression programming to optimize the parameters C and sigma of the SVDD algorithm, obtain the optimal C and sigma, and establish the abnormal data detection model based on the improved SVDD.
(4) Using the improved-SVDD abnormal data detection model, calculate the distance r from the sample under test to the center of the model hypersphere; if r is greater than the hypersphere radius R, the sample is judged to be abnormal load data.
(5) Perform iterative training and model parameter adjustment on the load prediction model based on the deep long short-term memory network.
(6) Input the relevant data preceding the abnormal data into the trained LSTM model for prediction, and output the predicted load value, which replaces the abnormal load value.
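The distance test in step (4) can be sketched in Python using the standard kernel-space expression for the distance to the SVDD center, r² = K(z,z) − 2 Σᵢ αᵢ K(xᵢ,z) + Σᵢ Σⱼ αᵢ αⱼ K(xᵢ,xⱼ). Several things here are assumptions for illustration only: the Gaussian kernel is parametrized as exp(−‖x−y‖² / (2·sigma²)), the coefficients αᵢ are set to uniform weights instead of being obtained by solving the SVDD optimization problem, and the radius R is taken as the largest distance among the training points.

```python
import math
from typing import List, Sequence


def gaussian_kernel(x: Sequence[float], y: Sequence[float], sigma: float) -> float:
    """Gaussian kernel exp(-||x - y||^2 / (2 * sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))


def center_distance(z: Sequence[float],
                    support: Sequence[Sequence[float]],
                    alphas: Sequence[float],
                    sigma: float) -> float:
    """Distance from z to the hypersphere center sum_i alpha_i * phi(x_i)."""
    cross = sum(a * gaussian_kernel(x, z, sigma) for a, x in zip(alphas, support))
    const = sum(ai * aj * gaussian_kernel(xi, xj, sigma)
                for ai, xi in zip(alphas, support)
                for aj, xj in zip(alphas, support))
    d2 = gaussian_kernel(z, z, sigma) - 2.0 * cross + const
    return math.sqrt(max(d2, 0.0))  # clamp tiny negative rounding errors


def is_abnormal(z: Sequence[float],
                support: Sequence[Sequence[float]],
                alphas: Sequence[float],
                sigma: float,
                radius: float) -> bool:
    """A sample is abnormal when its center distance r exceeds the radius R."""
    return center_distance(z, support, alphas, sigma) > radius


# Toy normalized load features; uniform alphas are a placeholder (assumption).
train_points: List[List[float]] = [[0.40, 0.42], [0.45, 0.43], [0.42, 0.47], [0.44, 0.41]]
alphas = [1.0 / len(train_points)] * len(train_points)
sigma = 0.2
radius = max(center_distance(x, train_points, alphas, sigma) for x in train_points)

far_sample_flag = is_abnormal([0.95, 0.05], train_points, alphas, sigma, radius)
near_sample_flag = is_abnormal([0.43, 0.43], train_points, alphas, sigma, radius)
```

With a trained model, `alphas` and `radius` would come from the SVDD solution computed with the optimized C and sigma, and only the support vectors would need to be kept.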
The method uses gene expression programming (GEP) to optimize the parameters of the SVDD algorithm, uses the SVDD model built with the optimal parameters to detect abnormal load data, then uses a deep long short-term memory (LSTM) network for load prediction, and takes the predicted load value as the replacement for the abnormal data.
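The parameter search loop (initialize a population, evaluate fitness, keep the best individual, apply genetic operations, repeat until termination) can be sketched as follows. This is a plain real-valued evolutionary search written for illustration, not gene expression programming itself, which encodes individuals as expression chromosomes. The function name, the bounds, the mutation settings and the toy fitness function are all assumptions; in the described method the fitness would come from k-fold cross validation of the SVDD model.

```python
import random
from typing import Callable, List, Sequence, Tuple

Individual = Tuple[float, ...]


def evolve_parameters(fitness: Callable[[Individual], float],
                      bounds: Sequence[Tuple[float, float]],
                      pop_size: int = 20,
                      generations: int = 30,
                      mutation_rate: float = 0.3,
                      seed: int = 0) -> Tuple[Individual, List[float]]:
    """Search for the parameter tuple that maximizes `fitness`."""
    rng = random.Random(seed)

    def random_individual() -> Individual:
        return tuple(rng.uniform(lo, hi) for lo, hi in bounds)

    def mutate(ind: Individual) -> Individual:
        genes = []
        for value, (lo, hi) in zip(ind, bounds):
            if rng.random() < mutation_rate:
                value += rng.gauss(0.0, 0.1 * (hi - lo))
            genes.append(min(max(value, lo), hi))  # keep genes inside bounds
        return tuple(genes)

    def tournament(pop: List[Individual]) -> Individual:
        a, b = rng.sample(pop, 2)  # selection: the fitter of two random picks
        return a if fitness(a) >= fitness(b) else b

    population = [random_individual() for _ in range(pop_size)]
    best = max(population, key=fitness)
    history = [fitness(best)]
    for _ in range(generations):  # termination: fixed number of generations
        offspring = [mutate(tournament(population)) for _ in range(pop_size - 1)]
        population = [best] + offspring  # replication of the best individual
        best = max(population, key=fitness)
        history.append(fitness(best))
    return best, history


def toy_fitness(ind: Individual) -> float:
    """Placeholder fitness (assumption) with its maximum at C=0.5, sigma=2.0."""
    c, sigma = ind
    return -((c - 0.5) ** 2 + (sigma - 2.0) ** 2)


bounds = [(0.01, 1.0), (0.1, 10.0)]  # hypothetical ranges for C and sigma
best_params, history = evolve_parameters(toy_fitness, bounds)
```

Because the best individual is copied unchanged into every new generation, the best fitness recorded in `history` never decreases.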
The above description is only an embodiment of the present invention, and is not intended to limit the present invention. Various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.

Claims (2)

1. An abnormal load data detection and correction method based on model optimization is characterized in that: the detection and correction method comprises the following steps:
step one: importing all load data of the electricity users and preprocessing the data, wherein a mean filling method can be adopted for data with few missing values and data with a large number of missing values is directly deleted, and entering step two;
step two: performing max-min normalization on the historical load data, dividing a training set, a test set and a verification set for the abnormal load data detection model and for the abnormal data correction model respectively, and entering step three;
step three: initializing a population and entering step four;
step four: calculating the penalty parameter C and the kernel parameter sigma of the Gaussian kernel function, training the SVDD model with the penalty parameter C and the kernel parameter sigma, performing k-fold cross validation, and entering step five;
step five: calculating the individual fitness and entering the step six;
step six: keeping the optimal individuals and entering the step seven;
step seven: judging whether a termination condition is reached, if so, entering a ninth step, otherwise, entering an eighth step;
step eight: performing the genetic operations of replication, selection and mutation to generate the next-generation population, and returning to step four to continue the loop;
step nine: outputting the optimal parameter combination of the SVDD and entering step ten;
step ten: establishing the SVDD abnormal data detection model using the optimal combination of the parameters C and sigma, and entering step eleven;
step eleven: calculating the distance r from the sample to be tested to the center of the hypersphere, and entering step twelve;
step twelve: if the center distance r is larger than the radius R of the hypersphere, the data is abnormal load data, and entering step thirteen;
step thirteen: processing the time series data after data preprocessing by using a sliding window to obtain m load sample sets with the length of l, wherein 70% of historical load data is a training set, 20% of historical load data is a verification set, and 10% of historical load data is a test set, and entering a fourteenth step;
step fourteen: performing iterative training on the LSTM load prediction model, wherein in each training period training data is selected from the training data set and input into the deep long short-term memory network for network training, and entering step fifteen;
step fifteen: adjusting model parameters, namely evaluating the prediction error of the load prediction model on the test set and, if the accuracy requirement is not met, adjusting the model parameters using the verification set, and entering step sixteen;
step sixteen: inputting the relevant data preceding the abnormal data into the trained LSTM model for prediction, outputting the predicted load value, and entering step seventeen;
step seventeen: in the abnormal data correction part, replacing the abnormal load value with the predicted load value, and ending.
2. The abnormal load data detection and correction method based on model optimization according to claim 1, characterized in that: the detection and correction method is realized by a system, and the system comprises:
a load data preprocessor: taking all load data as a whole, first filling in or deleting missing data, then normalizing the load data, and finally dividing the sample set into a training set, a verification set and a test set;
an abnormal load data detector: first using gene expression programming (GEP) to optimize the penalty parameter C and the Gaussian kernel parameter sigma of the support vector data description (SVDD) and determine the optimal C and sigma, then establishing the SVDD model, calculating the distance from the sample to be tested to the center of the SVDD hypersphere, and comparing this distance with a threshold to judge whether the sample is abnormal load data;
an abnormal load data corrector: performing load prediction based on a deep long short-term memory (LSTM) network model, wherein historical load data is used as the training set to train the model, the relevant data preceding the abnormal data is input into the trained model to obtain a predicted value, and the predicted value replaces the abnormal load data.
CN202011278587.4A 2020-11-16 2020-11-16 Abnormal load data detection and correction method and system based on model optimization Active CN112733417B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011278587.4A CN112733417B (en) 2020-11-16 2020-11-16 Abnormal load data detection and correction method and system based on model optimization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011278587.4A CN112733417B (en) 2020-11-16 2020-11-16 Abnormal load data detection and correction method and system based on model optimization

Publications (2)

Publication Number Publication Date
CN112733417A CN112733417A (en) 2021-04-30
CN112733417B true CN112733417B (en) 2022-11-01

Family

ID=75597493

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011278587.4A Active CN112733417B (en) 2020-11-16 2020-11-16 Abnormal load data detection and correction method and system based on model optimization

Country Status (1)

Country Link
CN (1) CN112733417B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113609635A (en) * 2021-05-21 2021-11-05 上海交通大学 Framework and identification method of socket load type identification system
CN113410839B (en) * 2021-06-24 2022-07-12 燕山大学 Detection method and system for false data injection of power grid
CN113516317A (en) * 2021-07-30 2021-10-19 广东电网有限责任公司 Energy planning prediction method and device based on neural network
CN114443635B (en) * 2022-01-20 2024-04-09 广西壮族自治区林业科学研究院 Data cleaning method and device in soil big data analysis
CN115442271B (en) * 2022-08-29 2023-09-26 云南电网有限责任公司迪庆供电局 Network performance index time sequence data anomaly detection method

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110334726A (en) * 2019-04-24 2019-10-15 华北电力大学 A kind of identification of the electric load abnormal data based on Density Clustering and LSTM and restorative procedure

Also Published As

Publication number Publication date
CN112733417A (en) 2021-04-30

Similar Documents

Publication Publication Date Title
CN112733417B (en) Abnormal load data detection and correction method and system based on model optimization
CN112949945B (en) Wind power ultra-short-term prediction method for improving bidirectional long-term and short-term memory network
CN108921339B (en) Quantile regression-based photovoltaic power interval prediction method for genetic support vector machine
CN103105246A (en) Greenhouse environment forecasting feedback method of back propagation (BP) neural network based on improvement of genetic algorithm
CN110751318A (en) IPSO-LSTM-based ultra-short-term power load prediction method
CN111861013B (en) Power load prediction method and device
CN113344288B (en) Cascade hydropower station group water level prediction method and device and computer readable storage medium
CN112100911B (en) Solar radiation prediction method based on depth BILSTM
CN111736084A (en) Valve-regulated lead-acid storage battery health state prediction method based on improved LSTM neural network
CN113537469B (en) Urban water demand prediction method based on LSTM network and Attention mechanism
CN113554466A (en) Short-term power consumption prediction model construction method, prediction method and device
Ning et al. GA-BP air quality evaluation method based on fuzzy theory.
CN112990500A (en) Transformer area line loss analysis method and system based on improved weighted gray correlation analysis
CN112819225A (en) Carbon market price prediction method based on BP neural network and ARIMA model
CN112364560A (en) Intelligent prediction method for working hours of mine rock drilling equipment
CN113971517A (en) GA-LM-BP neural network-based water quality evaluation method
CN113743538A (en) Intelligent building energy consumption prediction method, equipment and medium based on IPSO-BP neural network
CN113780684A (en) Intelligent building user energy consumption behavior prediction method based on LSTM neural network
CN115982141A (en) Characteristic optimization method for time series data prediction
CN114357670A (en) Power distribution network power consumption data abnormity early warning method based on BLS and self-encoder
CN114021432A (en) Stress corrosion cracking crack propagation rate prediction method and system
CN113376540A (en) LSTM battery health state estimation method based on evolution attention mechanism
CN113033898A (en) Electrical load prediction method and system based on K-means clustering and BI-LSTM neural network
CN117034762A (en) Composite model lithium battery life prediction method based on multi-algorithm weighted sum
CN113762591B (en) Short-term electric quantity prediction method and system based on GRU and multi-core SVM countermeasure learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant