CN114881372A - DPC-GRNN-based ultra-short-term industrial electrical load prediction method - Google Patents

DPC-GRNN-based ultra-short-term industrial electrical load prediction method

Info

Publication number
CN114881372A
Authority
CN
China
Prior art keywords
electrical load
industrial electrical
data set
historical data
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210795337.0A
Other languages
Chinese (zh)
Inventor
田昕
朱庆春
程改红
郭相国
黄娟娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Power Engineering Consultant Group Central Southern China Electric Power Design Institute Corp
Original Assignee
China Power Engineering Consultant Group Central Southern China Electric Power Design Institute Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Power Engineering Consultant Group Central Southern China Electric Power Design Institute Corp filed Critical China Power Engineering Consultant Group Central Southern China Electric Power Design Institute Corp
Priority to CN202210795337.0A priority Critical patent/CN114881372A/en
Publication of CN114881372A publication Critical patent/CN114881372A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • General Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Marketing (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Mathematical Physics (AREA)
  • Educational Administration (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Game Theory and Decision Science (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a DPC-GRNN-based ultra-short-term industrial electrical load prediction method, comprising the following steps: preprocessing the industrial electrical load historical data of the site to be predicted to form an industrial electrical load historical data set, and correcting abnormal values in that data set; normalizing the corrected industrial electrical load historical data set of the site to be predicted; performing DPC cluster analysis on the normalized data set to obtain the corresponding clusters; constructing a GRNN model for each cluster; calculating the SPREAD value corresponding to each GRNN model, and selecting the GRNN model corresponding to the optimal SPREAD value as the prediction model; and inputting the current real-time industrial electrical load data set of the site to be predicted into the prediction model, and inverse-normalizing the output of the prediction model to obtain the industrial electrical load prediction data set of the site to be predicted. The invention effectively improves the prediction accuracy of ultra-short-term industrial electrical load.

Description

DPC-GRNN-based ultra-short-term industrial electrical load prediction method
Technical Field
The invention belongs to the technical field of industrial electricity, and particularly relates to a DPC-GRNN-based ultra-short-term industrial electricity load prediction algorithm.
Background
Structural transformation and industrial upgrading are being promoted in order to resolve the prominent contradictions and deep-seated problems of the power industry. Mining users' electricity consumption data and behavior, understanding their consumption patterns, and making accurate load predictions are therefore of great significance.
High-energy-consumption industries have a large total load with strong volatility and a certain amount of impact load; they strongly affect the power system and threaten its safety, stability, and power quality. The factors that influence high-energy-consumption industrial electrical load should therefore be analyzed scientifically to improve load prediction accuracy, so that production can be adjusted accordingly and stable operation of the power system can be ensured.
Currently, load prediction for high-energy-consumption industrial users focuses mainly on medium- and long-term load prediction. The main methods and their disadvantages include:
(1) Constructing a load prediction model with a classification-based modeling approach aimed at the load fluctuation characteristics of high-energy-consumption industrial users; however, the classification principle relies on subjective judgment.
(2) Studying the load characteristics of industrial users with the FCM clustering method; however, the FCM clustering algorithm is prone to falling into local saddle points, so the prediction accuracy is not high.
(3) Establishing an industrial electrical load prediction model with a BP neural network optimized by a genetic membrane algorithm; the prediction model is highly subjective and has low prediction accuracy.
Disclosure of Invention
The invention aims to overcome the defects of the background art by providing a DPC-GRNN-based ultra-short-term industrial electrical load prediction algorithm. It performs cluster analysis on the load data using the DPC algorithm, which has a better clustering effect, and then establishes a separate prediction model for each cluster obtained, so that the prediction accuracy is higher.
The technical scheme adopted by the invention is as follows: a DPC-GRNN-based ultra-short-term industrial electrical load prediction method comprises the following steps:
preprocessing the industrial electrical load historical data of the site to be predicted to form an industrial electrical load historical data set;
correcting abnormal values in the industrial electrical load historical data set;
normalizing the corrected industrial electrical load historical data set of the site to be predicted;
performing DPC cluster analysis on the normalized industrial electrical load historical data set of the site to be predicted to obtain the corresponding clusters;
constructing a GRNN model for each cluster, and training each GRNN model using the industrial electrical load historical data set corresponding to its cluster as the training set;
calculating the SPREAD value corresponding to each GRNN model based on the current real-time industrial electrical load data set of the site to be predicted, and selecting the GRNN model corresponding to the optimal SPREAD value as the prediction model;
and inputting the current real-time industrial electrical load data set of the site to be predicted into the prediction model, and inverse-normalizing the output of the prediction model to obtain the industrial electrical load prediction data set of the site to be predicted.
In the technical scheme, the historical data of the industrial electric load of the place to be predicted is a time sequence of the industrial electric load acquired according to a set time period; the current real-time industrial electrical load data of the site to be predicted is a time sequence containing the current industrial electrical load data.
In the technical scheme, the current real-time industrial electrical load data of the site to be predicted are preprocessed to form an industrial electrical load data set, and abnormal values in the industrial electrical load data set are corrected; and inputting the corrected industrial electrical load data set into a prediction model.
In the above technical solution, the industrial electrical load historical data set is obtained using the following formula:
[Formula image 001 not reproduced in the text]
where $s$ denotes the total number of peaks and troughs in the load sequence of the industrial electrical load historical data; $t_i$ denotes the time sequences corresponding to the peaks and troughs, respectively, with $1 \le i \le s$; $i$ indexes the data points in the industrial electrical load historical data set, $i = 1, 2, 3, \dots, N$; and $N$ denotes the number of data points in the industrial electrical load historical data set.
In the above technical solution, the process of correcting abnormal values in the industrial electrical load historical data set of the site to be predicted includes:
repairing and filling in the industrial electrical load historical data set of the site to be predicted by curve fitting;
automatically finding abnormal data by comparing whether the industrial electrical load historical data of two adjacent time periods are on the same scale, and performing transverse correction;
and correcting abnormal sudden changes in fine-grained industrial electrical load historical data by longitudinal correction, based on the data at the same time point in the preceding and following time periods.
In the above technical solution, the process of performing DPC cluster analysis on the load feature vectors to obtain the corresponding clusters includes: smoothing the normalized industrial electrical load historical data to obtain the industrial electrical load historical data set;
calculating the local density of each data point in the industrial electrical load historical data set based on the Euclidean distance between every two data points;
sorting the local densities of the data points in descending order, and forming an index set from the positions of the local densities in that order;
for each data point in the index set, calculating the minimum Euclidean distance between it and the other data points as the density distance of the corresponding data point in the industrial electrical load historical data set;
plotting a decision graph based on the local density and density distance of each data point in the industrial electrical load historical data set;
selecting the data points that lie at the upper right of the decision graph, clearly separated from the other points, as cluster centers;
assigning each remaining data point to the cluster of its nearest data point with higher local density.
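The clustering steps listed above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `dpc_cluster` and the parameters `n_centers` and `ratio` are hypothetical, a Gaussian kernel is assumed for the local density, the density distance is taken as the distance to the nearest point of higher density (the usual density peak clustering convention), and the visual choice of centers from the decision graph is replaced by taking the points with the largest product of local density and density distance.

```python
import numpy as np

def dpc_cluster(X, n_centers, ratio=0.02):
    """Density peak clustering sketch; returns (labels, centers, rho, delta)."""
    X = np.asarray(X, dtype=float)
    N = X.shape[0]
    # Pairwise Euclidean distances d_ij.
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    # Cut-off distance d_c: entry at the `ratio` position of the ascending
    # pairwise distances (assumption: the 2% rule counts distances, not points).
    pair = np.sort(d[np.triu_indices(N, k=1)])
    d_c = pair[max(int(round(ratio * pair.size)) - 1, 0)]
    # Gaussian-kernel local density; subtract 1 to drop the self term exp(0).
    rho = np.exp(-(d / d_c) ** 2).sum(axis=1) - 1.0
    # Visit points from highest to lowest density.
    order = np.argsort(-rho)
    delta = np.zeros(N)
    parent = np.full(N, -1)
    for rank in range(1, N):
        i = order[rank]
        higher = order[:rank]                 # points with higher density
        j = higher[np.argmin(d[i, higher])]   # nearest of those
        delta[i], parent[i] = d[i, j], j
    # Convention for the densest point: its largest distance to any point.
    delta[order[0]] = d[order[0]].max()
    # Stand-in for reading the decision graph: largest rho * delta values.
    centers = np.argsort(-(rho * delta))[:n_centers]
    labels = np.full(N, -1)
    labels[centers] = np.arange(n_centers)
    # Remaining points inherit the label of their nearest denser neighbour.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers, rho, delta
```

For the 96-dimensional daily load vectors described in the embodiment, `X` would have one row per day; duplicate rows that make `d_c` zero would need separate handling, which this sketch omits.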
In the above technical solution, the process of training the corresponding GRNN model using the industrial electrical load historical data corresponding to each cluster as the training set includes:
forming a sample set for each cluster from the data points belonging to that cluster;
performing K-fold cross-validation according to the number of samples in each cluster, dividing the sample set of a given cluster into K sub-sample sets, where K is a positive integer;
taking each sub-sample set in turn as the test set and the remaining K-1 sub-sample sets as the training set, training the GRNN model of that cluster, and repeating K times;
during the K training runs of the GRNN model of any cluster, cycling through candidate SPREAD values of the GRNN model in each run;
and selecting the GRNN model generated by the training set and SPREAD value that give the minimum mean square error as the GRNN model of that cluster.
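The K-fold SPREAD search described above can be illustrated as follows. This is a sketch under assumptions: the GRNN is written as a Gaussian-weighted average of the training targets with the kernel width set equal to SPREAD (toolbox implementations may scale SPREAD differently), and the names `grnn_predict` and `select_spread`, the candidate list `spreads`, and the `seed` argument are all hypothetical.

```python
import numpy as np

def grnn_predict(X_train, y_train, X_query, spread):
    """GRNN output: Gaussian-weighted average of the training targets."""
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * spread ** 2))
    # Guard against all weights underflowing to zero.
    return (w @ y_train) / np.maximum(w.sum(axis=1), 1e-300)

def select_spread(X, y, spreads, k=5, seed=0):
    """Return (best_spread, best_train_idx, best_mse) over folds and spreads."""
    folds = np.array_split(np.random.default_rng(seed).permutation(len(X)), k)
    best = (None, None, np.inf)
    for f in range(k):
        test = folds[f]                       # fold f is the test set
        train = np.concatenate([folds[g] for g in range(k) if g != f])
        for s in spreads:                     # cycle through SPREAD values
            pred = grnn_predict(X[train], y[train], X[test], s)
            mse = float(np.mean((pred - y[test]) ** 2))
            if mse < best[2]:
                best = (s, train, mse)
    return best
```

The returned training indices and SPREAD value define the GRNN kept for a cluster, since a GRNN stores its training samples rather than fitting weights. Running `select_spread` once per cluster would give one model per cluster.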
In the above technical solution, the normalized industrial electrical load historical data are smoothed to obtain the industrial electrical load historical data set of the site to be predicted, $X = \{x_1, x_2, \dots, x_N\}$, where $x_i$ denotes any data point in the industrial electrical load historical data set; and a corresponding index set $I_S = \{1, 2, \dots, N\}$ is constructed, where $N$ denotes the number of data points in the industrial electrical load historical data set;
calculating the local density $\rho_i$ of each data point in the industrial electrical load historical data set based on the pairwise distances between data points:
$$\rho_i = \sum_{j \in I_S \setminus \{i\}} \exp\left[-\left(\frac{d_{ij}}{d_c}\right)^2\right]$$
where $d_{ij}$ denotes the Euclidean distance between data points $x_i$ and $x_j$, $d_c$ denotes the truncation distance, and $\rho_i$ reflects the number of data points in the industrial electrical load data set $X$ whose distance to $x_i$ is less than $d_c$;
letting $\{q_i\}_{i=1}^{N}$ denote the indices of the local density set $\{\rho_i\}_{i=1}^{N}$ sorted in descending order, so that $\{q_i\}_{i=1}^{N}$ satisfies:
$$\rho_{q_1} \ge \rho_{q_2} \ge \cdots \ge \rho_{q_N}$$
the pair $(\rho_i, \delta_i)$ of each data point $x_i$, where $\delta_i$ is the density distance, is calculated from the above formulas, and the $(\rho_i, \delta_i)$ of all data points are plotted on a two-dimensional graph to obtain the decision graph.
In the above technical solution, the Euclidean distance between every two data points in the industrial electrical load historical data set is calculated using the following formula:
$$d_{ij} = \sqrt{\sum_{k} (x_{ik} - x_{jk})^2}$$
where $x_{ik}$ and $x_{jk}$ are the $k$-th dimension elements of the data points $x_i$ and $x_j$ in the industrial electrical load historical data set;
the truncation distance $d_c$ is selected as follows:
the calculated Euclidean distances between every two data points in the industrial electrical load historical data set are sorted in ascending order:
$$d_1 \le d_2 \le \cdots$$
and the truncation distance is taken as $d_c = d_n$, with subscript $n = [0.02N]$, where $[\,\cdot\,]$ is the rounding function.
In the above technical solution, for cases where the cluster centers are difficult to judge by eye in the decision graph, an index $\gamma_i$ that jointly considers the local density and the density distance is defined:
$$\gamma_i = \rho_i \delta_i$$
The index set $\{\gamma_i\}_{i=1}^{N}$ is sorted in descending order and plotted on a two-dimensional graph with $\gamma_i$ as the vertical axis and the data point subscript $i$ in the industrial electrical load historical data set as the horizontal axis. The $\gamma_i$ values of non-center points vary relatively smoothly, whereas there is a jump between the $\gamma_i$ values of the cluster centers and those of the non-center points.
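The jump in the sorted index values can also be located programmatically. The sketch below assumes the index is the product of local density and density distance, as in the standard density peak clustering formulation, and uses a hypothetical rule (the largest drop between consecutive sorted values) to decide where the cluster centers end; the function name `centers_from_gamma` is illustrative only.

```python
import numpy as np

def centers_from_gamma(rho, delta):
    """Pick cluster centers as the points before the largest drop in sorted gamma."""
    gamma = np.asarray(rho, dtype=float) * np.asarray(delta, dtype=float)
    order = np.argsort(-gamma)              # indices in descending gamma order
    sorted_gamma = gamma[order]
    drops = sorted_gamma[:-1] - sorted_gamma[1:]
    k = int(np.argmax(drops)) + 1           # number of points before the jump
    return order[:k], sorted_gamma

# Two points with both high density and large density distance stand out.
centers, sorted_gamma = centers_from_gamma(
    rho=[5.0, 4.0, 1.0, 1.0, 1.0],
    delta=[3.0, 3.0, 0.1, 0.2, 0.1],
)
print(sorted(centers.tolist()))  # [0, 1]
```

The largest-drop rule is only one possible heuristic; inspecting the sorted plot, as the text describes, remains the reference procedure.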
The invention provides a computer-readable storage medium, wherein a DPC-GRNN-based ultra-short-term industrial electric load prediction method program is stored on the computer-readable storage medium, and when being executed by a processor, the DPC-GRNN-based ultra-short-term industrial electric load prediction method program realizes the steps of the DPC-GRNN-based ultra-short-term industrial electric load prediction method according to the technical scheme.
The beneficial effects of the invention are as follows. The invention provides a DPC-GRNN-based ultra-short-term industrial electrical load prediction algorithm that can serve as an important basis and reference for electricity purchasing by large users. Traditional clustering easily falls into local saddle points and depends on the initialization data. The DPC algorithm adopted by the invention converges quickly, is highly robust, and does not require the optimal number of clusters to be set manually. It automatically determines the cluster centers and the number of clusters, quickly searches for and finds the density peaks of the data points, and clusters the original load data more accurately, which makes it well suited to clustering raw data before load prediction for large users and allows users' electricity consumption behavior to be analyzed effectively. Based on the load data analysis results of the DPC algorithm, the invention constructs a GRNN prediction model for each cluster to perform load prediction, which gives the model higher prediction accuracy. When the GRNN model is constructed, K-fold cross-validation is used for training according to the differing sample counts of the clusters, the SPREAD value is selected cyclically, and the GRNN neural network is then built with the optimal value. The higher prediction accuracy can better guide users to purchase electricity rationally. The invention uses preprocessed data sets both when constructing the model and when using the model for prediction, which preserves the periodicity of the load sequence.
The invention analyzes and corrects abnormal values in the data both when the model is constructed and when the model is used for prediction, thereby further improving its prediction accuracy.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a decision diagram of the present embodiment;
FIG. 3 is a diagram illustrating descending order of the decision graph according to the present embodiment;
FIG. 4 is a schematic diagram of the cluster fluctuation in this embodiment.
Detailed Description
The invention will be further described in detail with reference to the following drawings and specific examples, which are not intended to limit the invention, but are for clear understanding.
As shown in fig. 1, the present invention provides a DPC-GRNN-based ultra-short-term industrial electrical load prediction method, which includes the following steps:
S1, preprocessing the industrial electrical load historical data of the site to be predicted to form an industrial electrical load historical data set;
S2, correcting abnormal values in the industrial electrical load historical data set;
S3, normalizing the corrected industrial electrical load historical data set of the site to be predicted;
S4, performing DPC cluster analysis on the normalized industrial electrical load historical data set of the site to be predicted to obtain the corresponding clusters;
S5, constructing a GRNN model for each cluster, and training each GRNN model using the industrial electrical load historical data set corresponding to its cluster as the training set;
S6, calculating the SPREAD value corresponding to each GRNN model based on the current real-time industrial electrical load data set of the site to be predicted, and selecting the GRNN model corresponding to the optimal SPREAD value as the prediction model;
S7, inputting the current real-time industrial electrical load data set of the site to be predicted into the prediction model, and inverse-normalizing the output of the prediction model to obtain the industrial electrical load prediction data set of the site to be predicted.
The ultra-short-term industrial power load prediction method meets the ultra-short-term load prediction requirements of GB/T31464-:
a) Predicting the load 5 min, 10 min, or 15 min after the current moment.
b) On the basis of real-time power utilization load, the ultra-short-term load prediction is completed by combining date types such as working days and rest days and the characteristics of historical load.
The load data used in the specific embodiment of the invention were collected by a cement company in a certain city through a gateway meter. The time span is from May 1, 2018 to December 31, 2018, 245 days in total; data were collected every 15 minutes, giving 96 points per day.
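Assuming the time span runs from 1 May 2018 to 31 December 2018 inclusive, the stated day count and the number of samples per day can be checked with a short calculation. The variable names below are illustrative only.

```python
from datetime import date

start, end = date(2018, 5, 1), date(2018, 12, 31)
days = (end - start).days + 1       # inclusive day count
points_per_day = 24 * 60 // 15      # one sample every 15 minutes
total_points = days * points_per_day
print(days, points_per_day, total_points)  # 245 96 23520
```

Each day therefore contributes one 96-dimensional load vector, which is the shape used later for clustering.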
In short-term load prediction, the power load has a certain periodicity and is also affected by other external factors, so it is necessary to determine a reasonable input set before building the prediction model.
In order to determine a reasonable input set X for the subsequent prediction model, in step S1 the industrial electrical load historical data set is obtained using the following formula:
[Formula image 001 not reproduced in the text]
where $s$ denotes the total number of peaks and troughs of the load sequence in the industrial electrical load historical data; $t_i$ denotes the time sequences corresponding to the peaks and troughs, respectively, with $1 \le i \le s$; $i$ indexes the data points in the industrial electrical load historical data set, $i = 1, 2, 3, \dots, N$; and $N$ denotes the number of data points in the industrial electrical load historical data set.
Abnormal data are inevitable in real data sets. In load prediction work, abnormal data can be divided into two types: missing data and erroneous data. In step S2, the following methods are used to correct the abnormal data.
Missing industrial electrical load data are generally caused by lost operating data records. If the span before and after the missing data is not large, the data can be repaired and filled in by curve fitting. The curve fit is assumed to be defined as:
$$f(y) = a_1 g_1(y) + a_2 g_2(y) + \cdots + a_n g_n(y)$$
where $f$ denotes the abstract curve-fitting function, $y$ denotes the industrial electrical load historical data set to be corrected, $a_1, \dots, a_n$ denote the coefficients to be solved, and $g$ denotes the functions used to solve for the missing data; let $g_1(y) = 1$, $g_2(y) = y$, and so on. Substituting the load data at the moments before and after the missing data, the coefficients $a_1, a_2, \dots, a_n$ can be solved using the least squares principle and the extremum concept, which yields the fitted curve function of the industrial electrical load data. This embodiment uses that function to determine correction values for the missing industrial electrical load data.
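The gap-filling idea can be sketched with a least-squares polynomial fit, which corresponds to choosing the basis functions as powers (1, y, y², and so on). The function name `fill_gaps_polyfit` and the parameters `degree` and `window` are hypothetical, missing samples are assumed to be marked as NaN, and the fit is taken over sample positions around each gap; the patent does not state these details.

```python
import numpy as np

def fill_gaps_polyfit(load, degree=2, window=4):
    """Fill NaN gaps with a least-squares polynomial fitted to nearby samples."""
    load = np.asarray(load, dtype=float).copy()
    t = np.arange(load.size)
    missing = np.isnan(load)
    for i in np.flatnonzero(missing):
        lo, hi = max(i - window, 0), min(i + window + 1, load.size)
        known = ~missing[lo:hi]             # only originally observed samples
        if known.sum() <= degree:
            continue                        # too few neighbours to fit
        # Centre the time axis on the gap so the fit is evaluated at 0.
        coeffs = np.polyfit(t[lo:hi][known] - i, load[lo:hi][known], degree)
        load[i] = np.polyval(coeffs, 0.0)
    return load

series = np.arange(10, dtype=float) ** 2    # 0, 1, 4, ..., 81
series[5] = np.nan                          # simulate a lost record
filled = fill_gaps_polyfit(series)
```

Because the example series is exactly quadratic, the degree-2 fit recovers the missing value of 25 up to rounding error; real load data would only be approximated.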
Data errors are very common in power load prediction and generally appear as a sudden change in load at a certain time. Such sudden changes introduce unnecessary noise when the subsequent model learns the load variation pattern, so the abnormal data need to be checked and processed accordingly. Because manual inspection of such data is very costly, the data can be corrected by the following two methods:
1) Transverse correction of abnormal data
Load data are continuous in the time dimension, and the load data of two adjacent time periods generally do not differ greatly, so abnormal data can be found automatically by comparing whether the data of the two adjacent days are on the same scale. Most load data lie near the fitted line, and only some load data points deviate excessively from it, so the region can be delimited directly according to the following formulas (with calculation coefficients set according to actual operating requirements):
[Formula image 017 not reproduced in the text]
[Formula image 018 not reproduced in the text]
where [formula image 019] is an abnormal point in the industrial load data set, d denotes the current day, d-1 the previous day, t the current time, t-1 the moment before time t, and t+1 the moment after time t. When [formula image 020] holds at the same time, the abnormal data point can be located directly, and the abnormal data are then corrected according to the following formula:
[Formula image 021 not reproduced in the text]
2) Longitudinal correction of abnormal data
If fine-grained load data show an abnormal sudden change, they can be corrected by longitudinal correction using the loads at the same time point on the preceding and following days. The longitudinal correction formula is:
[Formula image 022 not reproduced in the text]
where $\alpha_2$ and $\beta_2$ are calculation coefficients with $\alpha_2 + \beta_2 = 1$; [formula image 023] is the corrected electrical load at time [formula image 024] on day i; [formula image 025] denotes the electrical loads at the same time on the two days before and after the abnormal data; and [formula image 026] is the average of the electrical loads on the two days before and after the abnormal data.
Because the load data in this embodiment are sampled every 15 min, this embodiment uses longitudinal correction for abnormal data. In this embodiment, data with abnormal values were processed with the above data preprocessing method; 26 days contained abnormal values.
In actual prediction, the input set of the model usually contains several quantities with different dimensions (units). To eliminate the influence of the different dimensions on the prediction result, the data are normalized in advance in step S3 to improve the accuracy and efficiency of the model. Typically, the data are normalized to the interval [0, 1]. The normalization formula is as follows:
$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x$ is the industrial load historical data to be normalized, $x^{*}$ denotes the normalized industrial load historical data, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values in the data, respectively.
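Normalization to [0, 1] and the inverse normalization applied later to the model output can be sketched as a pair of functions. Min-max scaling is assumed here because the text refers to the minimum and maximum of the data; the function names are hypothetical.

```python
import numpy as np

def minmax_fit(x):
    """Return the (min, max) pair that defines the scaling."""
    return float(np.min(x)), float(np.max(x))

def normalize(x, x_min, x_max):
    """Map values linearly so that x_min -> 0 and x_max -> 1."""
    return (np.asarray(x, dtype=float) - x_min) / (x_max - x_min)

def denormalize(z, x_min, x_max):
    """Inverse of normalize: map [0, 1] values back to the original units."""
    return np.asarray(z, dtype=float) * (x_max - x_min) + x_min

load = np.array([10.0, 20.0, 30.0])
x_min, x_max = minmax_fit(load)
scaled = normalize(load, x_min, x_max)          # values 0.0, 0.5, 1.0
restored = denormalize(scaled, x_min, x_max)    # back to 10, 20, 30
```

The same `x_min` and `x_max` obtained from the historical data would be reused when inverse-normalizing predictions, so that the outputs come back in the original load units.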
In step S4, the normalized data collected over 219 days at 96 points per day [formula image 032 not reproduced in the text] form 96-dimensional load feature vectors, which are smoothed and then clustered [formula image 033 not reproduced in the text]. The specific steps are as follows:
smoothing the normalized industrial electric load historical data to obtain an industrial electric load historical data set of a to-be-predicted place of the to-be-predicted place
Figure 241182DEST_PATH_IMAGE002
,x i Any data point in the historical data set representing the industrial electrical load; and construct a corresponding set of metrics
Figure 165275DEST_PATH_IMAGE003
(ii) a Namely a load characteristic index; i =1,2, 3.. N, N representing the number of data points in the industrial electrical load historical data set; by d ij Represents the data point x i And x j The Euclidean distance between the two points is the distance between two points in the industrial electrical load data set; for any data point X in industrial electrical load data set X i Two important parameters are defined: local density and density distance.
The local density is usually calculated with either a cut-off kernel or a Gaussian kernel; the cut-off kernel yields discrete values, whereas the Gaussian kernel yields continuous values. Because the original data in this embodiment are continuous, a Gaussian kernel function is used to calculate the local density.
The local density $\rho_i$ of each data point in the industrial electrical load historical data set is calculated from the distance between every two data points:
$$\rho_i = \sum_{j=1,\, j \neq i}^{N} \exp\left[-\left(\frac{d_{ij}}{d_c}\right)^{2}\right]$$
In the formula, $d_{ij}$ represents the Euclidean distance between data points $x_i$ and $x_j$, $d_c$ denotes the truncation distance, and $\rho_i$ characterizes the number of data points in the industrial electrical load data set $X$ whose distance to $x_i$ is less than $d_c$. For large industrial electrical load data sets, the density peak clustering algorithm is robust to the choice of $d_c$.
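A minimal Python sketch of the Gaussian-kernel local density follows. It assumes the standard density peak clustering form, in which every other point contributes exp(-(d_ij/d_c)^2) to the density of point i; the function names, the toy 2-D points, and the value of `d_c` are illustrative assumptions (the embodiment works on 96-dimensional daily load vectors).

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def local_density(points, d_c):
    """Gaussian-kernel local density: rho_i = sum over j != i of exp(-(d_ij / d_c)^2)."""
    n = len(points)
    rho = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            w = math.exp(-(euclidean(points[i], points[j]) / d_c) ** 2)
            rho[i] += w  # the kernel is symmetric, so each pair is
            rho[j] += w  # computed once and credited to both points
    return rho


# Toy data: three nearby points and one isolated point (hypothetical values).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0)]
rho = local_density(points, d_c=0.2)
print(rho)
```

The isolated point receives a density close to zero, while the three neighbouring points receive clearly larger values, which is the behaviour the decision diagram relies on.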
Since the Gaussian kernel function takes continuous values, the probability that different data points have the same local density value is small. Let $\{q_i\}_{i=1}^{N}$ denote the subscripts of the local density set $\{\rho_i\}_{i=1}^{N}$ arranged in descending order, so that $\{q_i\}_{i=1}^{N}$ satisfies:
$$\rho_{q_1} \ge \rho_{q_2} \ge \cdots \ge \rho_{q_N}$$
The density distance $\delta$ is then the minimum Euclidean distance from a data point to the data points ranked before it:
$$\delta_{q_i} = \begin{cases} \min\limits_{1 \le j \le i-1} d_{q_i q_j}, & i \ge 2 \\ \max\limits_{j \ge 2} \delta_{q_j}, & i = 1 \end{cases}$$
The pair $(\rho_i, \delta_i)$ of each data point $x_i$ is calculated by the above formulas. The $(\rho_i, \delta_i)$ of all data points are represented on a two-dimensional graph to obtain the decision diagram. The principle for selecting a cluster center is that both the ρ value and the δ value of the data point are large. After the cluster centers are determined, each remaining data point is assigned to the cluster of its nearest data point with higher density.
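The density distance and the assignment rule can be sketched as follows. This is a hedged illustration of the usual density peak clustering procedure: the convention that the densest point takes the largest δ of the other points, the function names, the 1-D toy data, and the hand-picked `rho` values and centers are all assumptions for the example.

```python
def density_distance(dist, rho):
    """Return (delta, nearest_denser, order) for a distance matrix and densities.

    order lists point indices by descending density; nearest_denser[i] is the
    index of the closest point with higher density (-1 for the densest point).
    Assumes at least two points.
    """
    n = len(rho)
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n
    nearest_denser = [-1] * n
    for rank in range(1, n):
        i = order[rank]
        j_best = min(order[:rank], key=lambda j: dist[i][j])
        delta[i] = dist[i][j_best]
        nearest_denser[i] = j_best
    # Convention assumed here: the densest point gets the largest delta of the rest.
    delta[order[0]] = max(delta[i] for i in order[1:])
    return delta, nearest_denser, order


def assign_clusters(centers, nearest_denser, order):
    """Label centers 0..k-1, then let every other point inherit its denser neighbour's label."""
    label = {c: k for k, c in enumerate(centers)}
    for i in order:  # descending density, so the denser neighbour is already labelled
        if i not in label:
            if nearest_denser[i] == -1:
                raise ValueError("the densest point must be one of the centers")
            label[i] = label[nearest_denser[i]]
    return [label[i] for i in range(len(order))]


# Toy 1-D data with two groups; rho values are hypothetical.
xs = [0.0, 0.1, 0.2, 5.0, 5.1]
dist = [[abs(a - b) for b in xs] for a in xs]
rho = [2.0, 3.0, 2.5, 1.0, 1.5]
delta, nearest_denser, order = density_distance(dist, rho)
labels = assign_clusters([1, 4], nearest_denser, order)
print(labels)
```

Points 1 and 4 combine a high density with a large δ in this toy set, so they play the role of the points at the upper right of the decision diagram.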
Specifically, the Euclidean distance between every two data points in the industrial electrical load historical data set is calculated by the following formula:
$$d_{ij} = \sqrt{\sum_{k} \left(x_{ik} - x_{jk}\right)^{2}}$$
in the formula, $x_{ik}$ and $x_{jk}$ are the $k$-th dimension elements of data points $x_i$ and $x_j$ in the industrial electrical load historical data set.
The truncation distance $d_c$ is selected as follows:
the calculated Euclidean distances between every two data points in the industrial electrical load historical data set are arranged in ascending order, $d_1 \le d_2 \le \cdots$; the truncation distance is taken as $d_c = d_n$, with subscript $n = [0.02N]$, where $[\,\cdot\,]$ is the rounding function.
For the case in which the cluster centers are difficult to judge by eye in the decision diagram, an index $\gamma_i$ that jointly considers the local density and the density distance is defined:
$$\gamma_i = \rho_i \delta_i$$
The larger $\gamma_i$ is, the more likely the data point is a cluster center. The index data set $\{\gamma_i\}_{i=1}^{N}$ is therefore arranged in descending order and plotted on a two-dimensional coordinate graph with $\gamma_i$ as the vertical axis and the data point subscript $i$ in the industrial electrical load historical data set as the horizontal axis. The $\gamma_i$ values of non-center points vary relatively smoothly, whereas there is a jump, discernible to the naked eye, between the $\gamma_i$ values of cluster centers and those of non-center points.
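The γ ranking can be sketched as below. The text identifies the jump by eye on the plotted curve; the automatic rule used here (cut at the largest drop between consecutive sorted γ values) is a stand-in assumed for the example, and the function names and sample values are hypothetical.

```python
def rank_by_gamma(rho, delta):
    """Return gamma_i = rho_i * delta_i and the point indices sorted by descending gamma."""
    gamma = [r * d for r, d in zip(rho, delta)]
    order = sorted(range(len(gamma)), key=lambda i: -gamma[i])
    return gamma, order


def centers_by_largest_jump(gamma, order):
    """Assumed heuristic: centers are the points ranked before the largest drop in sorted gamma."""
    sorted_gamma = [gamma[i] for i in order]
    drops = [sorted_gamma[k] - sorted_gamma[k + 1] for k in range(len(sorted_gamma) - 1)]
    cut = max(range(len(drops)), key=lambda k: drops[k])
    return order[: cut + 1]


# Hypothetical densities and density distances for five points.
rho = [3.0, 2.5, 2.0, 2.8, 1.0]
delta = [4.9, 0.1, 0.1, 4.9, 0.1]
gamma, order = rank_by_gamma(rho, delta)
centers = centers_by_largest_jump(gamma, order)
print(centers)
```

The largest-drop rule is fragile when the centers themselves have very different γ values, so a fixed threshold read off the plot is a reasonable alternative.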
The decision diagram obtained in this embodiment is shown in fig. 2, in which 4 points have larger ρ and δ values. After the index $\gamma_i$ is calculated, arranged in descending order, and plotted, FIG. 3 is obtained; it shows that these 4 points are separated from the other points by an obvious jump at γ ≈ 0.13. Therefore, there are 4 cluster centers in total.
The daily load fluctuation reflected by each cluster in the present embodiment is shown in fig. 4; since the load data have been normalized to the interval [0,1], the vertical axis range is [0,1].
The four fluctuation situations in fig. 4 basically cover the load fluctuation situation of the cement industry under various production conditions. The class cluster 1 reflects the load condition of production reduction and even production stop, the class clusters 2 and 4 reflect the load characteristic of reducing the power consumption cost by adopting a peak avoidance method under the normal production condition, and the class cluster 3 reflects the load characteristic of an enterprise during all-weather full-load production.
In step S5, the present embodiment is oriented to prediction from small-sample load data. The GRNN algorithm is an improved radial basis function network with strong nonlinear mapping capability, good fault tolerance, and high robustness, and it retains high prediction accuracy when few samples are available. The GRNN model structure used in this embodiment therefore has four layers, namely an input layer, a pattern layer, a summation layer, and an output layer. The input layer and the output layer each have 96 neurons.
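The four-layer structure can be made concrete with a small sketch of a GRNN forward pass. It uses the textbook Gaussian form, in which the pattern layer computes exp(-||x - x_i||^2 / (2·spread^2)); toolbox implementations may scale the spread differently, so this is an assumption rather than the embodiment's exact model. The function name, the fallback for underflow, and the 1-dimensional toy data are also assumptions (the embodiment uses 96 inputs and 96 outputs).

```python
import math


def grnn_predict(train_x, train_y, x, spread):
    """Predict the output vector for input x from stored training pairs."""
    # Pattern layer: one Gaussian unit per training sample.
    sq_dists = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in train_x]
    weights = [math.exp(-sq / (2.0 * spread ** 2)) for sq in sq_dists]
    # Summation layer: weighted sums of the targets and the sum of the weights.
    denom = sum(weights)
    if denom == 0.0:
        # Every kernel underflowed: fall back to the nearest training sample.
        nearest = min(range(len(train_x)), key=lambda i: sq_dists[i])
        return list(train_y[nearest])
    dim = len(train_y[0])
    numer = [sum(w * y[k] for w, y in zip(weights, train_y)) for k in range(dim)]
    # Output layer: normalised weighted average.
    return [v / denom for v in numer]


train_x = [[0.0], [1.0]]
train_y = [[0.0], [1.0]]
midpoint = grnn_predict(train_x, train_y, [0.5], spread=0.5)
at_sample = grnn_predict(train_x, train_y, [0.0], spread=0.1)
print(midpoint, at_sample)
```

There is no iterative weight training in this form: the training samples are stored in the pattern layer, and the spread is the only parameter that has to be tuned.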
Because the number of samples in some clusters is small, the invention trains the neural network by a cross-validation method, specifically as follows:
forming a sample set corresponding to each class cluster based on the data point corresponding to each class cluster;
performing K-fold cross validation on the samples of each class cluster, in which the sample set of a given class cluster is divided into K sub-sample sets, wherein K is a positive integer;
taking one sub-sample set as a test set in turn, taking the rest K-1 sub-samples as a training set, training the GRNN model of the cluster, and repeating for K times;
in the K training processes of the GRNN model of any one class cluster, circularly selecting the SPREAD value of the GRNN model during each training;
and selecting a GRNN model generated by the training set and the SPREAD value corresponding to the minimum mean square error as the GRNN model of the cluster.
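The cross-validation steps above can be sketched as a loop over folds and candidate SPREAD values. The sketch assumes contiguous (unshuffled) folds, a scalar-output GRNN in the textbook Gaussian form, and synthetic sine data; all function names and data are illustrative. The SPREAD grid 0.1 to 2.0 with step 0.1 and K = 4 mirror the values used for cluster 2 in the embodiment.

```python
import math


def grnn_predict(train_x, train_y, x, spread):
    """Scalar-output GRNN: Gaussian-weighted average of the training targets."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2.0 * spread ** 2))
        for xi in train_x
    ]
    denom = sum(weights)
    if denom == 0.0:
        return sum(train_y) / len(train_y)  # all kernels underflowed
    return sum(w * y for w, y in zip(weights, train_y)) / denom


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for f in range(k):
        size = n // k + (1 if f < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def select_spread(xs, ys, k, spreads):
    """Return (mse, fold, spread) with the smallest test MSE over all folds and spreads."""
    best = None
    for f, test_idx in enumerate(k_fold_indices(len(xs), k)):
        held_out = set(test_idx)
        tr_x = [xs[i] for i in range(len(xs)) if i not in held_out]
        tr_y = [ys[i] for i in range(len(xs)) if i not in held_out]
        for s in spreads:
            errs = [(grnn_predict(tr_x, tr_y, xs[i], s) - ys[i]) ** 2 for i in test_idx]
            mse = sum(errs) / len(errs)
            if best is None or mse < best[0]:
                best = (mse, f, s)
    return best


spreads = [round(0.1 * i, 1) for i in range(1, 21)]  # 0.1, 0.2, ..., 2.0
xs = [[i / 11.0] for i in range(12)]                 # synthetic inputs
ys = [math.sin(2 * math.pi * x[0]) for x in xs]      # synthetic targets
best_mse, best_fold, best_spread = select_spread(xs, ys, k=4, spreads=spreads)
print(best_mse, best_fold, best_spread)
```

The returned fold index identifies the training set that goes with the best SPREAD value, matching the last step of the list, in which both are kept for the cluster's model.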
The SPREAD value is an important tuning parameter of the generalized regression neural network, and whether it is chosen reasonably directly affects the accuracy of the prediction result. The larger the SPREAD value, the better the neurons are guaranteed to respond to the region covered by the input vectors; however, if the SPREAD value is too large, numerical calculation becomes more difficult, and the approximation produced by the neural network becomes overly smooth over the data samples, so that the error becomes larger. Therefore, in order to fit the data more closely, the optimal SPREAD value is selected by looping over candidate SPREAD values.
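The smoothing effect of a large SPREAD value can be seen in a small numerical experiment. The sketch assumes the textbook Gaussian GRNN form with scalar inputs and outputs; the function name and the alternating toy targets are hypothetical.

```python
import math


def grnn_scalar(train_x, train_y, x, spread):
    """Gaussian-weighted average of the training targets at query point x."""
    w = [math.exp(-((x - xi) ** 2) / (2.0 * spread ** 2)) for xi in train_x]
    return sum(wi * yi for wi, yi in zip(w, train_y)) / sum(w)


train_x = [0.0, 0.25, 0.5, 0.75, 1.0]
train_y = [0.0, 1.0, 0.0, 1.0, 0.0]  # alternating targets (toy data)
mean_y = sum(train_y) / len(train_y)

# Query at a training input whose target is 1.0.
narrow = grnn_scalar(train_x, train_y, 0.25, spread=0.05)
wide = grnn_scalar(train_x, train_y, 0.25, spread=5.0)
print(narrow, wide, mean_y)
```

With the narrow kernel the prediction stays close to the stored target, while the wide kernel weights all samples almost equally and the prediction collapses toward the mean of the targets.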
Taking the representative cluster 2 as an example, it contains 48 samples in total; 4-fold cross validation is performed, the value range of the SPREAD value is set to [0.1,2] with a step length of 0.1, and the mean square error (MSE) is used as the evaluation index of the output result. The results are shown in Table 1.
[Table 1: MSE of cluster 2 for each cross-validation run and SPREAD value; table image in the original publication]
As shown in Table 1, the MSE is smallest in the 4th cross validation when the SPREAD value is 1.5. Therefore, for class cluster 2, the GRNN model constructed with the training set used in the 4th validation and a SPREAD value of 1.5 has the best prediction effect. When GRNN prediction models are constructed for the other 3 clusters, the optimal training set and the optimal SPREAD value are selected by the same method.
In step S6, the current real-time industrial electrical load data of the site to be predicted are first preprocessed, corrected for abnormal values, and normalized using the same method steps as steps S1 to S3, so as to obtain the current real-time industrial electrical load data set of the site to be predicted.
Then, using the same method steps as step S5 and the cross-validation method, the SPREAD value corresponding to each GRNN model is calculated based on the current real-time industrial electrical load data set of the site to be predicted, and the GRNN model corresponding to the optimal SPREAD value is selected as the prediction model.
In step S7, after the prediction model outputs the prediction result, inverse normalization must be performed:
$$y = y^{*}\left(x_{\max} - x_{\min}\right) + x_{\min}$$
in the formula, $y^{*}$ is the normalized load prediction value, $y$ is the predicted value of the actual power load obtained after inverse normalization, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the data used for the normalization in step S3.
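The inverse normalization is the algebraic inverse of the min-max scaling of step S3 and can be sketched as follows. The function name and the numeric values are hypothetical; the extrema are assumed to be the ones stored when the historical data were normalized.

```python
def denormalize(scaled, x_min, x_max):
    """Map normalized predictions in [0, 1] back to the original load scale."""
    return [v * (x_max - x_min) + x_min for v in scaled]


# Hypothetical normalized predictions and extrema (kW) from the normalization step.
pred_scaled = [0.0, 0.5, 1.0]
pred_kw = denormalize(pred_scaled, x_min=420.0, x_max=600.0)
print(pred_kw)
```

Error metrics such as the MSE can then be computed either on the normalized values or on the restored values, as long as prediction and measurement are on the same scale.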
In the present embodiment, Mean Square Error (MSE) and Mean Square Error (MSE) are used to evaluate the prediction accuracy, and the prediction effect evaluation is shown in table 2.
[Table 2: prediction effect evaluation; table image in the original publication]
Therefore, the prediction precision of the method meets the requirement of practical application.
Those not described in detail in this specification are within the skill of the art.

Claims (10)

1. A DPC-GRNN-based ultra-short-term industrial electrical load prediction method is characterized by comprising the following steps:
preprocessing industrial electric load historical data of a place to be predicted to form an industrial electric load historical data set,
correcting abnormal values in the industrial electrical load historical data set;
carrying out normalization processing on the corrected industrial power load historical data set of the site to be predicted;
DPC clustering analysis is carried out on the normalized industrial power load historical data set of the site to be predicted, and a corresponding cluster is obtained;
respectively constructing a GRNN model for each class cluster, and training the corresponding GRNN model by using the industrial electrical load historical data set corresponding to each class cluster as a training set;
calculating the SPREAD value corresponding to each GRNN model based on the current real-time industrial electrical load data set of the place to be predicted, and selecting the GRNN model corresponding to the optimal SPREAD value as a prediction model;
inputting the current real-time industrial electrical load data set of the site to be predicted into the prediction model, and performing inverse normalization processing on the output of the prediction model to obtain the industrial electrical load prediction data set of the site to be predicted.
2. The DPC-GRNN-based ultra-short-term industrial electrical load prediction method of claim 1, wherein: the industrial electrical load historical data of the site to be predicted is a time sequence of the industrial electrical load collected according to a set time period; the current real-time industrial electrical load data of the site to be predicted is a time sequence containing the current industrial electrical load data.
3. The DPC-GRNN-based ultra-short-term industrial electrical load prediction method of claim 1, wherein: preprocessing current real-time industrial electrical load data of a place to be predicted to form an industrial electrical load data set, and correcting abnormal values in the industrial electrical load data set; and inputting the corrected industrial electrical load data set into a prediction model.
4. The DPC-GRNN-based ultra-short-term industrial electrical load prediction method of claim 1, wherein: and calculating to obtain an industrial electrical load historical data set by adopting the following formula:
Figure 340474DEST_PATH_IMAGE001
wherein s represents the total number of peaks and troughs in the load sequence of the industrial electrical load history, and t i Representing time sequences corresponding to the wave crest and the wave trough respectively, wherein i represents the number of data points in the industrial electrical load historical data set, and i =1,2, 3.. N, and N represents the number of the data points in the industrial electrical load historical data set; wherein 1 ≦ i ≦ s.
5. The DPC-GRNN-based ultra-short-term industrial electrical load prediction method of claim 1, wherein: the process for correcting the abnormal value of the industrial electrical load historical data set of the site to be predicted comprises the following steps:
repairing and filling the industrial electrical load historical data set of the site to be predicted in a curve fitting mode;
automatically finding abnormal data and performing transverse correction by comparing whether the historical data of the industrial electric load in the two time periods before and after are in the same dimension;
and correcting abnormal mutation of the historical data of the industrial electric load with fine granularity by a longitudinal correction method based on the same time point data of the previous time period and the next time period.
6. The DPC-GRNN-based ultra-short-term industrial electrical load prediction method of claim 1, wherein:
DPC cluster analysis is carried out on the load feature vectors, and the process of obtaining the corresponding class clusters comprises the following steps: smoothing the normalized industrial electrical load historical data to obtain the industrial electrical load historical data set;
calculating the local density of each data point in the industrial electrical load historical data set based on the Euclidean distance between every two data points;
performing descending order arrangement on the obtained local density of each data point, and forming a sequence number set based on the sequence number distribution of the local density in the sequence;
calculating the minimum value of the Euclidean distance between each data point in the sequence number set and other data points as the density distance of the data points in the industrial electrical load historical data set corresponding to the sequence number;
plotting a decision graph based on the local density and density distance of each data point in the historical data set of the industrial electrical load;
selecting the data points which are located at the upper right of the decision graph and are clearly separated from the other data points as cluster centers;
the remaining data points are assigned to the cluster of classes in which the closest and locally higher density data points are located.
7. The DPC-GRNN-based ultra-short-term industrial electrical load prediction method of claim 6, wherein: the process of training the corresponding GRNN model by using the industrial electrical load historical data corresponding to each class cluster as a training set comprises the following steps:
forming a sample set corresponding to each class cluster based on the data point corresponding to each class cluster;
performing K-fold cross validation on the samples of each class cluster, in which the sample set of a given class cluster is divided into K sub-sample sets, wherein K is a positive integer;
taking one sub-sample set as a test set in turn, taking the rest K-1 sub-samples as a training set, training the GRNN model of the cluster, and repeating for K times;
in the K training processes of the GRNN model of any one class cluster, circularly selecting the SPREAD value of the GRNN model during each training;
and selecting a GRNN model generated by the training set and the SPREAD value corresponding to the minimum mean square error as the GRNN model of the cluster.
8. The DPC-GRNN-based ultra-short-term industrial electrical load prediction method of claim 6, wherein:
defining the industrial electrical load historical data set of the site to be predicted as $X = \{x_1, x_2, \ldots, x_N\}$, where $x_i$ represents any data point in the industrial electrical load historical data set, and constructing the corresponding index set $I_X = \{1, 2, \ldots, N\}$; $i = 1, 2, 3, \ldots, N$, where $N$ represents the number of data points in the industrial electrical load historical data set;
calculating the local density $\rho_i$ of each data point in the industrial electrical load historical data set based on the distance between every two data points:
$$\rho_i = \sum_{j=1,\, j \neq i}^{N} \exp\left[-\left(\frac{d_{ij}}{d_c}\right)^{2}\right]$$
in the formula, $d_{ij}$ represents the Euclidean distance between data points $x_i$ and $x_j$, $d_c$ denotes the truncation distance, and $\rho_i$ characterizes the number of data points in the industrial electrical load data set $X$ whose distance to $x_i$ is less than $d_c$;
letting $\{q_i\}_{i=1}^{N}$ denote the subscripts of the local density set $\{\rho_i\}_{i=1}^{N}$ arranged in descending order, so that $\{q_i\}_{i=1}^{N}$ satisfies:
$$\rho_{q_1} \ge \rho_{q_2} \ge \cdots \ge \rho_{q_N}$$
and defining the density distance $\delta$ as
$$\delta_{q_i} = \begin{cases} \min\limits_{1 \le j \le i-1} d_{q_i q_j}, & i \ge 2 \\ \max\limits_{j \ge 2} \delta_{q_j}, & i = 1 \end{cases}$$
calculating $(\rho_i, \delta_i)$ of each data point $x_i$ by the above formulas, and representing $(\rho_i, \delta_i)$ of all data points on a two-dimensional graph to obtain the decision diagram.
9. The DPC-GRNN-based ultra-short-term industrial electrical load prediction method of claim 8, wherein: the Euclidean distance between every two data points in the industrial electrical load historical data set is calculated by adopting the following formula:
$$d_{ij} = \sqrt{\sum_{k} \left(x_{ik} - x_{jk}\right)^{2}}$$
in the formula, $x_{ik}$ and $x_{jk}$ are the $k$-th dimension elements of data points $x_i$ and $x_j$ in the industrial electrical load historical data set;
the truncation distance $d_c$ is selected as follows:
the calculated Euclidean distances between every two data points in the industrial electrical load historical data set are arranged in ascending order, $d_1 \le d_2 \le \cdots$; the truncation distance is taken as $d_c = d_n$, with subscript $n = [0.02N]$, where $[\,\cdot\,]$ is the rounding function.
10. The DPC-GRNN-based ultra-short-term industrial electrical load prediction method of claim 8, wherein: for the case in which the cluster centers are difficult to judge by eye in the decision diagram, an index $\gamma_i$ that jointly considers the local density and the density distance is defined:
$$\gamma_i = \rho_i \delta_i$$
the index data set $\{\gamma_i\}_{i=1}^{N}$ is arranged in descending order and plotted on a two-dimensional coordinate graph with $\gamma_i$ as the vertical axis and the data point subscript $i$ in the industrial electrical load historical data set as the horizontal axis; the $\gamma_i$ values of non-center points vary relatively smoothly, and there is a jump between the $\gamma_i$ values of cluster centers and those of non-center points.
CN202210795337.0A 2022-07-07 2022-07-07 DPC-GRNN-based ultra-short-term industrial electrical load prediction method Pending CN114881372A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210795337.0A CN114881372A (en) 2022-07-07 2022-07-07 DPC-GRNN-based ultra-short-term industrial electrical load prediction method

Publications (1)

Publication Number Publication Date
CN114881372A true CN114881372A (en) 2022-08-09

Family

ID=82683387

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210795337.0A Pending CN114881372A (en) 2022-07-07 2022-07-07 DPC-GRNN-based ultra-short-term industrial electrical load prediction method

Country Status (1)

Country Link
CN (1) CN114881372A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111598151A (en) * 2020-05-12 2020-08-28 辽宁工程技术大学 Method for predicting user electricity load
CN114580968A (en) * 2022-03-29 2022-06-03 广东电网有限责任公司 Power utilization management method, device, equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李钢等: "《基于改进密度峰值聚类的超短期工业负荷预测》", 《电测与仪表》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220809
