CN114881372A - DPC-GRNN-based ultra-short-term industrial electrical load prediction method - Google Patents
- Publication number
- CN114881372A (CN202210795337.0A)
- Authority
- CN
- China
- Prior art keywords
- electrical load
- industrial electrical
- data set
- historical data
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06393—Score-carding, benchmarking or key performance indicator [KPI] analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
The invention provides a DPC-GRNN-based ultra-short-term industrial electrical load prediction method, which comprises the following steps: preprocessing the industrial electrical load historical data of a site to be predicted to form an industrial electrical load historical data set, and correcting abnormal values in that data set; normalizing the corrected industrial electrical load historical data set; performing DPC cluster analysis on the normalized data set to obtain the corresponding clusters; constructing a GRNN model for each cluster; calculating the SPREAD value corresponding to each GRNN model, and selecting the GRNN model corresponding to the optimal SPREAD value as the prediction model; and inputting the current real-time industrial electrical load data set of the site into the prediction model, and performing inverse normalization on the output of the prediction model to obtain the industrial electrical load prediction data set of the site. The invention improves the prediction accuracy of the ultra-short-term industrial electrical load.
Description
Technical Field
The invention belongs to the technical field of industrial electricity, and particularly relates to a DPC-GRNN-based ultra-short-term industrial electricity load prediction algorithm.
Background
To resolve the prominent contradictions and deep-seated problems of the power industry, structural transformation and industrial upgrading are being promoted. It is therefore of great significance to mine users' electricity consumption data and behavior, to understand their consumption patterns, and to carry out accurate load prediction.
High-energy-consumption industries have a large total load with strong volatility and a certain amount of impact load; they strongly affect the power system and threaten its safety, stability, and power quality. Scientifically analyzing the factors that influence the load of high-energy-consumption industries therefore improves the accuracy of load prediction, allows production to be adjusted accordingly, and helps guarantee the stable operation of the power system.
Currently, load prediction for high-energy-consumption industrial users focuses mainly on medium- and long-term load prediction. The main methods and their disadvantages include:
(1) Constructing a load prediction model with a classified-modeling approach aimed at the load fluctuation characteristics of high-energy-consumption industrial users; however, the classification principle relies on subjective judgment.
(2) Using an FCM clustering method to study the load characteristics of industrial users; however, the FCM clustering algorithm is prone to falling into a local saddle point, so the prediction accuracy is not high.
(3) Establishing an industrial electrical load prediction model with a BP neural network optimized by a genetic membrane algorithm; the prediction model is highly subjective and its prediction accuracy is low.
Disclosure of Invention
The invention aims to overcome the defects of the background art by providing a DPC-GRNN-based ultra-short-term industrial electrical load prediction algorithm. It clusters the load data with the DPC algorithm, which gives a better clustering effect, and then establishes a separate prediction model for each resulting cluster, so that the prediction accuracy is higher.
The technical scheme adopted by the invention is as follows: a DPC-GRNN-based ultra-short-term industrial electrical load prediction method comprises the following steps:
preprocessing the industrial electrical load historical data of the site to be predicted to form an industrial electrical load historical data set;
correcting abnormal values in the industrial electrical load historical data set;
normalizing the corrected industrial electrical load historical data set of the site to be predicted;
performing DPC cluster analysis on the normalized industrial electrical load historical data set of the site to be predicted to obtain the corresponding clusters;
constructing a GRNN model for each cluster, and training each GRNN model with the industrial electrical load historical data set of its cluster as the training set;
calculating the SPREAD value corresponding to each GRNN model based on the current real-time industrial electrical load data set of the site to be predicted, and selecting the GRNN model corresponding to the optimal SPREAD value as the prediction model;
and inputting the current real-time industrial electrical load data set of the site to be predicted into the prediction model, and performing inverse normalization on the output of the prediction model to obtain the industrial electrical load prediction data set of the site to be predicted.
In the technical scheme, the historical data of the industrial electric load of the place to be predicted is a time sequence of the industrial electric load acquired according to a set time period; the current real-time industrial electrical load data of the site to be predicted is a time sequence containing the current industrial electrical load data.
In the technical scheme, the current real-time industrial electrical load data of the site to be predicted are preprocessed to form an industrial electrical load data set, and abnormal values in the industrial electrical load data set are corrected; and inputting the corrected industrial electrical load data set into a prediction model.
In the technical scheme, the industrial electrical load historical data set is obtained by adopting the following formula:
where $s$ is the total number of peaks and troughs in the historical industrial electrical load sequence; $t_i$ is the time sequence corresponding to each peak and trough; $i$ indexes the data points in the industrial electrical load historical data set, $i = 1, 2, 3, \dots, N$, with $N$ the number of data points in the set; and $1 \le i \le s$.
In the above technical solution, the process of correcting the abnormal value of the industrial electrical load historical data set of the location to be predicted includes:
repairing and filling the industrial electrical load historical data set of the site to be predicted in a curve fitting mode;
automatically finding abnormal data and performing transverse correction by comparing whether the historical data of the industrial electric load in the two time periods before and after are in the same dimension;
and correcting abnormal mutation of the historical data of the industrial electric load with fine granularity by a longitudinal correction method based on the same time point data of the previous time period and the next time period.
In the above technical solution, the process of performing DPC cluster analysis on the load feature vectors to obtain the corresponding clusters includes: smoothing the normalized industrial electrical load historical data to obtain the industrial electrical load historical data set to be clustered;
calculating the local density of each data point in the industrial electrical load historical data set based on the Euclidean distance between every two data points;
arranging the local densities of the data points in descending order, and forming a sequence number set from the positions of the local densities in that order;
for each data point in the sequence number set, taking the minimum Euclidean distance to the data points of higher local density as the density distance of that data point;
plotting a decision graph based on the local density and the density distance of each data point in the industrial electrical load historical data set;
selecting the data points that lie in the upper right of the decision graph, clearly apart from the other points, as cluster centers;
assigning each remaining data point to the cluster of its nearest data point of higher local density.
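The clustering steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `dpc_cluster`, the Gaussian-kernel density, the 2% rule applied to the sorted list of pair distances, and the choice of centers by the largest product of density and density distance (instead of reading the decision graph by eye) are all assumptions made for the example.

```python
import numpy as np

def dpc_cluster(X, n_centers, dc_fraction=0.02):
    """Sketch of density peak clustering; returns (labels, centers, rho, delta)."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Euclidean distance between every two data points.
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    # Truncation distance: value at the dc_fraction position of the sorted
    # pair distances (assumed selection rule).
    pair_d = np.sort(d[np.triu_indices(n, k=1)])
    dc = pair_d[max(int(round(dc_fraction * len(pair_d))), 1) - 1]
    # Gaussian-kernel local density; subtracting 1 removes the j == i term.
    rho = np.exp(-(d / dc) ** 2).sum(axis=1) - 1.0
    order = np.argsort(-rho)  # indices in descending order of density
    delta = np.zeros(n)
    nearest_higher = np.full(n, -1)
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]  # points with higher (or tied) density
        j = higher[np.argmin(d[i, higher])]
        delta[i], nearest_higher[i] = d[i, j], j
    # Common convention for the densest point: its largest distance.
    delta[order[0]] = d[order[0]].max()
    # Centers: largest rho * delta (stand-in for visual selection).
    centers = np.argsort(-(rho * delta))[:n_centers]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_centers)
    # Visit points in descending density, so the nearest higher-density
    # neighbor of each point has already been labeled.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels, centers, rho, delta
```

For daily load curves, each row of `X` would be one day's 96-point load vector; `n_centers` is supplied by the caller here, whereas the text selects the centers from the decision graph.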
In the above technical solution, the process of training the corresponding GRNN model by using the industrial electrical load historical data corresponding to each class cluster as a training set includes:
forming a sample set corresponding to each class cluster based on the data point corresponding to each class cluster;
performing K-fold cross-validation according to the number of samples of each cluster, dividing the sample set of a cluster into K sub-sample sets, where K is a positive integer;
taking each sub-sample set in turn as the test set and the remaining K-1 sub-sample sets as the training set, training the GRNN model of the cluster, and repeating K times;
in each of the K training runs of the GRNN model of any cluster, cycling through candidate SPREAD values of the GRNN model;
and selecting the GRNN model generated by the training set and SPREAD value that yield the minimum mean square error as the GRNN model of the cluster.
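The training procedure above can be sketched as follows. This is an illustrative sketch only: the GRNN is written in its textbook form as a Gaussian-weighted average of training targets, the SPREAD value is assumed to play the role of the Gaussian width, and the names `grnn_predict` and `select_spread`, the candidate list, and the fold shuffling are hypothetical choices for the example.

```python
import numpy as np

def grnn_predict(X_train, y_train, X_query, spread):
    """Textbook GRNN output: Gaussian-weighted average of training targets."""
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-d2 / (2.0 * spread ** 2))
    # The small constant guards against division by zero if all weights underflow.
    return (w @ y_train) / (w.sum(axis=1) + 1e-12)

def select_spread(X, y, spreads, k=5, seed=0):
    """Try every SPREAD value in every fold; keep the pair with minimum MSE."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    best_spread, best_mse, best_train_idx = None, np.inf, None
    for f in range(k):
        test_idx = folds[f]
        train_idx = np.concatenate([folds[g] for g in range(k) if g != f])
        for spread in spreads:
            pred = grnn_predict(X[train_idx], y[train_idx], X[test_idx], spread)
            mse = float(np.mean((pred - y[test_idx]) ** 2))
            if mse < best_mse:
                best_spread, best_mse, best_train_idx = spread, mse, train_idx
    return best_spread, best_mse, best_train_idx
```

The returned training indices and SPREAD value together define the model kept for the cluster. Averaging the error over folds before comparing SPREAD values is a common alternative, but the sketch follows the text and keeps the single best training set and value.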
In the above technical scheme, the normalized industrial electrical load historical data is smoothed to obtain the industrial electrical load historical data set of the site to be predicted, $X = \{x_i\}_{i=1}^{N}$, where $x_i$ is any data point in the set, and the corresponding index set $I_X = \{1, 2, \dots, N\}$ is constructed, where $N$ is the number of data points in $X$;
calculating the local density $\rho_i$ of each data point in the industrial electrical load historical data set based on the distance between every two data points:
$$\rho_i = \sum_{j \ne i} \exp\left[-\left(\frac{d_{ij}}{d_c}\right)^{2}\right]$$
where $d_{ij}$ is the Euclidean distance between data points $x_i$ and $x_j$, $d_c$ is the truncation distance, and $\rho_i$ reflects the number of points in the data set whose distance to $x_i$ is less than $d_c$;
letting $\{q_i\}_{i=1}^{N}$ denote the indices of the local density set $\{\rho_i\}_{i=1}^{N}$ in descending order, so that
$$\rho_{q_1} \ge \rho_{q_2} \ge \dots \ge \rho_{q_N};$$
plotting all data points as pairs $(\rho_i, \delta_i)$, where $\delta_i$ is the density distance, in a two-dimensional graph to obtain the decision graph.
In the above technical solution, the Euclidean distance between every two data points in the industrial electrical load historical data set is calculated by the following formula:
$$d_{ij} = \sqrt{\sum_{k} \left(x_{ik} - x_{jk}\right)^{2}}$$
where $x_{ik}$ and $x_{jk}$ are the $k$-th dimension elements of the data points $x_i$ and $x_j$ in the industrial electrical load historical data set;
the truncation distance $d_c$ is selected as follows:
the Euclidean distances between every two data points in the industrial electrical load historical data set are arranged in ascending order, $d_1 \le d_2 \le \dots$; the truncation distance is taken as $d_c = d_n$, with subscript $n = [0.02N]$, where $[\,\cdot\,]$ is the rounding function.
In the above technical scheme, for cases where the cluster centers are difficult to judge by eye in the decision graph, an index $\gamma_i$ that jointly considers the local density and the density distance is defined:
$$\gamma_i = \rho_i \delta_i$$
The index set $\{\gamma_i\}_{i=1}^{N}$ is arranged in descending order and plotted in a two-dimensional graph with $\gamma_i$ on the vertical axis and the data point subscript $i$ on the horizontal axis. The $\gamma_i$ values of non-center points vary smoothly, whereas there is a jump between the $\gamma_i$ values of the cluster centers and those of the non-center points.
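A short sketch of how such a combined index could be used. It assumes the common choice of the index as the product of local density and density distance, and it locates the jump with a simple largest-drop heuristic; both the heuristic and the function names are assumptions for illustration, since the text describes the jump only qualitatively.

```python
import numpy as np

def rank_by_gamma(rho, delta):
    """Return point indices and gamma values sorted in descending gamma."""
    gamma = np.asarray(rho, dtype=float) * np.asarray(delta, dtype=float)
    order = np.argsort(-gamma)
    return order, gamma[order]

def count_centers_by_largest_jump(gamma_sorted):
    """Number of leading points that sit above the largest drop in gamma."""
    drops = gamma_sorted[:-1] - gamma_sorted[1:]
    return int(np.argmax(drops)) + 1

# Toy values: two points combine high density with a large density distance.
rho = [5.0, 1.0, 4.0, 1.0, 1.0]
delta = [10.0, 0.5, 8.0, 0.4, 0.3]
order, gamma_sorted = rank_by_gamma(rho, delta)
n_centers = count_centers_by_largest_jump(gamma_sorted)
center_indices = order[:n_centers]
```

In practice the sorted values would be plotted and inspected, as the text describes; the heuristic only automates the reading of an obvious jump.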
The invention provides a computer-readable storage medium, wherein a DPC-GRNN-based ultra-short-term industrial electric load prediction method program is stored on the computer-readable storage medium, and when being executed by a processor, the DPC-GRNN-based ultra-short-term industrial electric load prediction method program realizes the steps of the DPC-GRNN-based ultra-short-term industrial electric load prediction method according to the technical scheme.
The beneficial effects of the invention are as follows. The invention provides a DPC-GRNN-based ultra-short-term industrial electrical load prediction algorithm that can provide an important basis and reference for electricity purchasing by large users. Traditional clustering easily falls into local saddle points and depends on the initialization data. The DPC algorithm adopted by the invention converges quickly, is robust, and determines the cluster centers and the number of clusters automatically, without manual specification. It quickly searches for and finds the density peaks of the data points, clusters the original load data more accurately, is well suited to clustering raw data before load prediction for large users, and supports effective analysis of users' electricity consumption behavior. Based on the load data analysis of the DPC algorithm, the invention constructs a GRNN prediction model for each cluster, and the resulting model has higher prediction accuracy. When a GRNN model is constructed, K-fold cross-validation is used for training according to the differing sample counts of the clusters, SPREAD values are tried in a loop, and the GRNN neural network is then built with the optimal value. The higher prediction accuracy better guides users toward rational electricity purchasing. The invention uses preprocessed data sets both when constructing the model and when predicting with it, which preserves the periodicity of the load sequence.
The invention analyzes and corrects abnormal values in the data both when the model is constructed and when the model is used for prediction, which further improves the prediction accuracy.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a decision diagram of the present embodiment;
FIG. 3 is a diagram illustrating descending order of the decision graph according to the present embodiment;
FIG. 4 is a schematic diagram of the cluster fluctuation in this embodiment.
Detailed Description
The invention is described in further detail below with reference to the drawings and specific examples, which are provided for clear understanding and do not limit the invention.
As shown in fig. 1, the present invention provides a DPC-GRNN-based ultra-short-term industrial electrical load prediction method, which includes the following steps:
S1, preprocessing the industrial electrical load historical data of the site to be predicted to form an industrial electrical load historical data set;
S2, correcting abnormal values in the industrial electrical load historical data set;
S3, normalizing the corrected industrial electrical load historical data set of the site to be predicted;
S4, performing DPC cluster analysis on the normalized industrial electrical load historical data set of the site to be predicted to obtain the corresponding clusters;
S5, constructing a GRNN model for each cluster, and training each GRNN model with the industrial electrical load historical data set of its cluster as the training set;
S6, calculating the SPREAD value corresponding to each GRNN model based on the current real-time industrial electrical load data set of the site to be predicted, and selecting the GRNN model corresponding to the optimal SPREAD value as the prediction model;
S7, inputting the current real-time industrial electrical load data set of the site to be predicted into the prediction model, and performing inverse normalization on the output of the prediction model to obtain the industrial electrical load prediction data set of the site to be predicted.
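For reference, the output of a generalized regression neural network in its standard textbook form can be written as below. This is a sketch of the general technique rather than a formula taken from the patent text. Identifying the smoothing parameter $\sigma$ with the SPREAD value of steps S5 and S6 is an assumption, and some toolboxes scale their spread parameter differently.

```latex
\hat{y}(x) =
\frac{\displaystyle\sum_{i=1}^{n} y_i \exp\!\left(-\frac{\lVert x - x_i \rVert^{2}}{2\sigma^{2}}\right)}
     {\displaystyle\sum_{i=1}^{n} \exp\!\left(-\frac{\lVert x - x_i \rVert^{2}}{2\sigma^{2}}\right)}
```

Here $(x_i, y_i)$, $i = 1, \dots, n$, are the training samples of one cluster and $x$ is the current input. A small $\sigma$ reproduces nearby training targets closely, while a large $\sigma$ smooths the output toward the mean of the targets, which is why the value has to be tuned for each cluster.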
The ultra-short-term industrial power load prediction method meets the ultra-short-term load prediction requirements of GB/T31464-:
a) Predict the load for the next 5 min, 10 min, or 15 min after the current moment.
b) On the basis of real-time power utilization load, the ultra-short-term load prediction is completed by combining date types such as working days and rest days and the characteristics of historical load.
The load data used in this embodiment was collected through a gateway meter at a cement company in a certain city. The time span is from May 1, 2018 to December 31, 2018, 245 days in total, with data collected every 15 minutes, i.e., 96 points per day.
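The size of this data set follows directly from the stated dates and sampling interval, as the short check below shows (the variable names are illustrative only):

```python
from datetime import date

start, end = date(2018, 5, 1), date(2018, 12, 31)
n_days = (end - start).days + 1          # both end dates included
points_per_day = 24 * 60 // 15           # one sample every 15 minutes
total_points = n_days * points_per_day   # raw samples before any correction

print(n_days, points_per_day, total_points)
```

Each day can therefore be treated as one 96-dimensional load vector.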
In short-term load prediction, the power load has a certain periodicity but is also affected by other external factors, so a reasonable input set must be determined before the prediction model is built.
In order to determine a reasonable input set X for the subsequent prediction model, in step S1, an industrial electrical load historical data set is obtained by using the following calculation:
where $s$ is the total number of peaks and troughs of the load sequence in the industrial electrical load historical data; $t_i$ is the time sequence corresponding to each peak and trough; $i$ indexes the data points in the industrial electrical load historical data set, $i = 1, 2, 3, \dots, N$, with $N$ the number of data points in the set; and $1 \le i \le s$.
Abnormal data is inevitable in a real data set. In load prediction work, abnormal data can be divided into two types: missing data and erroneous data. In step S2, the following methods are used to correct the abnormal data.
Missing industrial electrical load data is generally caused by lost operating data records. If the span of the missing data is not large, the data can be repaired and filled in by curve fitting, where the fitted curve is assumed to take the form:
$$f(y) = a_1 g_1(y) + a_2 g_2(y) + \dots + a_n g_n(y)$$
where $f$ is the abstract curve-fitting function, $y$ is the industrial electrical load historical data set to be corrected, $a_1, \dots, a_n$ are the coefficients to be solved, and $g$ denotes the functions used to solve for the missing data; let $g_1(y) = 1$, $g_2(y) = y$, and so on. Substituting the load data at the moments before and after the missing data, the coefficients $a_1, \dots, a_n$ can be solved using the least-squares principle and the extremum concept, which yields the fitted curve function of the industrial electrical load data. In this embodiment, the correction value for the missing industrial electrical load data is determined from this function.
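A minimal sketch of least-squares curve fitting for filling missing load values. It assumes a polynomial basis (1, t, t^2, ...) in the sample time t, which is one reading of the basis functions described above; the function name `fill_missing_by_fit` and the use of NaN to mark missing samples are illustrative assumptions.

```python
import numpy as np

def fill_missing_by_fit(t, load, degree=2):
    """Fit a polynomial to the known samples and fill NaN entries from it."""
    t = np.asarray(t, dtype=float)
    load = np.asarray(load, dtype=float)
    known = ~np.isnan(load)
    # Design matrix whose columns are the basis functions 1, t, t^2, ...
    A = np.vander(t, degree + 1, increasing=True)
    # Least-squares solution for the coefficients using the known samples only.
    coef, *_ = np.linalg.lstsq(A[known], load[known], rcond=None)
    filled = load.copy()
    filled[~known] = A[~known] @ coef
    return filled, coef

# Toy example: a linear load trend with one missing sample at t = 3.
t = np.arange(7)
load = 2.0 + 3.0 * t
load[3] = np.nan
filled, coef = fill_missing_by_fit(t, load, degree=1)
```

For real data, the fit would typically use only a short window around the gap, since the text restricts this repair to gaps with a small span.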
Data errors are very common in power load prediction and generally appear as a sudden change in the load at a certain time. Such sudden changes introduce unnecessary noise when the model subsequently learns the load variation pattern, so the abnormal data must be checked and processed. Because manual inspection and correction of such data is very costly, the data can be corrected by the following two methods:
1) transverse correction of abnormal data
Load data is continuous in the time dimension, and the load data of two adjacent time periods generally do not differ greatly, so abnormal data can be found automatically by comparing whether the data of the preceding and following days are in the same dimension. Most load data lies in the region near the fitted line, and only some load data points deviate excessively from it, so the region can be delimited directly according to the following equation, with the calculation coefficients set according to actual operating requirements. Here, d denotes the current day, d-1 the previous day, t the current time, t-1 the moment before time t, and t+1 the moment after time t; the remaining symbol denotes an abnormal point in the industrial load data set.
At the same time, the abnormal data points can be located directly, and the abnormal data is then corrected according to the following formula.
2) Longitudinal correction of anomalous data
If fine-grained load data shows an abnormal sudden change, it can be corrected by the longitudinal correction method using the loads at the same time point on the preceding and following days. The longitudinal correction formula is as follows:
where $\alpha_2$ and $\beta_2$ are calculation coefficients with $\alpha_2 + \beta_2 = 1$; the other quantities are the corrected electrical load at the time in question, the electrical loads at the same time on the two days before and after the abnormal data, and the average electrical load of the two days before and after the abnormal data.
Because the load data points used in this embodiment are sampled every 15 min, this embodiment uses the longitudinal correction method for abnormal data. The abnormal values are processed by this data preprocessing method; the number of abnormal days is 26.
In actual prediction, the input set of the model usually contains several quantities with different dimensions. To eliminate the influence of these different dimensions on the prediction result, the data is normalized in advance in step S3, which improves the accuracy and efficiency of the model. Typically, the data is normalized to the interval [0, 1]. The normalization formula is as follows:
$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x$ is the industrial load historical data to be normalized, $x^{*}$ is the normalized industrial load historical data, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values in the data, respectively.
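Normalization to [0, 1] and the inverse normalization applied to the model output in step S7 can be sketched as a matched pair of functions. The min-max form is assumed here from the description of the minimum and maximum values, and the function names are illustrative.

```python
import numpy as np

def minmax_normalize(x):
    """Scale data to [0, 1]; also return the bounds needed to undo the scaling."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min), x_min, x_max

def minmax_denormalize(x_norm, x_min, x_max):
    """Map normalized values back to the original load scale."""
    return np.asarray(x_norm, dtype=float) * (x_max - x_min) + x_min

loads = np.array([20.0, 35.0, 50.0])
normalized, lo, hi = minmax_normalize(loads)
restored = minmax_denormalize(normalized, lo, hi)
```

The bounds computed from the historical data must be stored and reused for the real-time input and for the prediction output, otherwise the predicted load would be on a different scale from the training data.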
In step S4, the normalized data covers 219 days at 96 points per day. The 96-dimensional load feature vectors are smoothed and then clustered by DPC, with the following specific steps:
The normalized industrial electrical load historical data is smoothed to obtain the industrial electrical load historical data set of the site to be predicted, $X = \{x_i\}_{i=1}^{N}$, where $x_i$ is any data point in the set, and the corresponding index set $I_X = \{1, 2, \dots, N\}$, namely the load characteristic index, is constructed, with $i = 1, 2, 3, \dots, N$ and $N$ the number of data points in the set. $d_{ij}$ denotes the Euclidean distance between data points $x_i$ and $x_j$, i.e., the distance between two points in the industrial electrical load data set. For any data point $x_i$ in the data set $X$, two important parameters are defined: the local density and the density distance.
The local density is usually calculated with either a cut-off kernel or a Gaussian kernel; the cut-off kernel yields discrete values, whereas the Gaussian kernel yields continuous values. Because the original data in this embodiment are continuous, the Gaussian kernel function is used to calculate the local density.
The local density $\rho_i$ of each data point in the industrial electrical load historical data set is calculated from the distances between every two data points:
$$\rho_i = \sum_{j \neq i} \exp\left[-\left(\frac{d_{ij}}{d_c}\right)^2\right]$$
where $d_{ij}$ is the Euclidean distance between data points $x_i$ and $x_j$, and $d_c$ is the truncation distance. $\rho_i$ reflects the number of points in the industrial electrical load data set $X$ whose distance to data point $x_i$ is less than $d_c$. For large industrial electrical load data sets, the density peak clustering algorithm is robust to the choice of $d_c$.
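As a minimal sketch of the Gaussian-kernel local density, the snippet below assumes the form commonly used in density peak clustering, $\rho_i = \sum_{j \neq i} \exp[-(d_{ij}/d_c)^2]$. The function name `gaussian_local_density` and the three sample points are hypothetical.

```python
import numpy as np

def gaussian_local_density(dist, d_c):
    """rho_i = sum over j != i of exp(-(d_ij / d_c)^2) for a square distance matrix."""
    dist = np.asarray(dist, dtype=float)
    kernel = np.exp(-(dist / d_c) ** 2)
    np.fill_diagonal(kernel, 0.0)  # exclude the j == i term
    return kernel.sum(axis=1)

# Three one-dimensional points at 0, 1 and 5 (illustrative only).
points = np.array([[0.0], [1.0], [5.0]])
dist = np.abs(points - points.T)  # pairwise distances on the line
rho = gaussian_local_density(dist, d_c=1.0)
print(rho)
```

The point at 1 has a close neighbour on one side and a moderately distant one on the other, so it receives the highest density; the isolated point at 5 receives the lowest.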
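Because the Gaussian kernel function takes continuous values, different data points are unlikely to share the same local density value. Let $\{q_i\}_{i=1}^{N}$ denote the indices of the local density set $\{\rho_i\}_{i=1}^{N}$ arranged in descending order, so that:
$$\rho_{q_1} \ge \rho_{q_2} \ge \dots \ge \rho_{q_N}$$
The density distance $\delta_i$ of a data point is then the minimum Euclidean distance between that point and the points that precede it in this ordering, i.e. the points of higher local density.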
Plotting every data point by its pair $(\rho_i, \delta_i)$ in a two-dimensional graph yields the decision diagram. A cluster center is selected as a data point whose $\rho$ value and $\delta$ value are both large. After the cluster centers are determined, each remaining data point is assigned to the cluster of its nearest data point of higher density.
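The density distance and the assignment rule can be sketched as follows. The helper names `density_distance` and `assign_clusters`, the sample densities, and the convention of giving the densest point the largest delta are assumptions taken from the general density peak clustering literature rather than from the patent text.

```python
import numpy as np

def density_distance(dist, rho):
    """Return delta_i, the nearest denser neighbour of each point, and the density order."""
    n = len(rho)
    order = np.argsort(-rho, kind="stable")  # indices by descending local density
    delta = np.zeros(n)
    nearest_denser = np.full(n, -1)
    for rank in range(1, n):
        i = order[rank]
        denser = order[:rank]                # all points with higher density
        j = denser[np.argmin(dist[i, denser])]
        delta[i] = dist[i, j]
        nearest_denser[i] = j
    # Assumed convention: the densest point receives the largest delta.
    delta[order[0]] = delta.max() if n > 1 else 0.0
    return delta, nearest_denser, order

def assign_clusters(centers, nearest_denser, order):
    """Label centers first, then pass labels down in descending-density order."""
    if order[0] not in centers:
        raise ValueError("this sketch assumes the densest point is a cluster center")
    labels = np.full(len(order), -1)
    for label, idx in enumerate(centers):
        labels[idx] = label
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_denser[i]]
    return labels

# Five points on a line with hypothetical local densities.
points = np.array([0.0, 1.0, 2.0, 10.0, 11.0])
dist = np.abs(points[:, None] - points[None, :])
rho = np.array([1.0, 3.0, 2.0, 0.5, 2.5])
delta, nearest_denser, order = density_distance(dist, rho)
labels = assign_clusters([1, 4], nearest_denser, order)
print(delta, labels)
```

Processing points in descending density order guarantees that the nearest denser neighbour of a point already carries a label when the point itself is reached.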
Specifically, the Euclidean distance between every two data points in the industrial electrical load historical data set is calculated by the following formula:
$$d_{ij} = \sqrt{\sum_{k} \left(x_{ik} - x_{jk}\right)^2}$$
where $x_{ik}$ and $x_{jk}$ are the $k$-th dimension elements of the data points $x_i$ and $x_j$ in the industrial electrical load historical data set.
The truncation distance $d_c$ is selected as follows:
The Euclidean distance values between every two data points in the industrial electrical load historical data set are arranged in ascending order, $d_1 \le d_2 \le \cdots$, and the truncation distance is taken as $d_c = d_n$, where the subscript $n = [0.02N]$ and $[\,\cdot\,]$ is the rounding function.
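The distance computation and the choice of the truncation distance can be sketched as below. The function names are hypothetical, $N$ is read as the number of data points as the text states, and Python's built-in `round` is assumed as the rounding function; clamping the position to at least 1 is an added safeguard for very small data sets.

```python
import numpy as np

def pairwise_euclidean(X):
    """d_ij = sqrt(sum_k (x_ik - x_jk)^2) for every pair of rows in X."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def truncation_distance(dist, fraction=0.02):
    """Pick d_c = d_n from the ascending pairwise distances, with n = [fraction * N]."""
    n_points = dist.shape[0]
    pair_d = np.sort(dist[np.triu_indices(n_points, k=1)])  # each pair counted once
    n = int(round(fraction * n_points))
    n = min(max(n, 1), len(pair_d))  # assumption: keep n a valid 1-based position
    return pair_d[n - 1]

# 100 equally spaced one-dimensional points (illustrative only).
X = np.arange(100, dtype=float).reshape(-1, 1)
dist = pairwise_euclidean(X)
d_c = truncation_distance(dist)
print(d_c)
```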
For cases in which the cluster centers are difficult to identify by eye in the decision diagram, an index $\gamma_i$ that jointly considers the local density and the density distance is defined:
$$\gamma_i = \rho_i \delta_i$$
The larger $\gamma_i$ is, the more likely the data point is a cluster center. The index set $\{\gamma_i\}_{i=1}^{N}$ is therefore arranged in descending order and plotted in a two-dimensional coordinate graph with $\gamma_i$ on the vertical axis and the data point subscript $i$ in the industrial electrical load historical data set on the horizontal axis. The $\gamma_i$ values of the non-center points vary smoothly, whereas there is a jump, discernible to the naked eye, between the $\gamma_i$ values of the cluster centers and those of the non-center points.
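The ranking by $\gamma_i$ can be illustrated with a short sketch. It assumes $\gamma_i$ is the plain product of local density and density distance, and it replaces the visual inspection described in the text with a simple heuristic (the largest drop between consecutive sorted values); that heuristic and the function names are assumptions, not part of the patent.

```python
import numpy as np

def rank_by_gamma(rho, delta):
    """Compute gamma_i = rho_i * delta_i and the indices in descending gamma order."""
    gamma = np.asarray(rho, dtype=float) * np.asarray(delta, dtype=float)
    order = np.argsort(-gamma, kind="stable")
    return gamma, order

def centers_by_largest_jump(gamma, order):
    """Assumed heuristic: cut the sorted gamma curve at its largest drop."""
    sorted_gamma = gamma[order]
    drops = sorted_gamma[:-1] - sorted_gamma[1:]
    n_centers = int(np.argmax(drops)) + 1
    return order[:n_centers]

# Hypothetical density and distance values for five points.
rho = np.array([1.0, 3.0, 2.0, 0.5, 2.5])
delta = np.array([1.0, 10.0, 1.0, 1.0, 10.0])
gamma, order = rank_by_gamma(rho, delta)
centers = centers_by_largest_jump(gamma, order)
print(gamma[order], centers)
```

In practice the sorted curve would still be plotted and checked by eye, since a single largest drop can be misleading when clusters differ strongly in size.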
The decision diagram obtained in this embodiment is shown in Fig. 2; there are 4 points with larger $\rho$ and $\delta$ values. After the index $\gamma_i$ is calculated, arranged in descending order and plotted, Fig. 3 is obtained, which shows an obvious jump at $\gamma \approx 0.13$ between these 4 points and the other points. The number of cluster centers is therefore 4.
The daily load fluctuation reflected by each cluster in the present embodiment is shown in Fig. 4. Since the load data have been normalized to the interval [0, 1], the vertical axis scale is [0, 1].
The four fluctuation patterns in Fig. 4 essentially cover the load fluctuation of the cement industry under various production conditions. Cluster 1 reflects the load under reduced production or even production stoppage; clusters 2 and 4 reflect the load characteristic of reducing electricity cost by peak avoidance under normal production; and cluster 3 reflects the load characteristic of an enterprise during round-the-clock full-load production.
In step S5, the present embodiment is oriented toward load prediction from small samples. GRNN is an improved radial basis function network with stronger nonlinear mapping capability, better fault tolerance and higher robustness, and it retains high prediction accuracy when few samples are available. The GRNN model used in this embodiment therefore has four layers: an input layer, a pattern layer, a summation layer and an output layer. The input layer and the output layer each have 96 neurons.
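A GRNN forward pass can be sketched with NumPy as follows. This is an illustrative implementation under stated assumptions: `spread` is used directly as the Gaussian width of the pattern layer (a toolbox implementation may scale SPREAD differently), and the function name `grnn_predict` and the toy data are hypothetical.

```python
import numpy as np

def grnn_predict(train_x, train_y, query_x, spread):
    """Predict with a generalized regression neural network (kernel-weighted average)."""
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y, dtype=float)
    query_x = np.atleast_2d(np.asarray(query_x, dtype=float))
    # Pattern layer: one Gaussian unit per training sample.
    sq_dist = ((query_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=-1)
    weights = np.exp(-sq_dist / (2.0 * spread ** 2))
    # Summation layer: weighted sum of targets and plain sum of weights.
    numerator = weights @ train_y
    denominator = weights.sum(axis=1, keepdims=True)
    # Output layer: ratio of the two sums (guarded against division by zero).
    return numerator / np.maximum(denominator, 1e-12)

# Toy data: two training samples with one input and one output dimension.
train_x = np.array([[0.0], [1.0]])
train_y = np.array([[0.0], [2.0]])
midpoint = grnn_predict(train_x, train_y, [[0.5]], spread=0.5)
at_sample = grnn_predict(train_x, train_y, [[0.0]], spread=0.1)
print(midpoint, at_sample)
```

Because the network has no iterative weight training, the only quantity to tune is the spread, which is why a small sample set is sufficient.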
Because the number of samples of the partial classification clusters is small, the invention adopts a cross validation method to train the neural network, and specifically comprises the following steps:
forming a sample set corresponding to each class cluster based on the data point corresponding to each class cluster;
performing K-fold cross-validation on the samples of each class cluster respectively, i.e. dividing the sample set of a given class cluster into K sub-sample sets, wherein K is a positive integer;
taking one sub-sample set as a test set in turn, taking the rest K-1 sub-samples as a training set, training the GRNN model of the cluster, and repeating for K times;
in the K training processes of the GRNN model of any one class cluster, circularly selecting the SPREAD value of the GRNN model during each training;
and selecting a GRNN model generated by the training set and the SPREAD value corresponding to the minimum mean square error as the GRNN model of the cluster.
The SPREAD value is an important tuning parameter of the generalized regression neural network, and whether it is chosen reasonably directly affects the accuracy of the prediction result. A larger SPREAD value ensures that the neurons respond to a wider region of the input space, but if the SPREAD value is too large, numerical calculation becomes more difficult and the network's approximation of the data samples becomes overly smooth, which increases the error. Therefore, to fit the data more closely, the optimal SPREAD value is selected by looping over candidate SPREAD values.
Taking the representative cluster 2 as an example, which contains 48 samples in total, 4-fold cross-validation is performed with the SPREAD value ranging over [0.1, 2] in steps of 0.1, and the mean square error (MSE) is used as the evaluation index of the output; the results are shown in Table 1.
As shown in Table 1, the MSE is smallest in the 4th cross-validation run, at a SPREAD value of 1.5. Therefore, for cluster 2, the GRNN model constructed with the training set of the 4th run and a SPREAD value of 1.5 has the best prediction effect. When the GRNN prediction models are constructed for the other 3 clusters, the optimal training set and the optimal SPREAD value are selected in the same way.
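The search over folds and SPREAD values can be sketched as follows. The data are synthetic, the compact `grnn_predict` helper uses `spread` directly as the Gaussian width, and the names `select_spread`, `seed` and the returned dictionary keys are hypothetical; only the 4 folds, the 48 samples and the SPREAD grid from 0.1 to 2 in steps of 0.1 follow the text.

```python
import numpy as np

def grnn_predict(train_x, train_y, query_x, spread):
    """Compact GRNN: Gaussian-weighted average of the training targets."""
    sq_dist = ((query_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-sq_dist / (2.0 * spread ** 2))
    return (w @ train_y) / np.maximum(w.sum(axis=1, keepdims=True), 1e-12)

def select_spread(x, y, k=4, spreads=None, seed=0):
    """Return the fold and SPREAD value with the smallest test MSE."""
    if spreads is None:
        spreads = np.round(np.arange(0.1, 2.0 + 1e-9, 0.1), 1)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), k)
    best = None
    for fold_id, test_idx in enumerate(folds):
        # The remaining K-1 folds form the training set.
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != fold_id])
        for spread in spreads:
            pred = grnn_predict(x[train_idx], y[train_idx], x[test_idx], spread)
            mse = float(np.mean((pred - y[test_idx]) ** 2))
            if best is None or mse < best["mse"]:
                best = {"mse": mse, "spread": float(spread),
                        "fold": fold_id, "train_idx": train_idx}
    return best

# Synthetic stand-in for one cluster: 48 samples, 4 features for brevity.
rng = np.random.default_rng(42)
x = rng.random((48, 4))
y = np.sin(x.sum(axis=1, keepdims=True))
best = select_spread(x, y, k=4)
print(best["fold"], best["spread"], best["mse"])
```

Keeping the training indices of the winning fold mirrors the text's choice of retaining both the training set and the SPREAD value that gave the minimum MSE.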
In step S6, the same method steps as in steps S1 to S3 are first used to perform preprocessing, abnormal value correction and normalization on the current real-time industrial electrical load data of the site to be predicted, so as to obtain the current real-time industrial electrical load data set of the site to be predicted.
Then, using the same method steps as in step S5, the SPREAD value corresponding to each GRNN model is calculated by cross-validation based on the current real-time industrial electrical load data set of the site to be predicted, and the GRNN model corresponding to the optimal SPREAD value is selected as the prediction model.
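In step S7, after the prediction model outputs the prediction result, inverse normalization must be performed:
$$\hat{y} = y'\,(x_{\max} - x_{\min}) + x_{\min}$$
where $y'$ is the normalized load prediction value, $\hat{y}$ is the actual electrical load prediction value obtained after inverse normalization, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values used for the normalization in step S3.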
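Inverse normalization simply reverses the min-max scaling. The sketch below assumes the minimum and maximum from the normalization step were stored; the function name `inverse_min_max` and the numeric values are illustrative only.

```python
import numpy as np

def inverse_min_max(normalized, x_min, x_max):
    """Map values from [0, 1] back to the original load scale."""
    normalized = np.asarray(normalized, dtype=float)
    return normalized * (x_max - x_min) + x_min

# Hypothetical normalized predictions and stored scaling bounds.
predicted = inverse_min_max([0.0, 0.25, 1.0], x_min=80.0, x_max=160.0)
print(predicted)
```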
In the present embodiment, Mean Square Error (MSE) and Mean Square Error (MSE) are used to evaluate the prediction accuracy, and the prediction effect evaluation is shown in table 2.
Therefore, the prediction precision of the method meets the requirement of practical application.
Those not described in detail in this specification are within the skill of the art.
Claims (10)
1. A DPC-GRNN-based ultra-short-term industrial electrical load prediction method is characterized by comprising the following steps:
preprocessing industrial electric load historical data of a place to be predicted to form an industrial electric load historical data set,
correcting abnormal values in the industrial electrical load historical data set;
carrying out normalization processing on the corrected industrial power load historical data set of the site to be predicted;
DPC clustering analysis is carried out on the normalized industrial power load historical data set of the site to be predicted, and a corresponding cluster is obtained;
respectively constructing a GRNN model for each class cluster, and training the corresponding GRNN model by using the industrial electrical load historical data set corresponding to each class cluster as a training set;
calculating the SPREAD value corresponding to each GRNN model based on the current real-time industrial electrical load data set of the place to be predicted, and selecting the GRNN model corresponding to the optimal SPREAD value as a prediction model;
inputting the current real-time industrial electrical load data set of the site to be predicted into the prediction model, and performing inverse normalization processing on the output of the prediction model to obtain the industrial electrical load prediction data set of the site to be predicted.
2. The DPC-GRNN-based ultra-short-term industrial electrical load prediction method of claim 1, wherein: the industrial electrical load historical data of the site to be predicted is a time sequence of the industrial electrical load collected according to a set time period; the current real-time industrial electrical load data of the site to be predicted is a time sequence containing the current industrial electrical load data.
3. The DPC-GRNN-based ultra-short-term industrial electrical load prediction method of claim 1, wherein: preprocessing current real-time industrial electrical load data of a place to be predicted to form an industrial electrical load data set, and correcting abnormal values in the industrial electrical load data set; and inputting the corrected industrial electrical load data set into a prediction model.
4. The DPC-GRNN-based ultra-short-term industrial electrical load prediction method of claim 1, wherein: and calculating to obtain an industrial electrical load historical data set by adopting the following formula:
wherein $s$ represents the total number of peaks and troughs in the load sequence of the industrial electrical load history, and $t_i$ represents the time points corresponding to the peaks and troughs respectively, wherein $i$ indexes the data points in the industrial electrical load historical data set, $i = 1, 2, 3, \dots, N$, and $N$ represents the number of data points in the industrial electrical load historical data set; wherein $1 \le i \le s$.
5. The DPC-GRNN-based ultra-short-term industrial electrical load prediction method of claim 1, wherein: the process for correcting the abnormal value of the industrial electrical load historical data set of the site to be predicted comprises the following steps:
repairing and filling the industrial electrical load historical data set of the site to be predicted in a curve fitting mode;
automatically finding abnormal data and performing transverse correction by comparing whether the historical data of the industrial electric load in the two time periods before and after are in the same dimension;
and correcting abnormal mutation of the historical data of the industrial electric load with fine granularity by a longitudinal correction method based on the same time point data of the previous time period and the next time period.
6. The DPC-GRNN-based ultra-short-term industrial electrical load prediction method of claim 1, wherein:
DPC cluster analysis is carried out on the load characteristic vector, and the process of obtaining the corresponding cluster comprises the following steps: smoothing the normalized industrial electrical load historical data to obtain the normalized industrial electrical load historical data;
calculating the local density of each data point in the industrial electrical load historical data set based on the Euclidean distance between every two data points;
performing descending order arrangement on the obtained local density of each data point, and forming a sequence number set based on the sequence number distribution of the local density in the sequence;
calculating the minimum value of the Euclidean distance between each data point in the sequence number set and other data points as the density distance of the data points in the industrial electrical load historical data set corresponding to the sequence number;
plotting a decision graph based on the local density and density distance of each data point in the historical data set of the industrial electrical load;
selecting, as cluster centers, the data points that are located at the upper right of the decision graph and are clearly separated from the other data points;
the remaining data points are assigned to the cluster of classes in which the closest and locally higher density data points are located.
7. The DPC-GRNN-based ultra-short-term industrial electrical load prediction method of claim 6, wherein: the process of training the corresponding GRNN model by using the industrial electrical load historical data corresponding to each class cluster as a training set comprises the following steps:
forming a sample set corresponding to each class cluster based on the data point corresponding to each class cluster;
performing K-fold cross-validation on the samples of each class cluster respectively, i.e. dividing the sample set of a given class cluster into K sub-sample sets, wherein K is a positive integer;
taking one sub-sample set as a test set in turn, taking the rest K-1 sub-samples as a training set, training the GRNN model of the cluster, and repeating for K times;
in the K training processes of the GRNN model of any one class cluster, circularly selecting the SPREAD value of the GRNN model during each training;
and selecting a GRNN model generated by the training set and the SPREAD value corresponding to the minimum mean square error as the GRNN model of the cluster.
8. The DPC-GRNN-based ultra-short-term industrial electrical load prediction method of claim 6, wherein:
defining the industrial electrical load historical data set of the site to be predicted as $X = \{x_1, x_2, \dots, x_N\}$, where $x_i$ represents any data point in the industrial electrical load historical data set, and constructing a corresponding index set $\{1, 2, \dots, N\}$; $i = 1, 2, 3, \dots, N$, where $N$ represents the number of data points in the industrial electrical load historical data set;
calculating the local density $\rho_i$ of each data point in the industrial electrical load historical data set based on the distance between every two data points:
$$\rho_i = \sum_{j \neq i} \exp\left[-\left(\frac{d_{ij}}{d_c}\right)^2\right]$$
where $d_{ij}$ represents the Euclidean distance between data points $x_i$ and $x_j$, $d_c$ denotes the truncation distance, and $\rho_i$ reflects the number of points in the industrial electrical load data set $X$ whose distance to data point $x_i$ is less than $d_c$;
letting $\{q_i\}_{i=1}^{N}$ denote the indices of the local density set $\{\rho_i\}_{i=1}^{N}$ arranged in descending order, satisfying:
$$\rho_{q_1} \ge \rho_{q_2} \ge \dots \ge \rho_{q_N}$$
9. The DPC-GRNN-based ultra-short-term industrial electrical load prediction method of claim 8, wherein: the Euclidean distance between every two data points in the industrial electrical load historical data set is calculated by the following formula:
$$d_{ij} = \sqrt{\sum_{k} \left(x_{ik} - x_{jk}\right)^2}$$
where $x_{ik}$ and $x_{jk}$ are the $k$-th dimension elements of the data points $x_i$ and $x_j$ in the industrial electrical load historical data set;
the truncation distance $d_c$ is selected as follows:
10. The DPC-GRNN-based ultra-short-term industrial electrical load prediction method of claim 8, wherein: for cases in which the cluster centers are difficult to identify by eye in the decision diagram, an index $\gamma_i$ that jointly considers the local density and the density distance is defined:
$$\gamma_i = \rho_i \delta_i$$
the index set $\{\gamma_i\}_{i=1}^{N}$ is arranged in descending order and plotted in a two-dimensional coordinate graph with $\gamma_i$ on the vertical axis and the data point subscript $i$ in the industrial electrical load historical data set on the horizontal axis; the $\gamma_i$ values of the non-center points vary smoothly, and there is a jump between the $\gamma_i$ values of the cluster centers and those of the non-center points.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210795337.0A CN114881372A (en) | 2022-07-07 | 2022-07-07 | DPC-GRNN-based ultra-short-term industrial electrical load prediction method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114881372A (en) | 2022-08-09 |
Family
ID=82683387
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210795337.0A Pending CN114881372A (en) | 2022-07-07 | 2022-07-07 | DPC-GRNN-based ultra-short-term industrial electrical load prediction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114881372A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111598151A (en) * | 2020-05-12 | 2020-08-28 | 辽宁工程技术大学 | Method for predicting user electricity load |
CN114580968A (en) * | 2022-03-29 | 2022-06-03 | 广东电网有限责任公司 | Power utilization management method, device, equipment and storage medium |
Non-Patent Citations (1)
Title |
---|
李钢等: "《基于改进密度峰值聚类的超短期工业负荷预测》", 《电测与仪表》 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20220809 |