CN114202012A - High-dimensional load clustering method based on recursive graph and convolution self-encoder - Google Patents
- Publication number
- CN114202012A (application CN202111366207.7A)
- Authority
- CN
- China
- Prior art keywords
- load
- encoder
- dimensional
- clustering
- convolution self
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2323—Non-hierarchical techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention relates to a high-dimensional load clustering method based on a recurrence plot (RP; "recursive graph" in the title) and a convolutional autoencoder (CAE; "convolution self-encoder" in the title), comprising the following steps: S1, obtaining high-dimensional load power features; S2, constructing a high-dimensional load feature enhancement model based on recurrence plot theory; S3, constructing a high-dimensional load feature extraction model based on a convolutional autoencoder; and S4, constructing a high-dimensional load clustering model based on spectral clustering to obtain a high-dimensional load clustering result. The invention converts one-dimensional load features into two-dimensional recurrence plot features to achieve feature enhancement, and combines this with a convolutional autoencoder for feature extraction, so that a better clustering effect is achieved using the extracted features.
Description
Technical Field
The invention belongs to the technical field of power load cluster analysis and relates to a high-dimensional load clustering method, in particular to a high-dimensional load clustering method based on a recurrence plot and a convolutional autoencoder.
Background
Advancing the informatization and digital transformation of the power system is an important task in smart grid construction. Against this background, information technology and smart metering technology have developed rapidly in the power distribution and utilization field in recent years, the functions of smart meters have been continuously upgraded, and the amount of available load data has grown exponentially. In the time dimension, the sampling period of new smart meters has gradually shortened from 1 day to 1 hour, half an hour, and 15 minutes, so the available daily load data has expanded from 1 point per day to 24, 48, and 96 points per day, providing an important data basis for cluster analysis of users' electricity consumption behavior. The operation, distribution, and dispatching of the smart grid and the development of electric power marketing services bring new opportunities for mining the value of power distribution and utilization big data. Power supply enterprises want to comprehensively understand users' energy consumption information and then formulate time-of-use electricity prices and demand response plans, so that users actively participate in grid dispatching, the economy and reliability of grid operation improve, users receive energy-saving guidance, and energy conservation and emission reduction are promoted. To achieve these goals, it is necessary to mine users' energy usage patterns in depth with big data algorithms for power distribution and utilization, perceive users' electricity usage habits, and accurately anticipate future electricity usage.
Cluster analysis of user electricity consumption behavior is the process of using a clustering algorithm to mine users' electricity consumption characteristics from massive load curves and obtain typical electricity consumption patterns; it is the basis for load management. Clustering groups users with similar consumption patterns, which helps power supply enterprises understand users' energy demands and carry out more efficient demand response management. On the basis of analyzing users' consumption patterns, power supply enterprises can also innovate their service models, for example by formulating tiered electricity pricing policies to improve energy efficiency, guiding users to correct unreasonable energy usage habits and reduce electricity costs according to the analysis results, and carrying out abnormal electricity usage detection, power quality optimization, peak shaving and valley filling, and the like, thereby achieving an intelligent, friendly, and interactive mode of power distribution and utilization.
However, as the data acquisition frequency of smart meters increases, the sampling period keeps shortening and the data dimension grows rapidly. Traditional clustering methods generally use the Euclidean distance as the similarity measure, which cannot accurately measure the similarities and differences between users' electricity consumption behaviors on high-dimensional load data. In practice, samples are often misassigned or missed, so these methods perform poorly on high-dimensional load curves. Load clustering methods based on feature dimension reduction first extract the key information from the high-dimensional load data and then cluster that information, which gives them advantages in both clustering quality and computational efficiency. However, existing research mainly focuses on reducing the dimension of the original one-dimensional load data, where feature extraction is difficult and excessive information loss easily occurs. How to effectively extract features from high-dimensional load data and use them to achieve a better clustering effect is therefore a technical problem that those skilled in the art urgently need to solve.
A search found no prior art document identical or similar to the present invention.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art and provides a high-dimensional load clustering method based on a recurrence plot and a convolutional autoencoder. The method converts one-dimensional load features into two-dimensional recurrence plot features to achieve feature enhancement and combines this with a convolutional autoencoder for feature extraction, so that a better clustering effect is achieved using the extracted features.
The invention solves the above technical problem by adopting the following technical scheme:
A high-dimensional load clustering method based on a recurrence plot and a convolutional autoencoder comprises the following steps:
S1, acquiring high-dimensional load power features, wherein the features comprise daily load data of N users, each daily load record comprises M data points, and a load power feature library of size N × M is formed;
S2, constructing a high-dimensional load feature enhancement model based on recurrence plot theory: converting the one-dimensional daily load features in the load power feature library into two-dimensional recurrence plot features, one day at a time, to form N load recurrence plot features;
S3, constructing a high-dimensional load feature extraction model based on a convolutional autoencoder: inputting the N load recurrence plot features obtained in step S2 into the convolutional autoencoder and training it, wherein the feature extraction dimension of the convolutional autoencoder is set to T; after training, inputting the N load recurrence plot features into the encoder of the convolutional autoencoder to obtain the high-dimensional load feature extraction result, namely the key load features;
S4, constructing a high-dimensional load clustering model based on spectral clustering: determining the optimal number of clusters with the gap statistic, and clustering the key load features obtained in step S3 with a spectral clustering algorithm to obtain the high-dimensional load clustering result.
Moreover, the construction and training of the convolutional autoencoder in step S3 both use the TensorFlow and Keras deep learning toolkits in the Python programming language.
Moreover, the convolutional autoencoder in step S3 consists of an encoder and a decoder: the encoder extracts a feature vector from the load recurrence plot features, and the decoder restores the feature vector to the original load recurrence plot features. The convolutional autoencoder is trained as follows: the load recurrence plot features serve as both the input and the target output of the convolutional autoencoder, so that it learns the most important information in the original load recurrence plot and thereby realizes feature extraction of the load recurrence plot.
Moreover, M and N in steps S1 to S3 are both natural numbers, and M is generally greater than or equal to 48.
Moreover, T in step S3 is a natural number, and T is smaller than M.
The invention has the advantages and beneficial effects that:
1. The invention provides a high-dimensional load feature enhancement method based on a recurrence plot, which converts one-dimensional load features into two-dimensional recurrence plot features, makes it easier to mine the stationarity and internal similarity of a load curve, realizes feature enhancement, and lays a good foundation for the subsequent feature extraction step;
2. The invention provides a high-dimensional load feature extraction method based on a convolutional autoencoder, which uses the strong feature extraction capability of the convolutional autoencoder to reduce the dimension of the two-dimensional load image, reduces feature redundancy, extracts key load features from the recurrence plot features, reduces the data volume, improves clustering efficiency, and is better suited to high-dimensional load clustering;
3. The invention provides a high-dimensional load clustering method based on spectral clustering, in which the optimal number of clusters is obtained with the gap statistic and the key load features extracted by the convolutional autoencoder are clustered to obtain the high-dimensional load clustering result.
Drawings
FIG. 1 is a one-dimensional load feature graph provided by an embodiment of the present invention;
FIG. 2 is a two-dimensional load recurrence plot provided by an embodiment of the present invention;
FIG. 3 is a graph of the gap statistic analysis results provided by an embodiment of the present invention;
FIG. 4 is a comparison diagram of dimension reduction algorithms provided by an embodiment of the present invention;
FIG. 5 is a comparison graph of clustering on the original load curves provided by an embodiment of the present invention;
FIG. 6 is a comparison graph of clustering on features extracted by RP-CAE provided by an embodiment of the present invention;
FIG. 7 is a DI index plot for different feature extraction dimensions provided by an embodiment of the present invention;
FIG. 8 is a DBI index plot for different feature extraction dimensions provided by an embodiment of the present invention;
FIG. 9 is a high-dimensional load curve clustering result graph provided by an embodiment of the present invention.
Detailed Description
The embodiments of the invention are described in further detail below with reference to the accompanying drawings:
The high-dimensional load clustering method based on a recurrence plot and a convolutional autoencoder comprises the following steps:
S1, acquiring high-dimensional load power features, wherein the features comprise daily load data of N users, each daily load record comprises M data points, and a load power feature library of size N × M is formed;
S2, constructing a high-dimensional load feature enhancement model based on recurrence plot theory: converting the one-dimensional daily load features in the load power feature library into two-dimensional recurrence plot features, one day at a time, to form N load recurrence plot features;
S3, constructing a high-dimensional load feature extraction model based on a convolutional autoencoder: inputting the N load recurrence plot features obtained in step S2 into the convolutional autoencoder and training it, wherein the feature extraction dimension of the convolutional autoencoder is set to T; after training, inputting the N load recurrence plot features into the encoder of the convolutional autoencoder to obtain the high-dimensional load feature extraction result, namely the key load features;
S4, constructing a high-dimensional load clustering model based on spectral clustering: determining the optimal number of clusters with the gap statistic, and clustering the key load features obtained in step S3 with a spectral clustering algorithm to obtain the high-dimensional load clustering result.
The construction and training of the convolutional autoencoder in step S3 both use the TensorFlow and Keras deep learning toolkits in the Python programming language.
The convolutional autoencoder in step S3 consists of an encoder and a decoder: the encoder extracts a feature vector from the load recurrence plot features, and the decoder restores the feature vector to the original load recurrence plot features. The convolutional autoencoder is trained as follows: the load recurrence plot features serve as both the input and the target output of the convolutional autoencoder, so that it learns the most important information in the original load recurrence plot and thereby realizes feature extraction of the load recurrence plot.
M and N in steps S1 to S3 are both natural numbers, and M is generally greater than or equal to 48.
T in step S3 is a natural number, and T is smaller than M.
The invention is further illustrated by the following specific examples:
As shown in Figs. 1 to 9, this embodiment applies the load clustering method of the invention to a measured Irish smart meter data set published by the Irish energy agency, with the following steps:
S1, acquiring high-dimensional load power features.
The measured Irish smart meter data set used in this embodiment contains the load curves of 6059 residential users from July 21, 2011 to January 6, 2013, with a meter sampling interval of 30 min (i.e., 48 data points per day). First, the typical electricity consumption pattern of each residential user is extracted, and then the load curves of all users are subjected to cluster analysis. Because residential load is highly random and volatile, a single day's load curve cannot accurately reflect a user's electricity usage habits, while a user's habits may change if the selected time range is too long. The mean of each user's daily load data over one month is therefore used as that user's typical electricity consumption pattern; the specific time range is August 1 to August 31, 2012. In addition, because different users consume electricity at different scales, clustering the raw power data directly may split load curves that follow the same consumption pattern but differ in magnitude into multiple classes. Min-max normalization is therefore applied to the original load curves, scaling the power data to [0, 1] so that the influence of magnitude on clustering is ignored. The original Irish smart meter data set contains 6059 residential users in total; after removing 8 invalid users whose power is constantly 0, the load matrix to be clustered has dimension 6051 × 48, forming a load power feature library of size 6051 × 48.
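The preprocessing described above (monthly averaging, removal of all-zero users, per-user min-max scaling) can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions: the function name `typical_profiles` and the input layout `(users, days, points per day)` are hypothetical and do not come from the patent.

```python
import numpy as np


def typical_profiles(daily_loads):
    """Build normalized typical load profiles.

    daily_loads: array of shape (n_users, n_days, m), e.g. one month of
    48-point daily curves per user (layout is an assumption of this sketch).
    Returns (profiles, keep_mask), where profiles has shape (n_valid, m).
    """
    loads = np.asarray(daily_loads, dtype=float)

    # Drop invalid users whose power is constantly 0 over the whole period.
    keep = ~np.all(loads == 0.0, axis=(1, 2))
    loads = loads[keep]

    # Typical pattern = mean of the daily curves over the selected month.
    profiles = loads.mean(axis=1)

    # Per-user min-max normalization to [0, 1]; guard against flat curves.
    lo = profiles.min(axis=1, keepdims=True)
    hi = profiles.max(axis=1, keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (profiles - lo) / span, keep
```

Applied to the data set of this embodiment, the returned matrix would have shape 6051 × 48.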
S2, constructing a high-dimensional load feature enhancement model based on recurrence plot theory. The one-dimensional daily load features in the load power feature library are converted into two-dimensional recurrence plot features, one day at a time, forming 6051 load recurrence plot features. By way of example, a one-dimensional daily load feature and the corresponding two-dimensional recurrence plot feature are shown in Fig. 1 and Fig. 2, respectively.
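The patent does not state which recurrence plot variant is used (embedding dimension, distance, threshold). The following is a minimal sketch assuming an embedding dimension of 1 and the absolute difference as distance, so that entry (i, j) compares the load at time i with the load at time j; passing a threshold `eps` gives the classical binary recurrence plot. The function name and parameters are hypothetical.

```python
import numpy as np


def recurrence_plot(load, eps=None):
    """Convert a 1-D load curve of length M into an M x M recurrence plot.

    Assumptions of this sketch: embedding dimension 1 and absolute
    difference as the distance. With eps=None the raw distance matrix is
    returned; otherwise the thresholded (0/1) plot is returned.
    """
    x = np.asarray(load, dtype=float)
    dist = np.abs(x[:, None] - x[None, :])  # pairwise |x_i - x_j|
    if eps is None:
        return dist
    return (dist <= eps).astype(float)


# A 48-point normalized daily curve becomes a 48 x 48 image.
curve = np.linspace(0.0, 1.0, 48)
image = recurrence_plot(curve)
```

Stacking one such image per user would give an array of shape (N, M, M), which can be fed to a convolutional network after adding a channel axis.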
S3, constructing a high-dimensional load feature extraction model based on a convolutional autoencoder. The load recurrence plot features obtained in step S2 are input into the convolutional autoencoder, which is then trained with the feature extraction dimension set to 15. After training, the load recurrence plot features are input into the encoder of the convolutional autoencoder to obtain the high-dimensional load feature extraction result, namely the key load features.
The network structure of the convolutional autoencoder constructed in this embodiment is shown in Table 1. The encoder has one convolutional layer with 16 convolution kernels of size (3 × 3), followed by a max pooling layer of size (2 × 2). A Flatten layer then converts the two-dimensional features into one-dimensional features, and a Dense layer compresses them to 15 dimensions (the feature extraction dimension is discussed in section 2.4.5) to obtain the encoded feature vector. The decoder first expands the features through a Dense layer so that the feature dimension matches that of the Flatten layer's output vector. A Reshape layer then converts the one-dimensional feature vector back into two-dimensional features whose dimensions match the Flatten layer's input. The two-dimensional features are then upsampled with a window size of (2 × 2) to invert the max pooling layer. Finally, two convolutional layers with 16 and 1 convolution kernels, respectively, and a window size of (3 × 3) are applied: the former inverts the encoder's convolution, and the latter merges the multi-channel two-dimensional features into a single channel to reconstruct the input features. "ReLU" is used as the activation function in both the encoder and the decoder, and the "Sigmoid" function is used in the output part (i.e., the last convolutional layer) to map the results to between 0 and 1. The initial learning rate of the network is set to 0.01, with "Adam" as the optimizer and the mean squared error (MSE) as the loss function.
Table 1. Convolutional autoencoder network architecture
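The layer sequence described in the text can be sketched in Keras as follows. This is an illustrative sketch, not the patent's exact network: the contents of Table 1 are not reproduced in this text, so `padding="same"`, the ReLU activation on the two Dense layers, and the function name `build_cae` are assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_cae(m=48, t=15, learning_rate=0.01):
    """Return (autoencoder, encoder) for m x m single-channel recurrence plots."""
    inputs = keras.Input(shape=(m, m, 1))

    # Encoder: one 3x3 convolution with 16 kernels, then 2x2 max pooling.
    x = layers.Conv2D(16, (3, 3), activation="relu", padding="same")(inputs)
    x = layers.MaxPooling2D((2, 2))(x)
    pooled_shape = tuple(int(d) for d in x.shape[1:])
    x = layers.Flatten()(x)
    code = layers.Dense(t, activation="relu")(x)  # T-dimensional key features

    # Decoder: mirror the encoder back to the m x m image.
    n_flat = pooled_shape[0] * pooled_shape[1] * pooled_shape[2]
    x = layers.Dense(n_flat, activation="relu")(code)
    x = layers.Reshape(pooled_shape)(x)
    x = layers.UpSampling2D((2, 2))(x)
    x = layers.Conv2D(16, (3, 3), activation="relu", padding="same")(x)
    outputs = layers.Conv2D(1, (3, 3), activation="sigmoid", padding="same")(x)

    autoencoder = keras.Model(inputs, outputs)
    encoder = keras.Model(inputs, code)
    autoencoder.compile(
        optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
        loss="mse",
    )
    return autoencoder, encoder
```

Training would use the recurrence plots as both input and target, for example `autoencoder.fit(rp, rp, epochs=..., batch_size=...)` with `rp` of shape (N, 48, 48, 1); the epoch count and batch size are not given in the source. After training, `encoder.predict(rp)` would return the N × 15 key load features.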
S4, constructing a high-dimensional load clustering model based on spectral clustering. The optimal number of clusters is determined with the gap statistic, and the key load features obtained in step S3 are clustered with a spectral clustering algorithm to obtain the high-dimensional load clustering result.
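The patent does not specify the affinity function or the Laplacian variant used for spectral clustering. The sketch below assumes an RBF affinity with a hypothetical bandwidth parameter `gamma`, the symmetric normalized formulation, and a small k-means with farthest-point initialization on the row-normalized spectral embedding.

```python
import numpy as np


def spectral_clustering(features, k, gamma=1.0, n_iter=50):
    """Cluster rows of `features` into k groups; returns integer labels."""
    x = np.asarray(features, dtype=float)

    # RBF affinity from pairwise squared Euclidean distances.
    sq_norm = np.sum(x ** 2, axis=1)
    d2 = np.maximum(sq_norm[:, None] + sq_norm[None, :] - 2.0 * x @ x.T, 0.0)
    w = np.exp(-gamma * d2)
    np.fill_diagonal(w, 0.0)

    # Symmetric normalization D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(w.sum(axis=1) + 1e-12)
    m = w * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # Spectral embedding: eigenvectors of the k largest eigenvalues,
    # with each row rescaled to unit length.
    _, vecs = np.linalg.eigh(m)
    emb = vecs[:, -k:]
    emb = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12)

    # k-means on the embedding with farthest-point initialization.
    centers = [emb[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((emb - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(emb[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = np.sum((emb[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
        labels = np.argmin(dist, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = emb[labels == j].mean(axis=0)
    return labels
```

For the embodiment, the input would be the 6051 × 15 key load features and `k` the number of clusters selected by the gap statistic.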
The gap statistic was used to determine the optimal number of clusters; the results are shown in Fig. 3. The smallest k satisfying $\mathrm{Gap}_n(k) \ge \mathrm{Gap}_n(k+1) - s_{k+1}$ is 10, so this embodiment uses 10 as the number of clusters.
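The selection rule above can be written as a short function. This sketch assumes the gap values and their standard-error terms have already been computed for k = 1, 2, ...; the function name `choose_k` and the numbers in the example are hypothetical and are not the values behind Fig. 3.

```python
def choose_k(gap, s):
    """Return the smallest k with Gap(k) >= Gap(k+1) - s(k+1).

    gap[i] and s[i] hold the values for k = i + 1. If no k satisfies the
    rule, the largest candidate k is returned.
    """
    for i in range(len(gap) - 1):
        if gap[i] >= gap[i + 1] - s[i + 1]:
            return i + 1
    return len(gap)


# Illustrative values only: the rule first holds at k = 3.
example_gap = [0.10, 0.30, 0.50, 0.52, 0.60]
example_s = [0.05, 0.05, 0.05, 0.05, 0.05]
best_k = choose_k(example_gap, example_s)
```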
In this embodiment, the Davies-Bouldin Index (DBI) and the Dunn Index (DI) are used to evaluate the clustering effect, as follows:
(1) DBI
The DBI is calculated as follows:
$$\mathrm{DBI} = \frac{1}{K}\sum_{i=1}^{K}\max_{j \ne i}\frac{\bar{d}_i + \bar{d}_j}{\lVert c_i - c_j \rVert}$$
where $K$ is the number of clusters, $\bar{d}_i$ is the average distance between samples in class $i$, and $c_i$ is the cluster center of class $i$. A smaller DBI indicates smaller intra-class distances and larger inter-class distances, and hence a better clustering effect.
(2) DI
The DI is calculated as follows:
$$\mathrm{DI} = \min_{1 \le i < j \le K}\frac{d_{\min}(C_i, C_j)}{\max_{1 \le l \le K}\operatorname{diam}(C_l)}$$
where $d_{\min}(C_i, C_j)$ is the minimum distance between samples of class $i$ and class $j$, and $\operatorname{diam}(C_l)$ is the maximum distance between samples within class $l$. A larger DI indicates larger inter-class distances and smaller intra-class distances, and hence a better clustering effect.
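Both indices can be computed directly from their verbal definitions. The sketch below assumes Euclidean distance and takes the intra-class term of the DBI as the mean pairwise distance between samples of a class, following the wording above; note that some libraries instead use the mean distance to the cluster center. The function name `dbi_and_di` is hypothetical.

```python
import numpy as np


def _pairwise(a, b):
    """Euclidean distance matrix between rows of a and rows of b."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)


def dbi_and_di(x, labels):
    """Return (DBI, DI) for samples x (n, d) and integer cluster labels."""
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)
    clusters = [x[labels == c] for c in np.unique(labels)]
    centers = [c.mean(axis=0) for c in clusters]
    k = len(clusters)

    avg, diam = [], []
    for c in clusters:
        d = _pairwise(c, c)
        n = len(c)
        # Mean over unordered sample pairs (the diagonal contributes zero).
        avg.append(d.sum() / (n * (n - 1)) if n > 1 else 0.0)
        diam.append(d.max())

    dbi = np.mean([
        max((avg[i] + avg[j]) / np.linalg.norm(centers[i] - centers[j])
            for j in range(k) if j != i)
        for i in range(k)
    ])
    d_min = min(_pairwise(clusters[i], clusters[j]).min()
                for i in range(k) for j in range(i + 1, k))
    di = d_min / max(diam)
    return float(dbi), float(di)
```

On four points forming two pairs 10 units apart, with 1 unit between the points of each pair, this gives DBI = 0.2 and DI = 10.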
The following are detailed experimental results:
1. Comparison of dimension reduction algorithms
To verify the effectiveness of the invention, different dimension reduction algorithms are used to reduce the original 48-dimensional load data to 15 dimensions, followed by spectral clustering, so that the clustering effects of the different dimension reduction algorithms can be compared. For convenience of description, the feature extraction method provided by the invention is denoted RP-CAE (Recurrence Plot and Convolutional Auto-Encoder). The comparison methods and their parameter settings are introduced as follows:
(1) Principal Component Analysis (PCA): PCA is a statistical method that converts a group of correlated variables into linearly uncorrelated variables, the principal components, by means of an orthogonal transformation. Because PCA is a linear dimension reduction algorithm and most real high-dimensional data is not linearly separable, its dimension reduction quality is poor. Kernel principal component analysis (Kernel-PCA) therefore first maps the original data to a high-dimensional feature space and then extracts the principal components, thereby realizing nonlinear dimension reduction and improving the dimension reduction effect. This section sets the kernel function of Kernel-PCA to "RBF".
(2) UAE: the basic principle of the UAE is described in section 2 of this chapter. This section uses a three-layer UAE whose input and output layers both have the original load dimension of 48 and whose feature extraction layer has dimension 15; the remaining parameters are consistent with Table 1.
(3) LSTM autoencoder (LSTM-AE): LSTM-AE is a dimension reduction method that combines the temporal memory capability of the LSTM with the nonlinear feature extraction capability of an autoencoder. In this section its network structure and parameters are the same as those of the UAE, except that the fully connected layers in the network are replaced by LSTM layers.
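As an illustration of the Kernel-PCA baseline in item (1), the following NumPy sketch reduces 48-dimensional curves to 15 dimensions with an RBF kernel. The kernel width is not given in the source; the default `gamma = 1 / n_features` and the function name `kernel_pca_rbf` are assumptions of this sketch.

```python
import numpy as np


def kernel_pca_rbf(x, n_components, gamma=None):
    """Project the rows of x onto the leading RBF kernel principal components."""
    x = np.asarray(x, dtype=float)
    if gamma is None:
        gamma = 1.0 / x.shape[1]  # assumed default, not from the source

    # RBF kernel matrix from pairwise squared distances.
    sq_norm = np.sum(x ** 2, axis=1)
    d2 = np.maximum(sq_norm[:, None] + sq_norm[None, :] - 2.0 * x @ x.T, 0.0)
    k = np.exp(-gamma * d2)

    # Center the kernel matrix in feature space.
    n = k.shape[0]
    one = np.full((n, n), 1.0 / n)
    kc = k - one @ k - k @ one + one @ k @ one

    # Leading eigenpairs; training projections are v * sqrt(lambda).
    vals, vecs = np.linalg.eigh(kc)
    order = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[order], vecs[:, order]
    return vecs * np.sqrt(np.maximum(vals, 0.0))


# Random stand-in data with the embodiment's dimensions (48 -> 15).
rng = np.random.default_rng(0)
reduced = kernel_pca_rbf(rng.random((30, 48)), n_components=15)
```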
Fig. 4 compares the clustering indices obtained when the different dimension reduction algorithms are combined with spectral clustering; a smaller DBI and a larger DI indicate a better clustering effect. As can be seen from Fig. 4, a better clustering effect is obtained after dimension reduction of the load data, which verifies the effectiveness of the dimension reduction strategy. Among the dimension reduction algorithms, the autoencoder-based methods outperform the traditional PCA and Kernel-PCA, which shows that, compared with statistical methods, the feature vectors extracted by neural network models better reflect the characteristics of the original load curves and yield a better dimension reduction effect. Among the autoencoder-based feature extraction methods, the plain UAE performs worst, while the LSTM autoencoder can capture the temporal characteristics of the load curve and therefore performs better.
2. Clustering algorithm comparison
To verify the superiority of spectral clustering, this embodiment compares it with traditional clustering algorithms, including fuzzy C-means clustering, Gaussian Mixture Model (GMM) clustering, and Hierarchical Agglomerative Clustering (HAC). The comparative experiment is divided into two parts: clustering the original 48-dimensional daily load curves, with results shown in Fig. 5, and clustering after dimension reduction with RP-CAE, with results shown in Fig. 6. As can be seen from Figs. 5 and 6, spectral clustering achieves the best effect compared with the traditional clustering algorithms, which verifies the effectiveness of spectral clustering for load clustering. Meanwhile, the four clustering algorithms behave consistently under the two clustering strategies: after RP-CAE dimension reduction of the original load data, the effect of all four clustering algorithms improves markedly, which further verifies the effectiveness of the feature extraction method provided by the invention.
3. Influence of feature extraction dimensionality on clustering effect
To study the influence of the feature extraction dimension on the clustering effect, this embodiment extracts features of different dimensions from the original 48-dimensional load data by modifying the network structure of the convolutional autoencoder. The resulting change of the clustering validity indices with the feature extraction dimension is shown in Figs. 7 and 8. As can be seen from Figs. 7 and 8, as the feature extraction dimension increases, the clustering effect first improves and then deteriorates, and the best effect is achieved at a dimension of 15. The number of clusters selected in this embodiment is 10. When the feature extraction dimension is much larger than the number of clusters, feature redundancy remains and the purpose of data dimension reduction is not achieved, so the clustering effect is poor. When the feature extraction dimension is much smaller than the number of clusters, the feature vectors can hardly represent the load curve characteristics comprehensively and cannot distinguish the differences between users' electricity consumption behaviors, which easily leads to missed classes in clustering. It is therefore presumed that a suitable feature extraction dimension is slightly higher than the number of clusters, so that users' electricity consumption features are reflected as fully as possible while the data dimension is reduced, which enhances the clustering effect.
4. High dimensional load clustering results
In this embodiment, electricity consumption behavior is analyzed using the proposed load clustering method. Each cluster obtained represents one type of electricity consumption behavior, and the result is shown in FIG. 9. In terms of the clustering result, the profile of each cluster is distinct and its shape differs clearly from that of the other clusters, so the clustering effect is good. In terms of load curve characteristics, residents' electricity consumption behavior is diverse, with single-peak, double-peak, and steady consumption patterns. In this embodiment, the mean of the curves in a cluster is defined as the typical electricity consumption pattern, and the division of users' typical patterns is shown in Table 2. Categories 1, 4, 5, 6, 8, and 10 are single-peak loads, whose load curve peaks at a single time point or over a short period: the peaks of categories 1, 4, and 8 are centered at 22:00, 17:30, and 7:00, respectively, while the peak periods of categories 5, 6, and 10 are 18:30-21:00 in the evening, 21:30-00:00 in the evening, and 9:00-13:00 in the morning, respectively. Category 3 is a double-peak load, with two consumption peaks at about 10:00 in the morning and 21:30 in the evening. Categories 2, 7, and 9 are steady loads, whose load curve shows three stages (rising, steady, and falling), with steady periods of 7:30-21:30, 9:30-16:30, and 13:00-21:30, respectively. In addition, a steady period may mean that the user's load curve is absolutely flat, as in category 7, or that the user's electricity consumption remains roughly steady for a period of time, as in categories 2 and 9; the former shows stronger regularity and stability than the latter.
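The definition of a typical pattern as the mean of the curves in a cluster can be illustrated as follows. This is a minimal sketch: the helper names are hypothetical, and mapping point indices to clock times assumes that the 48 points are sampled every 30 minutes starting at 00:00, which this passage does not state explicitly.

```python
import numpy as np

def typical_patterns(curves, labels):
    """Mean curve of each cluster; rows follow the sorted cluster ids."""
    ids = np.unique(labels)
    patterns = np.array([curves[labels == c].mean(axis=0) for c in ids])
    return ids, patterns

def peak_time(pattern, minutes_per_point=30):
    """Clock time (HH:MM) of the largest value, assuming point 0 is 00:00."""
    minutes = int(np.argmax(pattern)) * minutes_per_point
    return f"{minutes // 60:02d}:{minutes % 60:02d}"
```

Applying `peak_time` to each row of `patterns` gives a quick first check of whether a cluster is dominated by a morning or an evening peak; distinguishing double-peak and steady patterns would need additional rules.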
Table 2. User electricity consumption pattern analysis
It should be emphasized that the examples described herein are illustrative and not restrictive, and thus the present invention includes, but is not limited to, those examples described in this detailed description, as well as other embodiments that can be derived from the teachings of the present invention by those skilled in the art and that are within the scope of the present invention.
Claims (5)
1. A high-dimensional load clustering method based on a recursive graph and a convolution self-encoder, characterized in that the method comprises the following steps:
S1, acquiring high-dimensional load power characteristics, wherein the characteristics comprise daily load data of N users, each daily load data comprises M data points, and a load power characteristic library of size N × M is formed;
S2, constructing a high-dimensional load characteristic enhancement model based on recursive graph theory: converting the one-dimensional daily load characteristics in the load power characteristic library into two-dimensional recursive graph characteristics, taking one day as the unit, to form N load recursive graph characteristics;
S3, constructing a high-dimensional load feature extraction model based on a convolution self-encoder: inputting the N load recursive graph characteristics obtained in step S2 into the convolution self-encoder and training it, wherein the feature extraction dimension in the convolution self-encoder is set to T; after training is finished, inputting the N load recursive graph characteristics into the encoder of the convolution self-encoder to obtain the high-dimensional load feature extraction result, namely the load key features;
S4, constructing a high-dimensional load clustering model based on spectral clustering: determining the optimal number of clusters by the interval statistical (gap statistic) method, and clustering the load key features obtained in step S3 with a spectral clustering algorithm to obtain the high-dimensional load clustering result.
2. The high-dimensional load clustering method based on the recursive graph and the convolution self-encoder as claimed in claim 1, wherein: the construction and training of the convolutional self-encoder in step S3 both use the TensorFlow and Keras deep learning toolkits in the Python programming language.
3. The high-dimensional load clustering method based on the recursive graph and the convolution self-encoder as claimed in claim 1, wherein: the convolutional self-encoder in step S3 consists of an encoder and a decoder, the encoder being responsible for compressing the load recursive graph characteristics into a feature vector and the decoder being responsible for restoring the feature vector to the original load recursive graph characteristics; the convolution self-encoder is trained as follows: the load recursive graph characteristics are used as both the input and the target output of the convolution self-encoder, so that the convolution self-encoder learns the most important information in the original load recursive graph, thereby realizing feature extraction from the load recursive graph.
4. The high-dimensional load clustering method based on the recursive graph and the convolution self-encoder as claimed in claim 1, wherein: m and N in the steps S1-S3 are both natural numbers, and M is generally greater than or equal to 48.
5. The high-dimensional load clustering method based on the recursive graph and the convolution self-encoder as claimed in claim 1, wherein: t in the step S3 is a natural number, and T is smaller than M.
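As an illustration of steps S1 and S2, the conversion of an N × M load library into N two-dimensional recurrence plots (the "recursive graph" characteristics of the claims) can be sketched as below. The exact recurrence definition is an assumption here: the sketch uses the pairwise absolute difference between points of the same day, with an optional threshold `eps` for a binary plot, and the function names are hypothetical.

```python
import numpy as np

def recurrence_plot(x, eps=None):
    """Turn one daily load curve of M points into an M x M recurrence plot.

    With eps=None the plot holds the pairwise distances |x_i - x_j|
    (unthresholded variant); otherwise it is 1 where the distance is
    at most eps and 0 elsewhere (binary variant).
    """
    x = np.asarray(x, dtype=float)
    dist = np.abs(x[:, None] - x[None, :])
    if eps is None:
        return dist
    return (dist <= eps).astype(float)

def build_rp_library(load, eps=None):
    """Map an N x M load library to an N x M x M stack of recurrence plots."""
    load = np.asarray(load, dtype=float)
    return np.stack([recurrence_plot(row, eps) for row in load])
```

With M = 48, each user's day becomes a 48 × 48 image; a convolutional self-encoder as in step S3 would then compress each image to a T-dimensional vector with T smaller than M.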
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111366207.7A CN114202012A (en) | 2021-11-17 | 2021-11-17 | High-dimensional load clustering method based on recursive graph and convolution self-encoder |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111366207.7A CN114202012A (en) | 2021-11-17 | 2021-11-17 | High-dimensional load clustering method based on recursive graph and convolution self-encoder |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114202012A (en) | 2022-03-18 |
Family
ID=80648006
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111366207.7A Pending CN114202012A (en) | 2021-11-17 | 2021-11-17 | High-dimensional load clustering method based on recursive graph and convolution self-encoder |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114202012A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114722943A (en) * | 2022-04-11 | 2022-07-08 | Shenzhen Institute of Artificial Intelligence and Robotics | Data processing method, device and equipment |
CN117131397A (en) * | 2023-09-04 | 2023-11-28 | Beihang University | Load spectrum clustering method and system based on DTW distance |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111144643A (en) * | 2019-12-24 | 2020-05-12 | Tianjin Xianghe Electric Technology Co., Ltd. | Day-ahead power load prediction method and device based on double-end automatic coding |
Non-Patent Citations (1)
Title |
---|
ZHIQING SUN et al.: "Classification Analysis Method for Residential Electricity Consumption Behavior Based on Recurrence Plot (RP) and Convolutional Auto-Encoder (CAE)", IOP Conference Series: Earth and Environmental Science 645, 25 January 2021 (2021-01-25), pages 2-3 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Chen et al. | Short-term load forecasting with deep residual networks | |
CN110781332A (en) | Electric power resident user daily load curve clustering method based on composite clustering algorithm | |
CN114202012A (en) | High-dimensional load clustering method based on recursive graph and convolution self-encoder | |
CN108805213B (en) | Power load curve double-layer spectral clustering method considering wavelet entropy dimensionality reduction | |
CN108920720A (en) | The large-scale image search method accelerated based on depth Hash and GPU | |
CN110263873A (en) | A kind of power distribution network platform area classification method merging sparse noise reduction autoencoder network dimensionality reduction and cluster | |
Shi et al. | An approach of electrical load profile analysis based on time series data mining | |
CN107248031B (en) | Rapid power consumer classification method aiming at load curve peak-valley difference | |
CN111612319A (en) | Load curve depth embedding clustering method based on one-dimensional convolution self-encoder | |
CN113094448B (en) | Analysis method and analysis device for residence empty state and electronic equipment | |
CN114488069A (en) | Radar high-resolution range profile identification method based on graph neural network | |
Miraftabzadeh et al. | Knowledge Extraction From PV Power Generation With Deep Learning Autoencoder and Clustering-Based Algorithms | |
CN112699921B (en) | Stack denoising self-coding-based power grid transient fault data clustering cleaning method | |
CN117113126A (en) | Industry electricity utilization characteristic analysis method based on improved clustering algorithm | |
CN111026741A (en) | Data cleaning method and device based on time series similarity | |
CN115526264A (en) | User power consumption behavior classification analysis method based on self-encoder | |
CN116304295A (en) | User energy consumption portrait analysis method based on multivariate data driving | |
CN112270084B (en) | Data-driven high-proportion renewable energy power system operation scene identification method | |
Zhou et al. | Characteristic representation of stock time series based on trend feature points | |
Shen et al. | A Novel AI-based Method for EV Charging Load Profile Clustering | |
CN111768066B (en) | Park electric heating load coupling relation analysis method and device based on fusion characteristics | |
Li et al. | Comparison and application potential analysis of autoencoder-based electricity pattern mining algorithms for large-scale demand response | |
CN113269360A (en) | Data acquisition method based on power consumer electricity consumption behavior portrait | |
Wang et al. | Analysis of user’s power consumption behavior based on k-means | |
CN113469269A (en) | Residual convolution self-coding wind-solar-charged scene generation method based on multi-channel fusion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||