CN115687955A - Voting based resident user load curve clustering method and device - Google Patents

Voting based resident user load curve clustering method and device

Info

Publication number
CN115687955A
Authority
CN
China
Prior art keywords
clustering
load curve
algorithm
data set
user load
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310000646.9A
Other languages
Chinese (zh)
Inventor
丁贵立
韩威
章彧涵
许志浩
王宗耀
康兵
张兴旺
程巧
戴永熙
郑芯蕊
杨勇
曹昆峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanchang Institute of Technology
Original Assignee
Nanchang Institute of Technology
Application filed by Nanchang Institute of Technology
Priority to CN202310000646.9A
Publication of CN115687955A

Abstract

The invention belongs to the technical field of power load monitoring and discloses a voting-based method and device for clustering residential user load curves. The method reduces the dimensionality of high-dimensional data through integrated (ensemble) tree fitting and determines the optimal number of clusters using the silhouette coefficient (contour coefficient); it selects a reference clustering algorithm according to the Calinski-Harabasz (CH) criterion and finally integrates the clustering results uniformly through a consistency function matrix. The invention combines the advantages of each member clustering algorithm, substantially improves clustering accuracy, clustering quality and robustness, and can accurately identify the energy-use characteristics of users.

Description

Voting based resident user load curve clustering method and device
Technical Field
The invention belongs to the technical field of power load monitoring, and particularly relates to a residential user load curve clustering method and device based on voting.
Background
Power consumers are an important component of the power system, and deeply mining the intrinsic patterns in their load data is of great significance for power system planning and operation, so the topic has attracted wide attention and research.
Ensemble clustering is an unsupervised learning method. Its objective is to aggregate a plurality of different clustering results (called base clustering results) into one clustering result by using a certain combination method, so that the ensemble inherits the advantages of each clustering algorithm and yields a more effective clustering result.
The idea of the voting method is to make maximum use of the clustering algorithms to distinguish the data. Each clustering algorithm partitions the data according to its own result, but the results of different algorithms do not share uniform class identifiers, so the voting method needs a unifying function that puts the results of the different clustering algorithms into correspondence. Put simply, the procedure resembles voting in everyday life, where voters vote for candidates and the candidate with the most votes wins: here the samples correspond to the candidates, the clustering algorithms correspond to the voters, and the class label of each sample is obtained by voting.
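The voting idea can be illustrated with a minimal sketch. It assumes that the class labels of all algorithms have already been unified; the function name `vote` and the label lists are hypothetical and are not taken from the invention. Besides the winning label, the sketch reports whether the vote for each sample was unanimous.

```python
from collections import Counter

def vote(labels_per_algorithm):
    """Combine aligned label lists (one list per algorithm) by majority vote.

    Returns (winning_labels, unanimous_flags). Ties go to the label that
    appears first among the most frequent ones (an assumption of this sketch).
    """
    winners, unanimous = [], []
    for sample_votes in zip(*labels_per_algorithm):
        label, top = Counter(sample_votes).most_common(1)[0]
        winners.append(label)
        unanimous.append(top == len(sample_votes))
    return winners, unanimous

# Hypothetical, already-unified labels of four samples from three algorithms.
votes = [
    [0, 0, 1, 2],  # algorithm A
    [0, 1, 1, 2],  # algorithm B
    [0, 0, 1, 1],  # algorithm C
]
labels, unanimous = vote(votes)
```

With these inputs the second and fourth samples are decided by a two-to-one majority, while the other two are unanimous.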
Disclosure of Invention
In order to understand the electricity consumption behavior of residential users, formulate a scientific demand response regulation strategy, guide users to use electricity reasonably, achieve peak clipping and valley filling, and realize a reasonable allocation of power resources, the invention provides a voting-based resident user load curve clustering method and device. The method reduces the dimensionality of high-dimensional data through integrated (ensemble) tree fitting, overcomes the correlation among clustering index dimensions by adopting the Mahalanobis distance, and thereby determines the effective number of clusters; it determines a reference clustering algorithm according to the CH criterion and finally integrates the clustering results uniformly through a consistency function matrix. The invention can be used to screen out power users with different electricity consumption characteristics and to identify customers at different levels who are suited to demand response activities. A power grid company can then apply different demand response methods to users with different characteristics, which helps reduce activity costs, improve electricity-saving efficiency, guide users to use electricity scientifically and reasonably, and realize a reasonable distribution of power resources.
The invention is realized by the following technical scheme. A resident user load curve clustering method based on voting comprises the following steps:
step 1, establishing an original sample data set, analyzing the original sample data set through an empirical mode decomposition method, and constructing overall characteristics and local characteristics of user load curve data to obtain a characteristic data set;
step 2, fitting the feature data set based on the integrated tree model, extracting characteristic indexes of the feature data set to realize dimension reduction, obtaining feature quantity reflecting user energy features, and constructing a dimension reduction data set;
step 3, determining one member clustering algorithm as a reference algorithm based on Calinski-Harabasz (CH) indexes, and taking a clustering result of the reference algorithm as a reference clustering result;
step 3.1, determining the quality of the clustering result through the contour coefficient so as to select the optimal clustering number;
step 3.2, clustering the dimensionality reduction data set obtained in the step 2 by a member clustering algorithm according to the optimal clustering number obtained in the step 3.1 to obtain a clustering result;
step 3.3, measuring the effectiveness of the clustering conclusion of the member clustering algorithm through a Calinski-Harabasz index, and selecting a reference clustering algorithm;
step 4, constructing a consistency function, and unifying different member clustering algorithm class labels; and dividing the samples into full standard samples and non-full standard samples and outputting clustering results.
In the step 1, the missing power consumption is fitted by a least square method, and the user load curve data is supplemented, so that an original sample data set is constructed; the user load curve data comprises a daily electricity load curve, a monthly electricity load curve and an annual electricity load curve; and analyzing the original sample data set by an empirical mode decomposition method, and constructing the overall characteristics and the local characteristics of the user load curve data.
The overall characteristics comprise average power consumption of residents, standard deviation of power consumption sequences of the residents, historical power consumption kurtosis and long-term and short-term trends of the power consumption of the residents.
The local features include approximate entropies and quantiles of periodicity and volatility of the time series.
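A few of the listed features can be computed directly from a load series. The following is a minimal sketch covering only the mean, standard deviation, kurtosis and quantiles; trends and approximate entropy are left out. The function names, the use of excess kurtosis, the choice of quartiles and the sample readings are assumptions of this sketch.

```python
import statistics

def overall_features(load):
    """Mean, population standard deviation and excess kurtosis of a load series."""
    n = len(load)
    mean = statistics.fmean(load)
    std = statistics.pstdev(load)
    if std > 0:
        # Fourth standardized moment minus 3 (excess kurtosis).
        kurt = (sum((x - mean) ** 4 for x in load) / n) / std ** 4 - 3
    else:
        kurt = 0.0  # flat series: kurtosis is undefined, 0.0 used by convention
    return {"mean": mean, "std": std, "kurtosis": kurt}

def quartiles(load):
    """Quartile cut points of the series (one possible choice of quantiles)."""
    return statistics.quantiles(load, n=4, method="inclusive")

load = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical kWh readings
features = overall_features(load)
q1, q2, q3 = quartiles(load)
```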
The silhouette coefficient (contour coefficient) $S_i$ used to select the optimal number of clusters is calculated as follows:
$$S_i = \frac{b_i - a_i}{\max(a_i, b_i)}$$
In the formula, $b_i$ is the minimum of the average distances from sample $i$ to the samples belonging to each other class, and $a_i$ is the average distance from sample $i$ to the other samples of the class to which it belongs, calculated as follows:
$$a_i = \frac{1}{n - 1} \sum_{j \neq i} \mathrm{dis}(i, j)$$
where $\mathrm{dis}(i, j)$ is the distance between sample $i$ and sample $j$ in the same class, calculated as the Euclidean distance:
$$\mathrm{dis}(x, y) = \sqrt{\sum_{p} (x_p - y_p)^2}$$
$n$ is the number of samples in the class of sample $i$, $x_p$ is the $p$-th index value of sample $x$, and $y_p$ is the $p$-th index value of sample $y$.
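As a minimal sketch of the quantities $a_i$, $b_i$ and $S_i$ described above, the following computes the per-sample coefficient with the Euclidean distance. The function names, the toy points and the convention of returning 0 for degenerate cases are assumptions of this sketch.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two samples given as index-value tuples."""
    return math.sqrt(sum((xp - yp) ** 2 for xp, yp in zip(x, y)))

def silhouette(points, labels):
    """Per-sample coefficient S_i = (b_i - a_i) / max(a_i, b_i)."""
    scores = []
    for i, (xi, li) in enumerate(zip(points, labels)):
        by_class = {}
        for j, (xj, lj) in enumerate(zip(points, labels)):
            if j != i:
                by_class.setdefault(lj, []).append(euclidean(xi, xj))
        own = by_class.pop(li, [])
        if not own or not by_class:
            scores.append(0.0)  # singleton class or only one class present
            continue
        a = sum(own) / len(own)                                 # a_i
        b = min(sum(d) / len(d) for d in by_class.values())     # b_i
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return scores

# Two well-separated toy classes (hypothetical data).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
labels = [0, 0, 1, 1]
scores = silhouette(points, labels)
mean_score = sum(scores) / len(scores)
```

Averaging `scores` for each candidate number of clusters and keeping the number with the highest average is one common way to use the coefficient for choosing the number of clusters.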
The Calinski-Harabasz index is calculated as follows:
$$CH(k) = \frac{SS_B / (k - 1)}{SS_W / (N - k)}$$
$$SS_B = \mathrm{tr}(B_k)$$
$$SS_W = \mathrm{tr}(W_k)$$
$$B_k = \sum_{q=1}^{k} n_q (c_q - c_E)(c_q - c_E)^T$$
$$W_k = \sum_{q=1}^{k} \sum_{x \in C_q} (x - c_q)(x - c_q)^T$$
In the formula, $k$ is the number of clusters; $N$ is the number of samples in the dimension-reduced data set; $SS_B$ is the between-class variance and $SS_W$ is the within-class variance; $B_k$ is the between-class distance matrix and $W_k$ is the within-class distance matrix, with $\mathrm{tr}(\cdot)$ denoting the matrix trace; $n_q$ is the number of data samples in class $q$; $c_q$ is the cluster center of class $q$; $c_E$ is the cluster center of all classes; $C_q$ is the set of data samples in class $q$; and $T$ denotes the matrix transpose.
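The index can be sketched in a few lines using the fact that the trace of the between-class matrix is the size-weighted sum of squared distances from each class center to the overall center, and the trace of the within-class matrix is the sum of squared distances from each sample to its class center. The function names and toy data are assumptions; the sketch requires at least two classes and a non-zero within-class term.

```python
def centroid(rows):
    """Component-wise mean of a list of samples."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(x, y):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def calinski_harabasz(points, labels):
    """CH = (SS_B / (k - 1)) / (SS_W / (N - k))."""
    classes = {}
    for x, label in zip(points, labels):
        classes.setdefault(label, []).append(x)
    k, n = len(classes), len(points)
    c_all = centroid(points)  # center of the whole data set
    ss_b = ss_w = 0.0
    for members in classes.values():
        c_q = centroid(members)
        ss_b += len(members) * sq_dist(c_q, c_all)       # trace of B_k
        ss_w += sum(sq_dist(x, c_q) for x in members)    # trace of W_k
    return (ss_b / (k - 1)) / (ss_w / (n - k))

# Hypothetical points: the first labelling groups neighbours, the second does not.
points = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 2.0)]
ch_good = calinski_harabasz(points, [0, 0, 1, 1])
ch_bad = calinski_harabasz(points, [0, 1, 0, 1])
```

On this toy data the labelling that groups nearby points scores 8.0 and the other scores 0.5, in line with the rule that a larger value indicates a better partition.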
In step 4, the member clustering algorithm selected in step 3 is used as the reference algorithm, and the remaining member clustering algorithms are compared with it. With $C_1$ selected as the reference clustering algorithm and the dimension-reduced data set divided into $k$ classes, a unified matrix of results is built between $C_1$ and each other member clustering algorithm $C_o$ ($o = 2, 3, 4, \ldots$):
$$S_{o1} = \begin{bmatrix} s_{11} & s_{12} & \cdots & s_{1k} \\ s_{21} & s_{22} & \cdots & s_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ s_{k1} & s_{k2} & \cdots & s_{kk} \end{bmatrix}$$
In the formula, $S_{o1}$ is the unified matrix of results of the reference clustering algorithm $C_1$ and the member clustering algorithm $C_o$; $s_{mw}$ is the number of samples that class $m$ of the reference clustering algorithm $C_1$ and class $w$ of the member clustering algorithm $C_o$ have in common.
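Building the matrix amounts to counting, for every pair of classes, the samples the two algorithms place together. A minimal sketch, assuming classes are numbered from 0 to k - 1 and using hypothetical label lists:

```python
def overlap_matrix(reference, member, k):
    """s[m][w] = number of samples in reference class m and member class w."""
    s = [[0] * k for _ in range(k)]
    for m, w in zip(reference, member):
        s[m][w] += 1
    return s

# Hypothetical labels of eight samples from the reference and one member algorithm.
reference = [0, 0, 0, 1, 1, 2, 2, 2]
member = [2, 2, 2, 0, 0, 1, 1, 0]
S = overlap_matrix(reference, member, k=3)
```

Each row of `S` sums to the size of the corresponding reference class, which is a quick sanity check on the construction.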
The invention also provides a voting-based resident user load curve clustering device, comprising:
the data characteristic extraction module is internally provided with an empirical mode decomposition method for extracting the characteristics of the original sample data set;
the data fitting module is used for fitting the characteristic data set based on the integrated tree model, extracting characteristic indexes of the characteristic data set to realize dimension reduction, and obtaining characteristic quantity reflecting user energy characteristics;
a reference clustering algorithm selecting module, which determines one member clustering algorithm as a reference algorithm based on Calinski-Harabasz (CH) indexes;
and the consistency unification module is used for unifying different member clustering algorithm class labels based on a consistency function.
The invention provides a nonvolatile computer storage medium, which stores computer executable instructions, wherein the computer executable instructions can execute the resident user load curve clustering method based on voting.
The present invention also provides a computer program product comprising a computer program stored on a non-volatile computer storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the above-mentioned voting-based resident user load curve clustering method.
The invention has the following beneficial effects: it is an important means of feeding back the power supply and demand relationship in a timely manner, and it is of great significance for improving the operating efficiency of the power system and maintaining its stable operation. Specifically, it can reduce users' electricity consumption during peak periods and guide users to run their equipment during valley (off-peak) periods; users can adjust their electricity consumption patterns according to their own consumption characteristics; and the method can help power grid enterprises establish a more complete demand response service mechanism.
By considering the characteristics of user load curve data, and addressing the difficulty of achieving good clustering quality, clustering accuracy and robustness at the same time, the invention provides a voting-based ensemble clustering algorithm that combines the advantages of all member clustering algorithms. It substantially improves clustering accuracy, clustering quality and robustness, can accurately identify the energy-use characteristics of users, and helps the grid company accurately invite potential participants to demand response events, which saves costs and effectively improves the outcome of demand response activities.
Drawings
FIG. 1 is a flow chart of the resident user load curve clustering method based on voting according to the invention.
Detailed Description
The invention is explained in more detail below with reference to the figures and examples.
Referring to fig. 1, a method for clustering a load curve of a resident user based on voting comprises the following steps:
step 1, establishing an original sample data set, analyzing the original sample data set through an empirical mode decomposition method, and constructing the overall characteristics and the local characteristics of user load curve data to obtain a characteristic data set;
fitting the missing power consumption by a least square method, and supplementing the user load curve data so as to construct an original sample data set; the user load curve data comprises a daily power load curve (obtained by collecting data every 15 min by an HPLC intelligent electric energy meter), a monthly power load curve and an annual power load curve; analyzing an original sample data set by an empirical mode decomposition method, and constructing the overall characteristics and the local characteristics of the user load curve data; the overall characteristics comprise average power consumption of residents, standard deviation of power consumption sequences of residents, historical power consumption kurtosis, long-term and short-term trends of the power consumption of residents and the like. The local features include approximate entropies and quantiles of periodicity and volatility of the time series.
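The text does not state which model is fitted by least squares. A minimal sketch, assuming a straight line in the time index fitted to the known readings and evaluated at the missing positions (the function name `fill_missing`, the `None` marker and the readings are hypothetical):

```python
def fill_missing(series):
    """Fill None entries with values from a least-squares line through known points.

    Assumes at least one known reading; with a single known reading the
    fitted line is flat.
    """
    known = [(t, y) for t, y in enumerate(series) if y is not None]
    n = len(known)
    mean_t = sum(t for t, _ in known) / n
    mean_y = sum(y for _, y in known) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in known)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in known)
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_t
    return [y if y is not None else intercept + slope * t
            for t, y in enumerate(series)]

readings = [1.0, 2.0, None, 4.0, 5.0]  # hypothetical load values, one gap
filled = fill_missing(readings)
```

For real 96-point daily curves a local or higher-order fit may be more appropriate; the linear model is used here only to keep the example short.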
Step 2, fitting the feature data set based on the integrated tree model, extracting characteristic indexes of the feature data set to realize dimension reduction, obtaining feature quantity reflecting user energy features, and constructing a dimension reduction data set:
The feature data set is input into the integrated tree model as the independent variables, with the label set to 1 for users who participate in demand response and 0 for users who do not, and the integrated tree model is trained. Training is performed with a cross-validated grid search, and the integrated tree model parameters are set as follows:
[Table: parameter settings of the integrated tree model]
The information gain evaluation criterion (criterion) is set to entropy, the maximum number of features used when building a single tree (max_features) is 16, the minimum number of samples required to split a node (min_samples_split) is 6, the number of trees in the random forest (n_estimators) is 100, and the maximum depth of a single tree (max_depth) is 21. Fitting the integrated tree model yields 6 representative electricity-use dimension indexes, whose physical meanings are shown in the following table:
[Table: the 6 electricity-use dimension indexes and their physical meanings]
When only the electricity consumption of the user is known, steps 1 and 2 make it possible to mine more representative energy-use characteristic quantities, construct the user's energy-use profile more accurately, and improve the analysis.
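The reported settings can be written down as a configuration, together with the kind of candidate grid a grid search would enumerate. The parameter names are the ones quoted in the text; they coincide with those of scikit-learn's `RandomForestClassifier`, but the text does not name the library, so that link is an assumption. Only the values in `reported` come from the text; every other grid value is hypothetical.

```python
from itertools import product

# Settings reported in the text, keyed by the parameter names it quotes.
reported = {
    "criterion": "entropy",
    "max_features": 16,
    "min_samples_split": 6,
    "n_estimators": 100,
    "max_depth": 21,
}

# Hypothetical search grid; the non-reported values are placeholders.
grid = {
    "criterion": ["gini", "entropy"],
    "max_features": [8, 16],
    "min_samples_split": [2, 6],
    "n_estimators": [50, 100],
    "max_depth": [11, 21],
}

def expand(grid):
    """Enumerate every parameter combination of the grid as a dict."""
    keys = list(grid)
    return [dict(zip(keys, combo)) for combo in product(*(grid[k] for k in keys))]

candidates = expand(grid)
```

In a cross-validated grid search each entry of `candidates` would be trained and scored on held-out folds, and the best-scoring combination kept.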
Step 3, determining one member clustering algorithm as a reference algorithm based on Calinski-Harabasz (CH) indexes, and taking a clustering result of the reference algorithm as a reference clustering result;
step 3.1, determining the quality of the clustering result through the silhouette coefficient (contour coefficient) $S_i$ so as to select the optimal number of clusters:
$$S_i = \frac{b_i - a_i}{\max(a_i, b_i)}$$
In the formula, $b_i$ is the minimum of the average distances from sample $i$ to the samples belonging to each other class, and $a_i$ is the average distance from sample $i$ to the other samples of the class to which it belongs, calculated as follows:
$$a_i = \frac{1}{n - 1} \sum_{j \neq i} \mathrm{dis}(i, j)$$
where $\mathrm{dis}(i, j)$ is the distance between sample $i$ and sample $j$ in the same class, calculated as the Euclidean distance:
$$\mathrm{dis}(x, y) = \sqrt{\sum_{p} (x_p - y_p)^2}$$
This is the straight-line distance between two elements in Euclidean space; because it is intuitive and easy to interpret, it is widely used to measure the dissimilarity of two elements. $n$ is the number of samples in the class of sample $i$, $x_p$ is the $p$-th index value of sample $x$, and $y_p$ is the $p$-th index value of sample $y$.
Step 3.2, clustering the dimensionality reduction data set obtained in the step 2 by a member clustering algorithm according to the optimal clustering number obtained in the step 3.1 to obtain a clustering result;
Step 3.3, measuring the effectiveness of the clustering results of the member clustering algorithms through the Calinski-Harabasz (CH) index, and selecting the reference clustering algorithm. The CH index is a score computed from the between-class variance and the within-class variance; the larger the value, the more compact each class is and the more dispersed the classes are from one another, i.e. the better the clustering result.
$$CH(k) = \frac{SS_B / (k - 1)}{SS_W / (N - k)}$$
$$SS_B = \mathrm{tr}(B_k)$$
$$SS_W = \mathrm{tr}(W_k)$$
$$B_k = \sum_{q=1}^{k} n_q (c_q - c_E)(c_q - c_E)^T$$
$$W_k = \sum_{q=1}^{k} \sum_{x \in C_q} (x - c_q)(x - c_q)^T$$
In the formula, $k$ is the number of clusters; $N$ is the number of samples in the dimension-reduced data set; $SS_B$ is the between-class variance and $SS_W$ is the within-class variance; $B_k$ is the between-class distance matrix and $W_k$ is the within-class distance matrix, with $\mathrm{tr}(\cdot)$ denoting the matrix trace; $n_q$ is the number of data samples in class $q$; $c_q$ is the cluster center of class $q$; $c_E$ is the cluster center of all classes; $C_q$ is the set of data samples in class $q$; and $T$ denotes the matrix transpose.
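Once a CH value is available for each member clustering algorithm, choosing the reference algorithm reduces to taking the maximum. A minimal sketch; the algorithm names and CH values are hypothetical placeholders and are not taken from the tables of this document.

```python
def select_reference(ch_scores):
    """Return (reference_name, other_names), the reference having the highest CH."""
    reference = max(ch_scores, key=ch_scores.get)
    others = [name for name in ch_scores if name != reference]
    return reference, others

# Hypothetical CH values, for illustration only.
ch_scores = {"alg_a": 410.0, "alg_b": 455.0, "alg_c": 300.0}
reference, others = select_reference(ch_scores)
```

The remaining algorithms in `others` are the ones whose labels are later aligned to the reference labels.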
Step 4, constructing a consistency function, and unifying different member clustering algorithm class labels; and dividing the samples into full standard samples and non-full standard samples and outputting clustering results.
The member clustering algorithm selected in step 3 is used as the reference algorithm, and the remaining member clustering algorithms are compared with it. Suppose $C_1$ is selected as the reference clustering algorithm and the dimension-reduced data set is divided into $k$ classes; a unified matrix of results is built between $C_1$ and each other member clustering algorithm $C_o$ ($o = 2, 3, 4, \ldots$):
$$S_{o1} = \begin{bmatrix} s_{11} & s_{12} & \cdots & s_{1k} \\ s_{21} & s_{22} & \cdots & s_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ s_{k1} & s_{k2} & \cdots & s_{kk} \end{bmatrix}$$
In the formula, $S_{o1}$ is the unified matrix of results of the reference clustering algorithm $C_1$ and the member clustering algorithm $C_o$; $s_{mw}$ is the number of samples that class $m$ of the reference clustering algorithm $C_1$ and class $w$ of the member clustering algorithm $C_o$ have in common. The column subscript of the maximum element in each row is taken as the class matching label: if $s_{mw}$ is the maximum value of row $m$, then class $m$ of the reference clustering algorithm $C_1$ and class $w$ of the member clustering algorithm $C_o$ are corresponding class labels. In this way the class labels of different clustering algorithms can be unified.
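The row-maximum matching rule can be sketched as follows. The matrix and labels are hypothetical, classes are assumed to be numbered from 0, and the handling of two rows peaking at the same column (the first row wins here) is an assumption, since the text does not say how such conflicts are resolved.

```python
def match_labels(s):
    """Map member class w to reference class m, where row m of s peaks at column w."""
    mapping = {}
    for m, row in enumerate(s):
        w = max(range(len(row)), key=row.__getitem__)  # column of the row maximum
        mapping.setdefault(w, m)  # on conflicts keep the first reference class
    return mapping

# Hypothetical overlap counts: rows are reference classes, columns member classes.
S = [[0, 0, 3],
     [2, 0, 0],
     [1, 2, 0]]
mapping = match_labels(S)

member = [2, 2, 2, 0, 0, 1, 1, 0]         # labels as produced by the member algorithm
unified = [mapping[w] for w in member]    # the same labels in reference numbering
```

After relabelling, the member algorithm's votes can be compared with the reference algorithm's labels sample by sample.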
The application case is as follows: the research data comes from the resident smart energy-use service specimen bank accumulated during the residential user demand response trials carried out in Jiangxi Province from 2019 to 2021, and comprises 96-point daily load data collected by HPLC (high-speed power line carrier) smart meters. After screening, 1694 users were selected as research objects, and ensemble clustering was carried out on their 96-point daily load curve data (including working days and rest days).
Processing the research data according to the step 1;
according to the step 2, the characteristic indexes obtained by processing the daily load curve of 96 points of 1694 user residents are used as the input quantity of each algorithm;
According to step 3, one clustering algorithm is selected as the reference algorithm and the other clustering algorithms are compared with it. Each member clustering algorithm is used to cluster the daily load data in the data set. Four member clustering algorithms are selected: k-means, grey wolf optimized fuzzy C-means (GWO-FCM), Gaussian fuzzy clustering and the self-organizing map (SOM). Table 3 shows the silhouette (contour) coefficient and the total score of the clustering result when each member clustering algorithm runs independently for different numbers of clusters. When the number of clusters is 3, the total score over the 4 algorithms is highest and all 4 algorithms obtain relatively high silhouette coefficient scores, so the optimal number of clusters is chosen as k = 3.
[Table 3: silhouette coefficient and total score of each member clustering algorithm for different numbers of clusters]
According to step 3 and step 4, Table 4 shows that, by the CH criterion, the 4 clustering algorithms perform relatively stably on the specimen bank resident user data set. The characteristic indexes obtained by reducing the dimension of the 96-point daily load curves of the 1694 resident users are used as the input of each algorithm, and the member clustering algorithms and the voting ensemble algorithm are tested. The following table shows the CH values of the member clustering algorithms and of the voting ensemble clustering result.
[Table 4: CH values of the member clustering algorithms and of the voting ensemble clustering result]
Among the 4 member clustering algorithms, the traditional k-means algorithm and the SOM (self-organizing map) algorithm show higher clustering stability than the other two, and their clustering quality stays at a good level. Grey wolf optimized fuzzy C-means clustering (GWO-FCM) ranks first among the 4 member clustering algorithms in score, but it is less stable than k-means and SOM. With CH as the measure of clustering effectiveness, the ensemble clustering algorithm stays near the top of the CH ranking on both working days and rest days. Compared with the average CH values of the k-means, GWO-FCM, Gaussian fuzzy and SOM algorithms, it improves the result by 34.61%, 7.38%, 57.72% and 24.30% respectively, an average improvement of 31%, which verifies that the proposed voting ensemble algorithm markedly improves clustering effectiveness on the resident data of the provincial resident specimen bank.
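The 31% figure appears to be the arithmetic mean of the four per-algorithm improvements; this reading is an assumption, but the reported numbers are consistent with it, as the short check below shows.

```python
# Per-algorithm CH improvements (in percent) as reported in the text.
gains = {"k-means": 34.61, "GWO-FCM": 7.38, "Gaussian fuzzy": 57.72, "SOM": 24.30}

# Arithmetic mean of the four improvements.
average_gain = sum(gains.values()) / len(gains)
```

The mean is about 31.00%, matching the stated average improvement.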
The embodiment provides a resident user load curve clustering device based on voting, which comprises:
the data characteristic extraction module is internally provided with an empirical mode decomposition method for extracting the characteristics of the original sample data set;
the data fitting module is used for fitting the characteristic data set based on the integrated tree model, extracting characteristic indexes of the characteristic data set to realize dimension reduction, and obtaining characteristic quantity reflecting user energy characteristics;
a reference clustering algorithm selecting module, which determines one member clustering algorithm as a reference algorithm based on Calinski-Harabasz (CH) indexes;
and the consistency unification module unifies different member clustering algorithm class labels based on a consistency function.
In still other embodiments, a non-transitory computer storage medium is provided, the computer storage medium storing computer-executable instructions that can perform the voting-based resident user load curve clustering method in any of the above embodiments.
The present embodiment also provides a computer program product comprising a computer program stored on a non-volatile computer storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to execute the resident user load curve clustering method based on voting by the above-mentioned embodiment.
The present embodiment provides an electronic device, including: one or more processors, and a memory. The electronic device may further include: an input device and an output device. The processor, memory, input device, and output device may be connected by a bus or other means. The memory is the non-volatile computer-readable storage medium described above. The processor executes various functional applications and data processing of the server by running the nonvolatile software program, instructions and modules stored in the memory, namely, the voting-based resident user load curve clustering method in the above embodiment is realized. The input means may receive input numerical or character information and generate key signal inputs related to user settings and function control of the voting-based resident user load curve clustering method. The output device may include a display device such as a display screen.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The scheme in the embodiments of the application can be implemented in various computer languages, such as the object-oriented programming language Java and the interpreted scripting language JavaScript.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. A resident user load curve clustering method based on voting is characterized by comprising the following steps:
step 1, establishing an original sample data set, analyzing the original sample data set through an empirical mode decomposition method, and constructing overall characteristics and local characteristics of user load curve data to obtain a characteristic data set;
step 2, fitting the feature data set based on the integrated tree model, extracting characteristic indexes of the feature data set to realize dimension reduction, obtaining feature quantity reflecting user energy features, and constructing a dimension reduction data set;
step 3, determining one member clustering algorithm as a reference algorithm based on the Calinski-Harabasz index, and taking a clustering result of the reference algorithm as a reference clustering result;
step 3.1, determining the quality of the clustering result through the contour coefficient so as to select the optimal clustering number;
step 3.2, clustering the dimensionality reduction data set obtained in the step 2 by a member clustering algorithm according to the optimal clustering number obtained in the step 3.1 to obtain a clustering result;
step 3.3, measuring the effectiveness of the clustering conclusion of the member clustering algorithm through a Calinski-Harabasz index, and selecting a reference clustering algorithm;
step 4, constructing a consistency function, and unifying different member clustering algorithm class labels; and dividing the samples into full standard samples and non-full standard samples and outputting clustering results.
2. A voting based resident user load curve clustering method according to claim 1, wherein in step 1, the user load curve data is supplemented by least square fitting of the missing power consumption, so as to construct an original sample data set; the user load curve data comprises a daily power load curve, a monthly power load curve and an annual power load curve; and analyzing the original sample data set by an empirical mode decomposition method, and constructing the overall characteristics and the local characteristics of the user load curve data.
3. A voting-based resident user load curve clustering method according to claim 2, wherein the overall characteristics comprise average resident electricity consumption, standard deviation of resident electricity consumption sequences, peak degree of historical electricity consumption, long-term and short-term trends of resident electricity consumption.
4. A voting based resident user load curve clustering method according to claim 2, wherein the local features include approximate entropies and quantiles of periodicity and volatility of the time series.
5. A voting based resident user load curve clustering method according to claim 1, wherein the silhouette coefficient (contour coefficient) $S_i$ used to select the optimal number of clusters is calculated as follows:
$$S_i = \frac{b_i - a_i}{\max(a_i, b_i)}$$
In the formula, $b_i$ is the minimum of the average distances from sample $i$ to the samples belonging to each other class, and $a_i$ is the average distance from sample $i$ to the other samples of the class to which it belongs, calculated as follows:
$$a_i = \frac{1}{n - 1} \sum_{j \neq i} \mathrm{dis}(i, j)$$
where $\mathrm{dis}(i, j)$ is the distance between sample $i$ and sample $j$ in the same class, calculated as the Euclidean distance:
$$\mathrm{dis}(x, y) = \sqrt{\sum_{p} (x_p - y_p)^2}$$
$n$ is the number of samples in the class of sample $i$, $x_p$ is the $p$-th index value of sample $x$, and $y_p$ is the $p$-th index value of sample $y$.
6. A voting-based resident user load curve clustering method according to claim 1, wherein the Calinski-Harabasz index is calculated as follows:

$$CH(k) = \frac{SS_B / (k - 1)}{SS_W / (N - k)}$$

$$SS_B = \mathrm{tr}(B_k)$$

$$SS_W = \mathrm{tr}(W_k)$$

$$B_k = \sum_{q=1}^{k} n_q (c_q - c_E)(c_q - c_E)^T$$

$$W_k = \sum_{q=1}^{k} \sum_{x \in C_q} (x - c_q)(x - c_q)^T$$

where $k$ is the number of clusters; $N$ is the number of samples in the dimension-reduced data set; $SS_B$ is the between-class variance and $SS_W$ is the within-class variance; $B_k$ is the between-class distance matrix; $W_k$ is the within-class distance matrix; $n_q$ is the number of data samples in class $q$; $c_q$ is the cluster center of class $q$; $c_E$ is the center of all samples; $C_q$ is the set of data samples in class $q$; and $T$ denotes the matrix transpose.
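A minimal sketch of how the Calinski-Harabasz index of claim 6 could pick the reference algorithm is given below. It uses the standard form of the index, in which the traces of the between-class and within-class matrices reduce to sums of squared distances. The member names `algorithm_a` and `algorithm_b`, their label outputs, and the sample points are hypothetical.

```python
def centroid(rows):
    """Component-wise mean of a list of equal-length samples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def squared_distance(x, y):
    return sum((xp - yp) ** 2 for xp, yp in zip(x, y))


def calinski_harabasz(samples, labels):
    """CH = (SS_B / (k - 1)) / (SS_W / (N - k)).

    Requires at least two classes and a non-zero within-class variance.
    """
    classes = {}
    for x, label in zip(samples, labels):
        classes.setdefault(label, []).append(x)

    k, n_total = len(classes), len(samples)
    c_e = centroid(samples)  # center of the whole data set

    ss_b = 0.0  # trace of the between-class matrix
    ss_w = 0.0  # trace of the within-class matrix
    for members in classes.values():
        c_q = centroid(members)
        ss_b += len(members) * squared_distance(c_q, c_e)
        ss_w += sum(squared_distance(x, c_q) for x in members)

    return (ss_b / (k - 1)) / (ss_w / (n_total - k))


# Hypothetical outputs of two member clustering algorithms on the same data.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
member_labels = {
    "algorithm_a": [0, 0, 1, 1],
    "algorithm_b": [0, 1, 0, 1],
}

ch_scores = {name: calinski_harabasz(points, labels)
             for name, labels in member_labels.items()}
reference = max(ch_scores, key=ch_scores.get)
```

A larger index means compact, well-separated classes, so the member with the highest value is a natural choice of reference.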
7. A voting-based resident user load curve clustering method according to claim 1, wherein in step 4, the member clustering algorithm selected in step 3 is used as the reference algorithm, and the remaining member clustering algorithms are compared with the reference algorithm; with $C_1$ selected as the benchmark clustering algorithm and the dimension-reduced data set divided into $k$ classes, a matrix unifying its results with those of another member clustering algorithm $C_o$ is constructed:

$$S_{o1} = \begin{bmatrix} s_{11} & s_{12} & \cdots & s_{1k} \\ s_{21} & s_{22} & \cdots & s_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ s_{k1} & s_{k2} & \cdots & s_{kk} \end{bmatrix}$$

where $S_{o1}$ is the matrix unifying the results of the benchmark clustering algorithm $C_1$ and the member clustering algorithm $C_o$, and $s_{mw}$ is the number of samples shared by class $m$ of the benchmark clustering algorithm $C_1$ and class $w$ of the member clustering algorithm $C_o$.
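The overlap matrix of claim 7 counts, for every pair of classes, how many samples the benchmark algorithm and a member algorithm place together. The sketch below builds that matrix and then uses it to relabel the member's classes and take a majority vote. The greedy largest-overlap matching and the simple majority vote are assumptions chosen for brevity, and all label lists are invented.

```python
from collections import Counter


def overlap_matrix(reference_labels, member_labels, k):
    """matrix[m][w] = number of samples in reference class m and member class w."""
    matrix = [[0] * k for _ in range(k)]
    for m, w in zip(reference_labels, member_labels):
        matrix[m][w] += 1
    return matrix


def align_labels(reference_labels, member_labels, k):
    """Rename member classes to the reference classes they overlap most.

    Assumption: greedy one-to-one matching, largest overlaps first.
    """
    matrix = overlap_matrix(reference_labels, member_labels, k)
    pairs = sorted(
        ((matrix[m][w], m, w) for m in range(k) for w in range(k)),
        reverse=True,
    )
    mapping, used_reference = {}, set()
    for _, m, w in pairs:
        if w not in mapping and m not in used_reference:
            mapping[w] = m
            used_reference.add(m)
    return [mapping[w] for w in member_labels]


def majority_vote(label_lists):
    """Most common label per sample across already-aligned label lists."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*label_lists)]


# Hypothetical results for six samples and k = 3 classes.
reference = [0, 0, 1, 1, 2, 2]   # benchmark algorithm C_1
member = [2, 2, 0, 0, 1, 0]      # member algorithm C_o, arbitrary class names
third = [0, 0, 1, 1, 2, 2]       # another member, already aligned

overlap = overlap_matrix(reference, member, 3)
aligned = align_labels(reference, member, 3)
final = majority_vote([reference, aligned, third])
```

After alignment the member disagrees with the benchmark on only the last sample, and the vote resolves it in favor of the majority. An optimal assignment method could replace the greedy matching when overlaps are ambiguous.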
8. A voting-based resident user load curve clustering device, characterized by comprising:
a data feature extraction module, which uses a built-in empirical mode decomposition method to extract features from the original sample data set;
a data fitting module, which fits the feature data set with the integrated tree model and extracts feature indexes of the feature data set to achieve dimension reduction, obtaining feature quantities that reflect the user's energy-use characteristics;
a reference clustering algorithm selection module, which determines one member clustering algorithm as the reference algorithm based on the Calinski-Harabasz index;
and a consistency unification module, which unifies the class labels of the different member clustering algorithms based on a consistency function.
9. A non-volatile computer storage medium, wherein the computer storage medium stores computer-executable instructions for performing the voting based residential user load curve clustering method according to any one of claims 1 to 7.
10. A computer program product, characterized in that the computer program product comprises a computer program stored on a non-volatile computer storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the voting-based resident user load curve clustering method according to any one of claims 1 to 7.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310000646.9A CN115687955A (en) 2023-01-03 2023-01-03 Voting based resident user load curve clustering method and device

Publications (1)

Publication Number Publication Date
CN115687955A (en) 2023-02-03

Family

ID=85057061

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202310000646.9A Pending CN115687955A (en) 2023-01-03 2023-01-03 Voting based resident user load curve clustering method and device

Country Status (1)

Country Link
CN (1) CN115687955A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination