CN113837274A - User electricity consumption behavior portrait method and device based on electricity consumption characteristic analysis - Google Patents

Info

Publication number
CN113837274A
Authority
CN
China
Prior art keywords
user
electricity consumption
feature
characteristic
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111120375.8A
Other languages
Chinese (zh)
Inventor
胡宏彬
韩俊飞
王宇强
张一帆
尹柏清
钟鸣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inner Mongolia Electric Power Research Institute of Inner Mongolia Power Group Co Ltd
Original Assignee
Inner Mongolia Electric Power Research Institute of Inner Mongolia Power Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inner Mongolia Electric Power Research Institute of Inner Mongolia Power Group Co Ltd
Priority to CN202111120375.8A
Publication of CN113837274A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Game Theory and Decision Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Public Health (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Primary Health Care (AREA)
  • Tourism & Hospitality (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a user electricity consumption behavior portrait method and device based on electricity consumption characteristic analysis. The method comprises the following steps: preprocessing the collected load data; clustering the preprocessed load data with a k-shape clustering algorithm so as to divide the users' electricity consumption behaviors into clusters; extracting the features of the users' electricity consumption behavior with a random forest and determining the feature labels of that behavior; and obtaining the user electricity consumption behavior portrait from the clusters of the user electricity consumption behavior and the feature labels.

Description

User electricity consumption behavior portrait method and device based on electricity consumption characteristic analysis
Technical Field
The invention relates to the technical field of electric power big data analysis, in particular to a user electricity consumption behavior portrait method and device based on electricity consumption characteristic analysis.
Background
With the gradual opening of the power market and the continuous development of integrated energy systems, problems such as diversified and improper energy consumption modes of users have become more prominent, placing higher requirements on the analysis of users' electricity consumption behavior. Analysis of residential electricity consumption behavior is the basis for tapping residents' demand response potential and improving the precision of power services. With the popularization of smart meters, large volumes of electricity consumption data are available, and how to apply data mining to the electricity consumption data of residential users, and thereby portray their electricity consumption behavior, has become a key concern of power companies.
However, in researching and practicing the prior art, the inventors found that most current research focuses on dividing power users into clusters with a clustering algorithm. Such research concentrates on user classification, but the feature set is mostly selected according to expert experience, without data analysis or optimized selection, so the validity of the electricity consumption behavior feature set remains to be verified. Because the selection of electricity consumption features is the basis of the power user portrait, there is an urgent need for a technical solution that analyzes users' electricity consumption features and selects a compact feature set that reflects their behavior characteristics to the greatest extent.
Disclosure of Invention
The invention aims to provide a user electricity consumption behavior portrait method and device based on electricity consumption characteristic analysis, so as to solve the above problems in the prior art.
The invention provides a user electricity consumption behavior portrait method based on electricity consumption characteristic analysis, which comprises the following steps:
preprocessing the collected load data;
performing clustering analysis on the preprocessed load data by adopting a k-shape clustering algorithm, and performing clustering division on the power utilization behaviors of the users;
extracting the characteristics of the power utilization behaviors of the users in a random forest mode to determine the characteristic labels of the power utilization behaviors of the users;
and acquiring the user electricity consumption behavior portrait according to the cluster of the user electricity consumption behavior and the feature tag.
The invention provides a user power consumption behavior portrait device based on power consumption characteristic analysis, which comprises:
the preprocessing module is used for preprocessing the collected load data;
the clustering module is used for carrying out clustering analysis on the preprocessed load data by adopting a k-shape clustering algorithm and carrying out clustering division on the power utilization behaviors of the users;
the characteristic extraction module is used for extracting the characteristics of the power utilization behaviors of the users in a random forest mode and determining the characteristic labels of the power utilization behaviors of the users;
and the power consumption behavior portrait module is used for acquiring the user power consumption behavior portrait according to the cluster of the user power consumption behaviors and the feature tag.
An embodiment of the invention also provides a user electricity consumption behavior portrait device based on electricity consumption characteristic analysis, which comprises: a memory, a processor, and a computer program that is stored in the memory and can run on the processor, wherein the steps of the above user electricity consumption behavior portrait method based on electricity consumption characteristic analysis are implemented when the computer program is executed by the processor.
An embodiment of the invention also provides a computer-readable storage medium on which an implementation program for information transmission is stored, wherein the steps of the above user electricity consumption behavior portrait method based on electricity consumption characteristic analysis are implemented when the program is executed by a processor.
With the embodiments of the invention, data mining is applied to cluster the electricity consumption data of residential users, a label system characterizing the electricity consumption features of residential load is established, and the users' electricity consumption behavior portrait is drawn. This can improve the accuracy of user behavior analysis, reflects users' electricity consumption habits and behavior characteristics, and provides a basis for demand response decisions.
The foregoing is only an overview of the technical solutions of the present invention. The embodiments are described below so that the technical means of the invention, and its above and other objects, features, and advantages, can be understood more clearly.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flow chart of a user electricity consumption behavior profiling method based on electricity consumption characteristic analysis according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a detailed process of a user power consumption behavior profiling method based on power consumption characteristic analysis according to an embodiment of the present invention;
FIG. 3 is a line chart of the total within-cluster sum of squared deviations for different values of k in the clustering of user daily load curves according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating a user daily load curve clustering analysis result according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a visualization result of time characteristics of power consumption behavior of a user according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a user power usage behavior feature tag visualization result according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a user power consumption behavior representation apparatus based on power consumption characteristic analysis according to a first embodiment of the present invention;
FIG. 8 is a schematic diagram of a user power consumption behavior representation apparatus based on power consumption characteristic analysis according to a second embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the following embodiments, and it should be understood that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", and the like, indicate orientations and positional relationships based on those shown in the drawings, and are used only for convenience of description and simplicity of description, and do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be considered as limiting the present invention.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, features defined as "first" or "second" may explicitly or implicitly include one or more of the described features. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise. Furthermore, the terms "mounted" and "connected" are to be construed broadly: a connection may, for example, be fixed, detachable, or integral; it may be mechanical or electrical; and it may be direct, indirect through an intervening medium, or an internal communication between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific circumstances.
Method embodiment
According to an embodiment of the present invention, a user electricity consumption behavior portrait method based on electricity consumption characteristic analysis is provided. FIG. 1 is a flowchart of this method. As shown in FIG. 1, the method specifically includes:
Step 101, preprocessing the collected load data. The data preprocessing comprises the following specific steps:
Step 1011, performing data cleaning on the user load data.
Step 1012, normalizing the user load data.
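The source does not say how the cleaning and normalization of steps 1011 and 1012 are carried out. The following is a minimal Python sketch under two assumptions that are not in the source: missing readings are marked as None and filled by linear interpolation, and each daily curve is z-normalized (zero mean, unit variance), which is the normalization usually paired with k-shape. The function names and sample values are illustrative only.

```python
import math


def clean(series):
    """Fill missing readings (None) by linear interpolation (assumed cleaning rule)."""
    values = list(series)
    valid = [i for i, v in enumerate(values) if v is not None]
    if not valid:
        raise ValueError("series contains no valid readings")
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:        # gap at the start: copy the first valid reading
            values[i] = values[right]
        elif right is None:     # gap at the end: copy the last valid reading
            values[i] = values[left]
        else:                   # interior gap: interpolate between the neighbours
            frac = (i - left) / (right - left)
            values[i] = values[left] + frac * (values[right] - values[left])
    return values


def z_normalize(series):
    """Scale a curve to zero mean and unit variance (assumed normalization)."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    if std == 0:
        return [0.0] * n        # a constant curve has no shape information
    return [(v - mean) / std for v in series]


# Illustrative readings, not data from the patent.
daily_load = [1.0, None, 3.0, 4.0, None]
cleaned = clean(daily_load)
normalized = z_normalize(cleaned)
```

Other cleaning rules (for example, discarding days with too many missing readings) or min-max normalization would fit the same two-step structure.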
Step 102, performing clustering analysis on the preprocessed load data by adopting a k-shape clustering algorithm, and performing clustering division on the power utilization behaviors of the users; step 102 specifically includes:
Step 1021, calculating the similarity according to formula 1:
$$\mathrm{SBD}(x, y) = 1 - \max_{w} \left( \frac{CC_w(x, y)}{\sqrt{R_0(x, x) \cdot R_0(y, y)}} \right)$$
wherein the similarity SBD(x, y) takes values in [0, 2], with 0 indicating that the two sequences are most similar; x and y denote the two sample sequences; $\max_w$ selects the shift w that maximizes $CC_w(x, y) / \sqrt{R_0(x, x) \cdot R_0(y, y)}$; $CC_w(x, y)$ denotes the cross-correlation sequence of x and y; $R_0(x, x)$ denotes the autocorrelation of x; and $R_0(y, y)$ denotes the autocorrelation of y;
Step 1022, clustering with the k-shape clustering algorithm according to the obtained similarity, and determining the number of groups k by the inflection point method, which minimizes the within-cluster sum of squared deviations given by formula 2:
$$SS = \sum_{i=1}^{n} (x_i - \bar{x})^2$$
wherein SS denotes the sum of squared deviations, which reflects how far the variable deviates from its mathematical expectation; i is the index of a point in the cluster; n is the number of points in the cluster; $x_i$ denotes the ith point in the cluster; and $\bar{x}$ denotes the average of all points in the cluster;
Step 1023, performing cluster analysis of the user electricity consumption curves with the k-shape algorithm according to the number of groups k: first randomly assigning the time series to the clusters, then calculating the centroids, refining the membership by calculating the distance from each member to every centroid, reassigning each member to a cluster, and updating the centroids; the iteration is repeated until the algorithm converges or the maximum number of iterations is reached.
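The shape-based distance of step 1021 can be illustrated with a short, dependency-free Python sketch. It assumes equal-length sequences and computes the cross-correlation by direct summation over every shift; practical implementations typically use an FFT instead. The helper names are illustrative and not part of the source.

```python
import math


def cross_correlation(x, y, shift):
    """Inner product of x and y after sliding x by `shift` positions."""
    m = len(x)
    if shift >= 0:
        return sum(x[l + shift] * y[l] for l in range(m - shift))
    return sum(y[l - shift] * x[l] for l in range(m + shift))


def sbd(x, y):
    """Shape-based distance: 1 minus the maximum normalized cross-correlation."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    m = len(x)
    # R_0(x, x) and R_0(y, y) are the zero-shift autocorrelations.
    denom = math.sqrt(cross_correlation(x, x, 0) * cross_correlation(y, y, 0))
    if denom == 0:
        raise ValueError("SBD is undefined for an all-zero sequence")
    best = max(cross_correlation(x, y, s) for s in range(-(m - 1), m))
    return 1.0 - best / denom


# Illustrative sequences: y_shifted is x moved two steps to the right.
x = [1.0, 2.0, 3.0, 0.0, 0.0]
y_shifted = [0.0, 0.0, 1.0, 2.0, 3.0]
print(sbd(x, y_shifted))
```

Because the maximum is taken over all shifts and the result is divided by the two autocorrelations, a shifted or rescaled copy of a curve is at distance (numerically close to) zero from the original.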
Step 103, extracting the features of the users' electricity consumption behavior with a random forest, and determining the feature labels of that behavior. Step 103 specifically comprises:
Step 1031, setting an initial feature set, wherein the electricity consumption features of the initial feature set specifically include: daily electricity consumption, daily maximum load, daily minimum load, daily average load, daily peak-valley difference, daily load rate, valley power coefficient, daily peak-valley difference rate, daytime electricity consumption characteristic, nighttime electricity consumption characteristic, early peak characteristic, late peak characteristic, peak-hour electricity consumption rate, and percentage of electricity consumption in flat periods;
Step 1032, extracting the features of the users' electricity consumption behavior with the random forest method: the contribution of each feature on each tree of the random forest is evaluated and averaged, and the contributions of the features are compared using the Gini index. According to equations 3-6, assume that there are C features $X_1, X_2, X_3, \ldots, X_C$, and calculate for each feature $X_j$ the Gini index score $VIM_j^{(Gini)}$, that is, the average change in node-splitting impurity caused by the jth feature over all decision trees:
$$GI_m = \sum_{k=1}^{K} \sum_{k' \neq k} p_{mk}\, p_{mk'} = 1 - \sum_{k=1}^{K} p_{mk}^2$$
wherein GI denotes the Gini index; K denotes the number of classes; $p_{mk}$ denotes the proportion of class k in node m, so that $GI_m$ is the probability that two samples drawn at random from node m carry different class labels; and $p_{mk'}$ denotes the proportion of class k' in node m, where k' runs over all classes other than k;
$$VIM_{jm}^{(Gini)} = GI_m - GI_l - GI_r$$
wherein $VIM_{jm}^{(Gini)}$ denotes the importance of feature $X_j$ at node m, that is, the change in the Gini index before and after node m branches, and $GI_l$ and $GI_r$ denote the Gini indexes of the two new nodes after branching;
Step 1033, if the nodes at which feature $X_j$ appears in decision tree i form the set M, then the importance of $X_j$ in the ith tree is:
$$VIM_{ij}^{(Gini)} = \sum_{m \in M} VIM_{jm}^{(Gini)}$$
Step 1034, assuming there are n trees in total, then:
$$VIM_j^{(Gini)} = \sum_{i=1}^{n} VIM_{ij}^{(Gini)}$$
Step 1035, normalizing all the calculated importance scores according to formula 6:
$$VIM_j = \frac{VIM_j^{(Gini)}}{\sum_{i=1}^{C} VIM_i^{(Gini)}}$$
wherein C represents the number of features.
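Steps 1032 to 1035 can be traced with a small Python sketch. It assumes each tree is given as a list of split records (feature index, class counts at the parent node, class counts at the left child, class counts at the right child); this record layout and the toy forest are hypothetical and chosen only for illustration. The node importance follows the unweighted difference described above, which can become negative when both children are still impure; library implementations such as scikit-learn weight each node by its share of samples instead.

```python
def gini(class_counts):
    """Gini index of a node: 1 minus the sum of squared class proportions."""
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in class_counts)


def tree_importance(splits, n_features):
    """Per-feature importance in one tree: sum of GI_m - GI_l - GI_r over its nodes."""
    scores = [0.0] * n_features
    for feature, parent, left, right in splits:
        scores[feature] += gini(parent) - gini(left) - gini(right)
    return scores


def forest_importance(forest, n_features):
    """Sum the per-tree importances over all trees, then normalize to sum to 1."""
    totals = [0.0] * n_features
    for splits in forest:
        for j, score in enumerate(tree_importance(splits, n_features)):
            totals[j] += score
    norm = sum(totals)
    if norm == 0:
        raise ValueError("no feature produced any impurity change")
    return [t / norm for t in totals]


# Hypothetical forest of three one-split trees over two features.
forest = [
    [(0, [5, 5], [5, 0], [0, 5])],  # tree 1 splits on feature 0
    [(1, [5, 5], [5, 0], [0, 5])],  # tree 2 splits on feature 1
    [(0, [5, 5], [5, 0], [0, 5])],  # tree 3 splits on feature 0
]
importance = forest_importance(forest, n_features=2)
```

In this toy forest, feature 0 is used in two of the three identical splits, so it receives two thirds of the normalized importance.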
Step 104, obtaining the user electricity consumption behavior portrait according to the clusters of the user electricity consumption behavior and the feature labels.
Step 104 specifically includes:
Step 1041, analyzing the time characteristics of the user electricity consumption behavior with a statistical method, and visualizing them with a histogram;
Step 1042, analyzing the feature labels of the different electricity consumption behaviors with a scoring method, and visualizing them with a radar chart.
In step 1042, analyzing the feature labels of the different electricity consumption behaviors with a scoring method specifically includes:
characterizing the electricity consumption features of the different behaviors by the scores of the labels, the score of each label for each class of electricity consumption pattern being calculated according to formula 8:
Figure BDA0003276863040000073
wherein $G_{i,j}$ is the score of the jth label for class i users, $\bar{g}_{i,j}$ is the average of the jth label over all users in class i, and $g_{j\min}$ and $g_{j\max}$ are the minimum and maximum values of the jth label, respectively.
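The exact form of formula 8 is not recoverable from this text. One plausible reading, given the class average, the label minimum and maximum, and the 5-point full score mentioned later in the description, is a min-max scaling of the class average to the range 0 to 5. The Python sketch below implements that reading as an assumption, not as the patent's formula; it further assumes that the minimum and maximum are taken over all users, and the function name and sample data are hypothetical.

```python
def label_scores(users_by_class, full_score=5.0):
    """Score each label of each class by min-max scaling the class mean (assumed rule)."""
    all_users = [u for users in users_by_class.values() for u in users]
    n_labels = len(all_users[0])
    # Minimum and maximum of each label, here taken over all users (assumption).
    g_min = [min(u[j] for u in all_users) for j in range(n_labels)]
    g_max = [max(u[j] for u in all_users) for j in range(n_labels)]
    scores = {}
    for cls, users in users_by_class.items():
        row = []
        for j in range(n_labels):
            mean = sum(u[j] for u in users) / len(users)
            span = g_max[j] - g_min[j]
            row.append(full_score * (mean - g_min[j]) / span if span else 0.0)
        scores[cls] = row
    return scores


# Hypothetical data: two classes, two users each, two labels per user.
users_by_class = {
    "A": [[10.0, 0.2], [14.0, 0.4]],
    "B": [[30.0, 0.8], [26.0, 0.6]],
}
scores = label_scores(users_by_class)
```

Scaling every label to the same 0 to 5 range removes the effect of differing units, which is what allows the labels to share the axes of one radar chart.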
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
the embodiment of the invention provides a user electricity consumption behavior cluster analysis method based on the k-shape clustering algorithm and a feature extraction method based on random forest, which together construct a multi-dimensional electricity consumption feature label system for users. The system can be used to analyze users' electricity consumption features in combination with actual residential electricity consumption data, making the user electricity consumption behavior portrait visual, intuitive, and convenient.
The above-described technical means of the embodiments of the present invention will be described in detail below.
In order to solve the problems described in the background art, an embodiment of the present invention provides a method for representing a user electricity consumption behavior based on electricity consumption characteristic analysis, as shown in fig. 2, the method specifically includes:
step 1(S101), preprocessing the collected hourly electricity consumption data, including data cleansing and normalization processing.
Step 2 (S102), performing cluster analysis on the users' electricity consumption behavior based on the k-shape clustering algorithm, which specifically comprises the following steps:
Step 2.1, calculating the similarity. By adopting a distance measure based on cross-correlation, the k-shape algorithm is unaffected by scaling and shifting of the sequences. Cross-correlation is a statistical measure that can be used to determine the similarity of two sequences x and y even when they are not properly aligned. To achieve shift invariance, the sequence y is kept fixed while the cross-correlation is computed, x is slid over y, and the inner product is computed for each shift w of x. The shape-based distance (SBD) is defined as follows:
$$\mathrm{SBD}(x, y) = 1 - \max_{w} \left( \frac{CC_w(x, y)}{\sqrt{R_0(x, x) \cdot R_0(y, y)}} \right)$$
Its range is [0, 2], with 0 indicating that the two sequences are most similar.
Step 2.2, clustering with the k-shape method and determining the number of groups k with the inflection point method. The idea is to minimize the within-cluster sum of squared deviations. As the number of clusters increases, the sum of squared deviations tends to stabilize (its fluctuation falls below a threshold), and the inflection point of k (the point at which the slope changes) can be observed by plotting the sum of squared deviations against the number of clusters. The deviation reflects the degree to which the variable deviates from the mathematical expectation, and its sum of squares is:
$$SS = \sum_{i=1}^{n} (x_i - \bar{x})^2$$
Step 2.3, performing cluster analysis of the user electricity consumption curves with the k-shape algorithm according to the number of groups k determined in step 2.2.
Further, the k-shape algorithm in the invention mainly comprises two stages, assignment and refinement. The time series are first randomly assigned to the clusters and the centroids are calculated; the membership is then refined by calculating the distance from each member to every centroid, reassigning each member to a cluster, and updating the centroids. The iteration is repeated until the algorithm converges or the maximum number of iterations is reached.
The clustering analysis process is as follows:
(1) First, the number of groups k is determined by the inflection point method: a line chart of the within-cluster sum of squared deviations against the number of clusters is plotted (as shown in FIG. 3), and the inflection point of k (the point at which the slope changes) is observed. The deviation reflects the degree to which the variable deviates from the mathematical expectation, and its sum of squares is:
$$SS = \sum_{i=1}^{n} (x_i - \bar{x})^2$$
Different numbers of groups k are selected, and for each k the sum θ of the shape-based distances from all samples to the center of the cluster to which they belong is calculated. A curve with k on the horizontal axis and θ on the vertical axis is drawn from all the k values and the corresponding θ values, and the point at which the absolute value of the slope changes most is selected; the corresponding k is the number of groups to use. The shape-based distance (SBD) is defined as follows:
$$\mathrm{SBD}(x, y) = 1 - \max_{w} \left( \frac{CC_w(x, y)}{\sqrt{R_0(x, x) \cdot R_0(y, y)}} \right)$$
wherein x and y represent two sample sequences, the range of SBD is [0, 2], and 0 indicates that the two sequences are most similar.
(2) Then, according to the k obtained in step (1), cluster analysis is carried out on the users' electricity load data with the k-shape algorithm; the clustering result is shown in FIG. 4. A behavior analysis of the typical electricity consumption patterns is shown in Table 1.
TABLE 1
Figure BDA0003276863040000093
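The inflection point selection in step (1) can be sketched in Python as below. The sketch takes the "point with the maximum change in the absolute value of the slope" literally, comparing the slope of the segment before each candidate k with the slope of the segment after it. The θ values are made up for illustration and are not the data behind FIG. 3; the function names are likewise illustrative.

```python
def sum_of_squared_deviations(points):
    """SS: sum over all points of the squared deviation from the mean."""
    n = len(points)
    mean = sum(points) / n
    return sum((x - mean) ** 2 for x in points)


def elbow_k(ks, thetas):
    """Return the k at which the absolute slope of the theta-vs-k curve changes most."""
    if len(ks) < 3 or len(ks) != len(thetas):
        raise ValueError("need at least three (k, theta) pairs of equal length")
    best_k, best_change = None, -1.0
    for i in range(1, len(ks) - 1):
        slope_before = abs((thetas[i] - thetas[i - 1]) / (ks[i] - ks[i - 1]))
        slope_after = abs((thetas[i + 1] - thetas[i]) / (ks[i + 1] - ks[i]))
        change = abs(slope_before - slope_after)
        if change > best_change:
            best_k, best_change = ks[i], change
    return best_k


# Made-up dispersion values for k = 2..7, for illustration only.
ks = [2, 3, 4, 5, 6, 7]
thetas = [120.0, 80.0, 45.0, 40.0, 37.0, 35.0]
chosen_k = elbow_k(ks, thetas)
```

With these illustrative values the curve drops steeply up to k = 4 and flattens afterwards, so the sketch selects 4; with real data the choice is usually confirmed by inspecting the plotted curve.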
Step 3 (S103), extracting several key feature labels from the commonly used user electricity consumption features with a random forest algorithm, and establishing a multi-dimensional electricity consumption feature label system for users. The specific steps may be as follows:
Step 3.1, setting an initial feature set. The initial feature set consists of 14 commonly used electricity consumption features: daily electricity consumption, daily maximum load, daily minimum load, daily average load, daily peak-valley difference, daily load rate, valley power coefficient, daily peak-valley difference rate, daytime electricity consumption characteristic, nighttime electricity consumption characteristic, early peak characteristic, late peak characteristic, peak-hour electricity consumption rate, and percentage of electricity consumption in flat periods.
Step 3.2, extracting features with a random forest.
Further, the idea of evaluating feature importance with a random forest is to assess how much each feature contributes on each tree of the random forest, take the average, and finally compare the contributions of the features. Here the Gini index is used as the evaluation index. The variable importance measures are denoted by VIM and the Gini index by GI. Assuming that there are C features $X_1, X_2, X_3, \ldots, X_C$, the Gini index score $VIM_j^{(Gini)}$ of each feature $X_j$ is calculated, that is, the average change in node-splitting impurity caused by the jth feature over all decision trees of the random forest (RF). The Gini index is calculated as
$$GI_m = \sum_{k=1}^{K} \sum_{k' \neq k} p_{mk}\, p_{mk'} = 1 - \sum_{k=1}^{K} p_{mk}^2$$
where K denotes the number of classes and $p_{mk}$ denotes the proportion of class k in node m; that is, $GI_m$ is the probability that two samples drawn at random from node m carry different class labels. The importance of feature $X_j$ at node m, that is, the change in the Gini index before and after node m branches, is
$$VIM_{jm}^{(Gini)} = GI_m - GI_l - GI_r$$
where $GI_l$ and $GI_r$ denote the Gini indexes of the two new nodes after branching. If the nodes at which feature $X_j$ appears in decision tree i form the set M, then the importance of $X_j$ in the ith tree is:
$$VIM_{ij}^{(Gini)} = \sum_{m \in M} VIM_{jm}^{(Gini)}$$
Assuming that the RF has n trees in total, then
$$VIM_j^{(Gini)} = \sum_{i=1}^{n} VIM_{ij}^{(Gini)}$$
Finally, all the obtained importance scores are normalized:
$$VIM_j = \frac{VIM_j^{(Gini)}}{\sum_{i=1}^{C} VIM_i^{(Gini)}}$$
Step 3.3, normalizing the importance scores of all the features obtained in step 3.2.
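The Gini index described in step 3.2 can be written in two equivalent forms because the class proportions in a node sum to 1. The short derivation below makes that step explicit and adds a worked example; the three-class proportions are made up for illustration and are not taken from the patent.

```latex
\begin{aligned}
GI_m &= \sum_{k=1}^{K} \sum_{k' \neq k} p_{mk}\, p_{mk'}
      = \sum_{k=1}^{K} p_{mk} \left( 1 - p_{mk} \right)
      = 1 - \sum_{k=1}^{K} p_{mk}^2, \\
\text{e.g. } (p_{m1}, p_{m2}, p_{m3}) &= (0.5,\ 0.3,\ 0.2): \quad
GI_m = 1 - (0.25 + 0.09 + 0.04) = 0.62 .
\end{aligned}
```

A pure node (one class with proportion 1) gives $GI_m = 0$, so a split that separates the classes well leaves little impurity in the two child nodes.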
In one example of the embodiment of the present invention, the specific process of the feature extraction process is as follows:
(1) An initial feature set is constructed from 14 commonly used electricity consumption features: daily electricity consumption, daily maximum load, daily minimum load, daily average load, daily peak-valley difference, daily load rate, valley power coefficient, daily peak-valley difference rate, daytime electricity consumption characteristic, nighttime electricity consumption characteristic, early peak characteristic, late peak characteristic, peak-hour electricity consumption rate, and percentage of electricity consumption in flat periods.
(2) The feature importance is evaluated with a random forest algorithm, using the Gini index as the evaluation index. The variable importance measures are denoted by VIM and the Gini index by GI. Assuming that there are C features $X_1, X_2, X_3, \ldots, X_C$, the Gini index score $VIM_j^{(Gini)}$ of each feature $X_j$ is calculated, that is, the average change in node-splitting impurity caused by the jth feature over all decision trees of the random forest (RF). The Gini index is calculated as
$$GI_m = \sum_{k=1}^{K} \sum_{k' \neq k} p_{mk}\, p_{mk'} = 1 - \sum_{k=1}^{K} p_{mk}^2$$
where K denotes the number of classes and $p_{mk}$ denotes the proportion of class k in node m; that is, $GI_m$ is the probability that two samples drawn at random from node m carry different class labels. The importance of feature $X_j$ at node m, that is, the change in the Gini index before and after node m branches, is
$$VIM_{jm}^{(Gini)} = GI_m - GI_l - GI_r$$
where $GI_l$ and $GI_r$ denote the Gini indexes of the two new nodes after branching. If the nodes at which feature $X_j$ appears in decision tree i form the set M, then the importance of $X_j$ in the ith tree is
$$VIM_{ij}^{(Gini)} = \sum_{m \in M} VIM_{jm}^{(Gini)}$$
Assuming that the RF has n trees in total, then
$$VIM_j^{(Gini)} = \sum_{i=1}^{n} VIM_{ij}^{(Gini)}$$
The feature importance is normalized as follows:
$$VIM_j = \frac{VIM_j^{(Gini)}}{\sum_{i=1}^{C} VIM_i^{(Gini)}}$$
The results of the feature importance evaluation are shown in Table 2. In this example, the 4 features with the highest importance are selected: daily electricity consumption (1), daily average load (4), daily peak-valley difference rate (8), and peak-hour electricity consumption rate (13).
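TABLE 2
| Feature No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Score | 0.09 | 0.07 | 0.04 | 0.1 | 0.08 | 0.065 | 0.055 | 0.085 | 0.07 | 0.05 | 0.08 | 0.07 | 0.085 | 0.075 |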
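As a quick check of the selection step, the following sketch ranks the Table 2 scores and keeps the four highest. The dictionary simply transcribes the table; the variable names are illustrative only.

```python
# Importance scores from Table 2, keyed by feature number.
scores = {
    1: 0.09, 2: 0.07, 3: 0.04, 4: 0.1, 5: 0.08, 6: 0.065, 7: 0.055,
    8: 0.085, 9: 0.07, 10: 0.05, 11: 0.08, 12: 0.07, 13: 0.085, 14: 0.075,
}

# Rank feature numbers by score, highest first, and keep the top four.
top4 = sorted(scores, key=lambda j: scores[j], reverse=True)[:4]
selected = sorted(top4)
print(selected)
```

The selected feature numbers match those named in the text (1, 4, 8 and 13).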
Step 4 (S104): the cluster division results and the feature labels are used to build a comprehensive portrait of the user's electricity consumption behavior and present it visually. The specific steps may be as follows:
Step 4.1: the temporal characteristics of the user's electricity consumption behavior are analyzed statistically and visualized as a histogram.
Step 4.2: the feature labels of the different electricity consumption patterns are analyzed with a scoring method and visualized as a radar chart.
Furthermore, the invention adopts a scoring system with a full score of 5 points; because the features have different units, they must first be brought to a common scale. The electricity consumption characteristics of each pattern are represented by its label scores, and the score of each label for each class of electricity consumption pattern is obtained from the following formula:
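The statistical analysis in step 4.1 can be sketched as a count of how often each cluster label occurs on each day of the week; the resulting counts are what a histogram would display. The dates, labels and function name below are hypothetical and serve only to illustrate the counting.

```python
from collections import Counter
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def weekday_counts(days, labels):
    """Count occurrences of each cluster label per day of the week."""
    counts = {}
    for day, label in zip(days, labels):
        counts.setdefault(label, Counter())[WEEKDAYS[day.weekday()]] += 1
    return counts


# Hypothetical week of daily load curves, each already assigned to a cluster.
days = [date(2021, 9, 20) + timedelta(days=i) for i in range(7)]
labels = ["A", "A", "B", "A", "A", "C", "C"]
counts = weekday_counts(days, labels)
```

Each inner `Counter` holds the bar heights for one electricity consumption pattern.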
Figure BDA0003276863040000122
where $G_{i,j}$ is the score of the jth label for class-i users, $\bar{g}_{i,j}$ is the mean of the jth label over all users in class i, and $g_{j\min}$ and $g_{j\max}$ are the minimum and maximum values of the jth label, respectively.
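The scoring formula itself is not reproduced in this text, so the following is only a sketch under a stated assumption: that the score is a min-max normalization of the class mean, scaled to the 5-point full score mentioned above. The function name, parameters and sample values are hypothetical.

```python
def label_score(class_mean, g_min, g_max, full_score=5.0):
    """Assumed form: full_score * (class_mean - g_min) / (g_max - g_min).

    class_mean: mean of the label over all users in one class.
    g_min, g_max: minimum and maximum values of that label.
    """
    if g_max == g_min:
        # Degenerate label with no spread; return 0 by convention (assumption).
        return 0.0
    return full_score * (class_mean - g_min) / (g_max - g_min)


# Hypothetical label values in arbitrary units.
mid_score = label_score(30.0, 10.0, 50.0)
top_score = label_score(50.0, 10.0, 50.0)
low_score = label_score(10.0, 10.0, 50.0)
```

Under this assumption every label lands on the same 0 to 5 scale regardless of its original unit, which is what a radar chart needs.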
In one example, the behavior portrait process is as follows:
(1) The electricity consumption patterns obtained by cluster analysis are statistically analyzed by day of the week, as shown in fig. 5.
(2) Feature selection is performed on the electricity consumption patterns obtained by cluster analysis.
The electricity consumption characteristics of each pattern are represented by its label scores; the score of each label for each class of electricity consumption pattern is given by:
Figure BDA0003276863040000131
where $G_{i,j}$ is the score of the jth label for class-i users, $\bar{g}_{i,j}$ is the mean of the jth label over all users in class i, and $g_{j\min}$ and $g_{j\max}$ are the minimum and maximum values of the jth label, respectively.
The scored results are displayed visually to obtain the user electricity consumption behavior portrait, as shown in fig. 6. Typical user electricity consumption behavior portraits are summarized in table 3.
TABLE 3
| Electricity consumption behavior portrait | Class A | Class B | Class C |
|---|---|---|---|
| Daily electricity consumption | Medium | Low | High |
| Daily peak-valley difference rate | High | Low | High |
| Daily average load | Low | Low | High |
| Peak-hour electricity consumption rate | High | Low | Low |
In summary, the embodiment of the present invention provides a user electricity consumption behavior portrait method based on electricity consumption characteristic analysis. The method clusters user electricity consumption behavior with the k-shape algorithm to improve clustering performance; introduces a random forest algorithm to select user electricity consumption features, screening from the original feature set the subset with the most significant influence and establishing a multi-dimensional user electricity consumption feature label system; and presents the user behavior portrait through histograms and radar charts based on the clustering results and the feature labels. The embodiment of the invention can thus effectively cluster and extract users' electricity consumption characteristics and construct their electricity consumption behavior portraits.
Apparatus embodiment one
According to an embodiment of the present invention, a user electricity consumption behavior portrait apparatus based on electricity consumption characteristic analysis is provided. Fig. 7 is a schematic diagram of the apparatus. As shown in fig. 7, the apparatus comprises a preprocessing module, a clustering module, a feature extraction module and a portrait module, which together perform electricity consumption characteristic analysis and behavior portrait construction for users. Specifically, the apparatus comprises:
a preprocessing module 70, configured to preprocess the collected load data;
the clustering module 72 is used for performing clustering analysis on the preprocessed load data by adopting a k-shape clustering algorithm and performing clustering division on the power utilization behaviors of the users;
the feature extraction module 74 is configured to extract features of the user power consumption behavior in a random forest manner, and determine a feature label of the user power consumption behavior; the feature extraction module 74 is specifically configured to:
setting an initial feature set, wherein the electricity consumption features of the initial feature set specifically comprise: daily electricity consumption, daily maximum load, daily minimum load, daily average load, daily peak-valley difference, daily load rate, valley power coefficient, daily peak-valley difference rate, daytime electricity consumption characteristic, nighttime electricity consumption characteristic, early peak characteristic, late peak characteristic, peak-hour electricity consumption rate, and percentage of electricity consumption in ordinary periods;
extracting the features of the user's electricity consumption behavior by a random forest method, wherein the contribution of each feature on each tree in the random forest is evaluated and averaged, and the contributions of the features are compared using the Gini index: according to formulas 3-6, assuming that there are C features $X_1, X_2, X_3, \ldots, X_C$, the Gini index score of each feature $X_j$ is
$$VIM_j^{(Gini)}$$
that is, the average change in node-split impurity of the jth feature over all decision trees:
$$GI_m = \sum_{k=1}^{K} \sum_{k' \neq k} p_{mk}\, p_{mk'} = 1 - \sum_{k=1}^{K} p_{mk}^{2}$$
wherein GI denotes the Gini index, K denotes the number of classes, $p_{mk}$ denotes the proportion of class k in node m, and $p_{mk'}$ denotes the proportion of class k' in node m, with k' ranging over all classes other than k; $GI_m$ is thus the probability that two samples drawn at random from node m have different class labels;
$$VIM_{jm}^{(Gini)} = GI_m - GI_l - GI_r$$
wherein $VIM_{jm}^{(Gini)}$ denotes the importance of feature $X_j$ at node m, i.e., the change in the Gini index before and after node m branches, and $GI_l$ and $GI_r$ respectively denote the Gini indices of the two new nodes after branching;
if the nodes at which feature $X_j$ appears in decision tree i form the set M, then the importance of $X_j$ in the ith tree is:
$$VIM_{ij}^{(Gini)} = \sum_{m \in M} VIM_{jm}^{(Gini)}$$
assuming a total of n trees, then:
$$VIM_j^{(Gini)} = \sum_{i=1}^{n} VIM_{ij}^{(Gini)}$$
all the calculated importance scores are normalized according to formula 6:
$$VIM_j = \frac{VIM_j^{(Gini)}}{\sum_{i=1}^{C} VIM_i^{(Gini)}}$$
wherein C represents the number of features.
a power consumption behavior portrait module 76, configured to obtain the user electricity consumption behavior portrait according to the cluster division of the user's electricity consumption behavior and the feature labels. The power consumption behavior portrait module 76 is specifically configured to:
analyze the temporal characteristics of the user's electricity consumption behavior by a statistical method, and visualize them as a histogram;
analyze the feature labels of the different electricity consumption behaviors by a scoring method, and visualize them as a radar chart;
the clustering module 72 is specifically configured to:
perform similarity calculation according to formula 1:
$$SBD(x, y) = 1 - \max_{w} \frac{CC_w(x, y)}{\sqrt{R_0(x, x)\, R_0(y, y)}}$$
wherein the similarity measure SBD(x, y) takes values in [0, 2], with 0 indicating that the two sequences are most similar; x and y denote the two sample sequences; $\max_w$ denotes taking the maximum of the normalized cross-correlation over the shift w; $CC_w(x, y)$ denotes the cross-correlation sequence of x and y; and $R_0(x, x)$ and $R_0(y, y)$ denote the autocorrelation of x and of y at zero shift;
cluster with the k-shape clustering algorithm according to the obtained similarity, and determine the number of clusters k by the inflection point (elbow) method, based on minimizing the within-cluster sum of squared deviations according to formula 2:
$$SS = \sum_{i=1}^{n} (x_i - \bar{x})^{2}$$
wherein SS denotes the sum of squared deviations, reflecting the degree to which the variable deviates from its mathematical expectation, i indexes the points in the cluster, n denotes the number of points in the cluster, $x_i$ denotes the ith point in the cluster, and $\bar{x}$ denotes the mean of all points in the cluster;
perform cluster analysis of the user electricity consumption curves with the k-shape algorithm according to the number of clusters k: first randomly assign the time series to the clusters, then compute the centroids, refine the memberships by computing the distance between each member and each centroid, reassign the members to clusters and update the centroids, and iterate until the algorithm converges or the maximum number of iterations is reached;
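The two quantities used by the clustering module can be sketched in plain Python. The sketch assumes the standard shape-based distance of the k-shape algorithm (one minus the maximum normalized cross-correlation over all shifts) and, for simplicity, treats cluster points as scalars in the sum of squared deviations. Function names and sample curves are hypothetical, and centroid extraction and the full k-shape iteration are not shown.

```python
import math


def cross_correlation(x, y, w):
    """CC_w(x, y): sum of x[l + w] * y[l] over the indices where both exist."""
    n = len(x)
    return sum(x[l + w] * y[l] for l in range(n) if 0 <= l + w < n)


def sbd(x, y):
    """Shape-based distance: 1 - max_w CC_w(x, y) / sqrt(R_0(x, x) * R_0(y, y))."""
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    n = len(x)
    # R_0(x, x) and R_0(y, y) are the zero-shift autocorrelations (sums of squares).
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if norm == 0:
        raise ValueError("sequences must not be all zeros")
    best = max(cross_correlation(x, y, w) for w in range(-(n - 1), n))
    return 1.0 - best / norm


def within_cluster_ss(points):
    """Sum of squared deviations of scalar points from their mean."""
    mean = sum(points) / len(points)
    return sum((p - mean) ** 2 for p in points)


# Hypothetical daily load curve compared with itself and with its negation.
curve = [1.0, 2.0, 3.0, 2.0, 1.0]
same = sbd(curve, curve)
opposite = sbd(curve, [-v for v in curve])
ss = within_cluster_ss([1.0, 2.0, 3.0])
```

For the elbow method, the within-cluster sum of squares would be computed for several candidate values of k and the value at the bend of the resulting curve chosen.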
the power consumption behavior representation module 76 is specifically configured to:
the power utilization characteristics of different power utilization behaviors are represented through the scores of the tags, and the score of each tag of each type of power utilization mode is calculated according to a formula 8:
Figure BDA0003276863040000161
wherein $G_{i,j}$ is the score of the jth label for class-i users, $\bar{g}_{i,j}$ is the mean of the jth label over all users in class i, and $g_{j\min}$ and $g_{j\max}$ are the minimum and maximum values of the jth label, respectively.
The embodiment of the present invention is an apparatus embodiment corresponding to the above method embodiment, and specific operations of each module may be understood with reference to the description of the method embodiment, which is not described herein again.
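The division into four modules can be pictured as a simple pipeline. The sketch below is purely illustrative: the class name, the callable signatures, and the choice to pass both the preprocessed data and the cluster labels to the feature extraction step are assumptions, not details taken from the embodiment.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class PortraitApparatus:
    """Hypothetical wiring of modules 70, 72, 74 and 76."""
    preprocess: Callable[[Any], Any]             # preprocessing module 70
    cluster: Callable[[Any], Any]                # clustering module 72
    extract_features: Callable[[Any, Any], Any]  # feature extraction module 74
    build_portrait: Callable[[Any, Any], Any]    # portrait module 76

    def run(self, raw_load):
        clean = self.preprocess(raw_load)
        clusters = self.cluster(clean)
        labels = self.extract_features(clean, clusters)
        return self.build_portrait(clusters, labels)


# Stub callables standing in for the real modules.
apparatus = PortraitApparatus(
    preprocess=lambda raw: [v for v in raw if v is not None],
    cluster=lambda clean: ["A"] * len(clean),
    extract_features=lambda clean, clusters: ["daily average load"],
    build_portrait=lambda clusters, labels: {"clusters": clusters, "labels": labels},
)
portrait = apparatus.run([1.0, None, 2.0])
```

Keeping each module behind a callable mirrors the statement that the units may be implemented in the same or in separate software and hardware.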
Apparatus embodiment two
An embodiment of the present invention provides a user power consumption behavior representation apparatus based on power consumption characteristic analysis, as shown in fig. 8, including: a memory 80, a processor 82 and a computer program stored on the memory 80 and executable on the processor 82, which computer program when executed by the processor 82 performs the steps as described in the method embodiments.
Apparatus embodiment three
An embodiment of the present invention provides a computer-readable storage medium, on which an implementation program for information transmission is stored, and when being executed by a processor 82, the program implements the steps described in the method embodiment.
The computer-readable storage medium of this embodiment includes, but is not limited to: ROM, RAM, magnetic or optical disks, and the like.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
In the 1990s, an improvement to a technology could be clearly distinguished as either an improvement in hardware (e.g., an improvement to circuit structures such as diodes, transistors and switches) or an improvement in software (an improvement to a method flow). However, as technology has advanced, many of today's method-flow improvements can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. It therefore cannot be said that an improvement to a method flow cannot be realized by hardware physical modules. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user's programming of the device. A designer programs a digital system onto a single PLD without asking a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually fabricating integrated circuit chips, this programming is nowadays mostly done with "logic compiler" software, which is similar to the software compilers used in program development; the source code to be compiled must likewise be written in a specific programming language, called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); the most commonly used at present are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog.
It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can readily be obtained simply by describing the method flow in one of the hardware description languages above, with a little logic programming, and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer-readable program code, the same functionality can be achieved by logically programming the method steps so that the controller takes the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller may thus be regarded as a hardware component, and the means included in it for performing the various functions may be regarded as structures within the hardware component. The means for performing the various functions may even be regarded both as software modules implementing the method and as structures within the hardware component.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functions of the units may be implemented in the same software and/or hardware or in multiple software and/or hardware when implementing the embodiments of the present description.
One skilled in the art will recognize that one or more embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, one or more embodiments of the present description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The description has been presented with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the description. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
One or more embodiments of the present description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. One or more embodiments of the specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only an example of this document and is not intended to limit this document. Various modifications and changes may occur to those skilled in the art from this document. Any modifications, equivalents, improvements, etc. which come within the spirit and principle of the disclosure are intended to be included within the scope of the claims of this document.

Claims (10)

1. A user electricity consumption behavior portrait method based on electricity consumption characteristic analysis is characterized by comprising the following steps:
preprocessing the collected load data;
performing clustering analysis on the preprocessed load data by adopting a k-shape clustering algorithm, and performing clustering division on the power utilization behaviors of the users;
extracting the characteristics of the power utilization behaviors of the users in a random forest mode to determine the characteristic labels of the power utilization behaviors of the users;
and acquiring the user electricity consumption behavior portrait according to the cluster of the user electricity consumption behavior and the feature tag.
2. The method according to claim 1, wherein the obtaining the representation of the power consumption behavior of the user according to the cluster of the power consumption behavior of the user and the feature tag specifically comprises:
analyzing the time characteristic of the user electricity consumption behavior by adopting a statistical method, and performing visual expression on the time characteristic of the user electricity consumption behavior through a histogram;
and analyzing the feature labels of different power consumption behaviors by adopting a scoring method, and performing visual expression on the feature labels through a radar chart.
3. The method of claim 1, wherein the pre-processed load data is subjected to clustering analysis by using a k-shape clustering algorithm, and the clustering division of the user electricity utilization behavior specifically comprises:
similarity calculation is performed according to formula 1:
$$SBD(x, y) = 1 - \max_{w} \frac{CC_w(x, y)}{\sqrt{R_0(x, x)\, R_0(y, y)}}$$
wherein the similarity measure SBD(x, y) takes values in [0, 2], with 0 indicating that the two sequences are most similar; x and y denote the two sample sequences; $\max_w$ denotes taking the maximum of the normalized cross-correlation over the shift w; $CC_w(x, y)$ denotes the cross-correlation sequence of x and y; and $R_0(x, x)$ and $R_0(y, y)$ denote the autocorrelation of x and of y at zero shift;
according to the obtained similarity, clustering is performed with the k-shape clustering algorithm, and the number of clusters k is determined by the inflection point (elbow) method, based on minimizing the within-cluster sum of squared deviations according to formula 2:
$$SS = \sum_{i=1}^{n} (x_i - \bar{x})^{2}$$
wherein SS denotes the sum of squared deviations, reflecting the degree to which the variable deviates from its mathematical expectation, i indexes the points in the cluster, n denotes the number of points in the cluster, $x_i$ denotes the ith point in the cluster, and $\bar{x}$ denotes the mean of all points in the cluster;
according to the number of clusters k, cluster analysis of the user electricity consumption curves is performed with the k-shape algorithm: the time series are first randomly assigned to the clusters, the centroids are then computed, the memberships are refined by computing the distance between each member and each centroid, the members are reassigned to clusters and the centroids updated, and the iteration is repeated until the algorithm converges or the maximum number of iterations is reached.
4. The method as claimed in claim 1, wherein the determining the feature label of the user power consumption behavior by extracting the feature of the user power consumption behavior in a random forest manner specifically comprises:
setting an initial feature set, wherein the electricity consumption features of the initial feature set specifically comprise: daily electricity consumption, daily maximum load, daily minimum load, daily average load, daily peak-valley difference, daily load rate, valley power coefficient, daily peak-valley difference rate, daytime electricity consumption characteristic, nighttime electricity consumption characteristic, early peak characteristic, late peak characteristic, peak-hour electricity consumption rate, and percentage of electricity consumption in ordinary periods;
extracting the features of the user's electricity consumption behavior by a random forest method, wherein the contribution of each feature on each tree in the random forest is evaluated and averaged, and the contributions of the features are compared using the Gini index: according to formulas 3-6, assuming that there are C features $X_1, X_2, X_3, \ldots, X_C$, the Gini index score of each feature $X_j$ is
$$VIM_j^{(Gini)}$$
that is, the average change in node-split impurity of the jth feature over all decision trees:
$$GI_m = \sum_{k=1}^{K} \sum_{k' \neq k} p_{mk}\, p_{mk'} = 1 - \sum_{k=1}^{K} p_{mk}^{2}$$
wherein GI denotes the Gini index, K denotes the number of classes, $p_{mk}$ denotes the proportion of class k in node m, and $p_{mk'}$ denotes the proportion of class k' in node m, with k' ranging over all classes other than k; $GI_m$ is thus the probability that two samples drawn at random from node m have different class labels;
$$VIM_{jm}^{(Gini)} = GI_m - GI_l - GI_r$$
wherein $VIM_{jm}^{(Gini)}$ denotes the importance of feature $X_j$ at node m, i.e., the change in the Gini index before and after node m branches, and $GI_l$ and $GI_r$ respectively denote the Gini indices of the two new nodes after branching;
if the nodes at which feature $X_j$ appears in decision tree i form the set M, then the importance of $X_j$ in the ith tree is:
$$VIM_{ij}^{(Gini)} = \sum_{m \in M} VIM_{jm}^{(Gini)}$$
assuming a total of n trees, then:
$$VIM_j^{(Gini)} = \sum_{i=1}^{n} VIM_{ij}^{(Gini)}$$
all the calculated importance scores are normalized according to formula 6:
$$VIM_j = \frac{VIM_j^{(Gini)}}{\sum_{i=1}^{C} VIM_i^{(Gini)}}$$
wherein C represents the number of features.
5. The method according to claim 2, wherein analyzing the signature labels of different electricity usage behaviors by a scoring method specifically comprises:
the power utilization characteristics of different power utilization behaviors are represented through the scores of the tags, and the score of each tag of each type of power utilization mode is calculated according to a formula 8:
Figure FDA0003276863030000036
wherein $G_{i,j}$ is the score of the jth label for class-i users, $\bar{g}_{i,j}$ is the mean of the jth label over all users in class i, and $g_{j\min}$ and $g_{j\max}$ are the minimum and maximum values of the jth label, respectively.
6. A user electricity consumption behavior portrait apparatus based on electricity consumption characteristic analysis, characterized by comprising:
the preprocessing module is used for preprocessing the collected load data;
the clustering module is used for carrying out clustering analysis on the preprocessed load data by adopting a k-shape clustering algorithm and carrying out clustering division on the power utilization behaviors of the users;
the characteristic extraction module is used for extracting the characteristics of the power utilization behaviors of the users in a random forest mode and determining the characteristic labels of the power utilization behaviors of the users;
and the power consumption behavior portrait module is used for acquiring the user power consumption behavior portrait according to the cluster of the user power consumption behaviors and the feature tag.
7. The apparatus of claim 6,
the power consumption behavior portrait module is specifically used for:
analyzing the time characteristic of the user electricity consumption behavior by adopting a statistical method, and performing visual expression on the time characteristic of the user electricity consumption behavior through a histogram;
analyzing feature labels of different power consumption behaviors by adopting a grading method, and performing visual expression on the feature labels through a radar map;
the clustering module is specifically configured to:
perform similarity calculation according to formula 1:
$$SBD(x, y) = 1 - \max_{w} \frac{CC_w(x, y)}{\sqrt{R_0(x, x)\, R_0(y, y)}}$$
wherein the similarity measure SBD(x, y) takes values in [0, 2], with 0 indicating that the two sequences are most similar; x and y denote the two sample sequences; $\max_w$ denotes taking the maximum of the normalized cross-correlation over the shift w; $CC_w(x, y)$ denotes the cross-correlation sequence of x and y; and $R_0(x, x)$ and $R_0(y, y)$ denote the autocorrelation of x and of y at zero shift;
cluster with the k-shape clustering algorithm according to the obtained similarity, and determine the number of clusters k by the inflection point (elbow) method, based on minimizing the within-cluster sum of squared deviations according to formula 2:
$$SS = \sum_{i=1}^{n} (x_i - \bar{x})^{2}$$
wherein SS denotes the sum of squared deviations, reflecting the degree to which the variable deviates from its mathematical expectation, i indexes the points in the cluster, n denotes the number of points in the cluster, $x_i$ denotes the ith point in the cluster, and $\bar{x}$ denotes the mean of all points in the cluster;
perform cluster analysis of the user electricity consumption curves with the k-shape algorithm according to the number of clusters k: first randomly assign the time series to the clusters, then compute the centroids, refine the memberships by computing the distance between each member and each centroid, reassign the members to clusters and update the centroids, and iterate until the algorithm converges or the maximum number of iterations is reached;
the feature extraction module is specifically configured to:
setting an initial feature set, wherein the electricity utilization features representing the initial feature set specifically comprise: daily electricity consumption, daily maximum load, minimum load, daily average load, daily peak-valley difference, daily load rate, valley power coefficient, daily peak-valley difference rate, day-to-day working time characteristic, day-and-night electricity consumption characteristic, early peak characteristic, late peak characteristic, peak-hour electricity consumption rate, and percentage of electricity consumption in ordinary periods;
extracting the features of the user's electricity consumption behavior with a random forest method, evaluating the contribution of each feature on each tree in the random forest, averaging the contributions, and comparing the contributions of the features by means of the Gini index: according to formulas 3-6, assuming that there are C features $X_1, X_2, X_3, \ldots, X_C$, calculating for each feature $X_j$ the Gini index score $VIM_j^{(Gini)}$, that is, the average change in node-split impurity caused by the j-th feature over all decision trees:
$$GI_m = \sum_{k=1}^{K} \sum_{k' \neq k} p_{mk} \, p_{mk'} = 1 - \sum_{k=1}^{K} p_{mk}^2$$
wherein GI represents the Gini index, i.e. the probability that two samples drawn arbitrarily from node m carry different class labels, K represents the number of classes, $p_{mk}$ represents the proportion of class k in node m, and $p_{mk'}$ represents the proportion of class k' in node m, k' taking all values not equal to k;
$$VIM_{jm}^{(Gini)} = GI_m - GI_l - GI_r$$
wherein $VIM_{jm}^{(Gini)}$ represents the importance of feature $X_j$ at node m, i.e. the change in the Gini index before and after node m branches, and $GI_l$ and $GI_r$ respectively represent the Gini indexes of the two new nodes after branching;
if the nodes at which feature $X_j$ appears in decision tree i form the set M, then the importance of $X_j$ in the i-th tree is:
$$VIM_{ij}^{(Gini)} = \sum_{m \in M} VIM_{jm}^{(Gini)}$$
assuming a total of n trees, then:
$$VIM_j^{(Gini)} = \sum_{i=1}^{n} VIM_{ij}^{(Gini)}$$
all the calculated importance scores are normalized according to formula 6:
$$VIM_j = \frac{VIM_j^{(Gini)}}{\sum_{i=1}^{C} VIM_i^{(Gini)}}$$
wherein C represents the number of features.
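As a non-limiting illustration of the Gini-based importance described in this claim, the following Python sketch computes the Gini index of a node from its class counts, sums the change in Gini index at every node where a feature is used for a split, accumulates over all trees, and normalizes the result. Representing each tree as a plain list of splits, the type alias `Split`, the function names, the feature names, and the class counts are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# One split: (feature name, class counts at node m, at left child, at right child).
Split = Tuple[str, Sequence[int], Sequence[int], Sequence[int]]


def gini(class_counts: Sequence[int]) -> float:
    """Gini index of a node: 1 - sum_k p_k^2, with p_k the share of class k."""
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in class_counts)


def gini_importance(forest: List[List[Split]]) -> Dict[str, float]:
    """Sum GI_m - GI_l - GI_r per feature over all trees, then normalize."""
    raw: Dict[str, float] = defaultdict(float)
    for tree in forest:
        for feature, parent, left, right in tree:
            raw[feature] += gini(parent) - gini(left) - gini(right)
    total = sum(raw.values())
    if total == 0:
        return dict(raw)
    return {feature: score / total for feature, score in raw.items()}


# Hypothetical two-tree forest with one split per tree.
forest = [
    [("daily_peak_valley_difference", [5, 5], [5, 0], [0, 5])],
    [("daily_load_rate", [4, 4], [4, 1], [0, 3])],
]
print(gini_importance(forest))
```

The sketch follows the unweighted difference stated in the claim. Library implementations such as scikit-learn weight each node's decrease by the share of samples reaching that node, so their scores will generally differ from this form.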
8. The apparatus of claim 7, wherein the representation module is specifically configured to:
representing the electricity consumption characteristics of different electricity consumption behaviors through label scores, the score of each label for each type of electricity consumption pattern being calculated according to formula 8:
[Formula 8: equation image, not reproduced in the text]
wherein $G_{i,j}$ is the score of the j-th label for class-i users, $\bar{g}_{i,j}$ is the average of the j-th label over all users in class i, and $g_{jmin}$ and $g_{jmax}$ are respectively the minimum and maximum values of the j-th label.
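Formula 8 survives here only as an image reference, so its exact form cannot be confirmed from the text. Given the quantities the claim defines (a class average, a label minimum, and a label maximum), one plausible reading is min-max scaling of the class average. The Python sketch below illustrates that reading only; the scaling itself, the function name `class_label_scores`, taking the minimum and maximum over all users, and the sample values are assumptions, not the claimed formula.

```python
from typing import List, Sequence


def class_label_scores(class_members: Sequence[Sequence[float]],
                       all_users: Sequence[Sequence[float]]) -> List[float]:
    """Score each label of one user class by min-max scaling its class average.

    Assumed form: (class average - label minimum) / (label maximum - label minimum),
    with minimum and maximum taken over all users.
    """
    scores = []
    for j in range(len(all_users[0])):
        column = [user[j] for user in all_users]
        g_min, g_max = min(column), max(column)
        class_mean = sum(user[j] for user in class_members) / len(class_members)
        if g_max == g_min:  # label is constant, so it cannot discriminate
            scores.append(0.0)
        else:
            scores.append((class_mean - g_min) / (g_max - g_min))
    return scores


# Hypothetical data: three users, two labels each; the class holds the first two.
all_users = [[0.0, 10.0], [10.0, 20.0], [5.0, 30.0]]
print(class_label_scores(all_users[:2], all_users))
```

Under this reading each score lies between 0 and 1, which makes labels with different units comparable within one portrait.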
9. A user electricity consumption behavior portrait device based on electricity consumption characteristic analysis, characterized by comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the user electricity consumption behavior portrait method based on electricity consumption characteristic analysis according to any one of claims 1 to 5.
10. A computer-readable storage medium, wherein the computer-readable storage medium stores an implementation program for information transfer, and the program, when executed by a processor, implements the steps of the user electricity consumption behavior portrait method based on electricity consumption characteristic analysis according to any one of claims 1 to 5.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111120375.8A CN113837274A (en) 2021-09-24 2021-09-24 User electricity consumption behavior portrait method and device based on electricity consumption characteristic analysis

Publications (1)

Publication Number Publication Date
CN113837274A true CN113837274A (en) 2021-12-24

Family

ID=78969833

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111120375.8A Pending CN113837274A (en) 2021-09-24 2021-09-24 User electricity consumption behavior portrait method and device based on electricity consumption characteristic analysis

Country Status (1)

Country Link
CN (1) CN113837274A (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112437053A (en) * 2020-11-10 2021-03-02 国网北京市电力公司 Intrusion detection method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
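Xun Gangyi: "Short-term rolling load forecasting based on cluster analysis and random forest", Intelligent City, 30 September 2018 (2018-09-30), pages 9 - 11 *
Zhao Jinquan et al.: "Electricity consumption feature selection and behavior portrait of power users", Power System Technology, 30 September 2020 (2020-09-30), pages 3488 - 3496 *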

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114169523A (en) * 2022-02-10 2022-03-11 一道新能源科技(衢州)有限公司 Solar cell use data analysis method and system
CN115168437A (en) * 2022-09-06 2022-10-11 国网冀北综合能源服务有限公司 Method and system for realizing portrait of electricity user based on data analysis
CN115809406A (en) * 2023-02-03 2023-03-17 佰聆数据股份有限公司 Power consumer fine-grained classification method, device, equipment and storage medium
CN116662629A (en) * 2023-08-02 2023-08-29 杭州宇谷科技股份有限公司 Charging curve retrieval method, system, device and medium based on time sequence clustering
CN116662629B (en) * 2023-08-02 2024-05-28 杭州宇谷科技股份有限公司 Charging curve retrieval method, system, device and medium based on time sequence clustering

Similar Documents

Publication Publication Date Title
CN113837274A (en) User electricity consumption behavior portrait method and device based on electricity consumption characteristic analysis
US11043808B2 (en) Method for identifying pattern of load cycle
Liu et al. Shared-nearest-neighbor-based clustering by fast search and find of density peaks
Kumar et al. An efficient k-means clustering filtering algorithm using density based initial cluster centers
Ferreira et al. Time series clustering via community detection in networks
Aghabozorgi et al. Stock market co-movement assessment using a three-phase clustering method
Rajabi et al. A pattern recognition methodology for analyzing residential customers load data and targeting demand response applications
Chakraborty et al. Simultaneous variable weighting and determining the number of clusters—A weighted Gaussian means algorithm
CN101615248A (en) Age estimation method, equipment and face identification system
CN111144468A (en) Power consumer information labeling method and device, electronic equipment and storage medium
Chicco et al. Renyi entropy-based classification of daily electrical load patterns
Afzalan et al. An automated spectral clustering for multi-scale data
WO2017015672A1 (en) Topological data analysis for identification of market regimes for prediction
Yang et al. Portfolio optimization based on empirical mode decomposition
CN110728313A (en) Classification model training method and device for intention classification recognition
CN113837635A (en) Risk detection processing method, device and equipment
Chen et al. Deep subspace image clustering network with self-expression and self-supervision
Taghizadeh et al. How meaningful are similarities in deep trajectory representations?
Williams Clustering household electricity use profiles
US20230402846A1 (en) Data analysis system and method
Yuan et al. Irmac: Interpretable refined motifs in binary classification for smart grid applications
CN114118624A (en) Power demand response potential evaluation method, device, equipment and storage medium
Rodríguez-Gómez et al. A novel clustering based method for characterizing household electricity consumption profiles
Panapakidis et al. Deriving the optimal number of clusters in the electricity consumer segmentation procedure
Dahal Effect of different distance measures in result of cluster analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination