CN116304295A - User energy consumption portrait analysis method based on multivariate data driving - Google Patents
- Publication number
- CN116304295A (application CN202211630066.XA)
- Authority
- CN
- China
- Prior art keywords
- load curve
- user
- algorithm
- index
- indexes
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/906—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06393—Score-carding, benchmarking or key performance indicator [KPI] analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Human Resources & Organizations (AREA)
- Databases & Information Systems (AREA)
- General Physics & Mathematics (AREA)
- Economics (AREA)
- Strategic Management (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Entrepreneurship & Innovation (AREA)
- Tourism & Hospitality (AREA)
- General Health & Medical Sciences (AREA)
- General Business, Economics & Management (AREA)
- Marketing (AREA)
- Development Economics (AREA)
- Educational Administration (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Public Health (AREA)
- Water Supply & Treatment (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Primary Health Care (AREA)
- Molecular Biology (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Game Theory and Decision Science (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
A user energy consumption portrait analysis method based on multivariate data driving comprises the following steps: reducing the dimension of the load curve and extracting features with the time sequence symbol aggregation approximation (SAX) algorithm; converting the optimization of the SAX representation of the load curve into a multi-objective optimization problem solved with a simulated annealing particle swarm algorithm; clustering the load curves with an improved AP clustering algorithm according to the user energy consumption characteristic indexes; and analyzing the energy-use behavior of each class of users according to the clustering result. Based on the current energy-use state of the user side, a suitable portrait information acquisition algorithm and the improved AP clustering algorithm are used to mine effective information from energy-use data, a set of user energy consumption behavior portraits is constructed, and the set is applied to the multivariate energy-use behavior analysis of users, so that the energy-use characteristics of users are grasped.
Description
Technical Field
The invention relates to an analysis method, in particular to a user energy consumption portrait analysis method based on multivariate data driving and an application thereof.
Background
User-side resources are generally utilized in three modes: peak clipping, valley filling, and accurate real-time load control. These can defer investment in the power system, keep source, network and load in balance, promote the consumption of new energy, and guard against the risk of environmental accidents. Smart energy with big data technology at its core can better grasp user demand, distribute energy reasonably, guarantee the daily production and living needs of users, pay more attention to user experience, and realize complementary advantages among individual users. It builds a new model that turns energy data into public social value, adjusts energy supply and demand reasonably, and supports industrial upgrading and the improvement of people's livelihood.
On user-side demand response, many experts at home and abroad have carried out research and made important contributions to the user-side demand optimization problem, identifying problems such as users' weak willingness to participate, poor economics of user-side projects, and immature business models. For user behavior feature analysis, common data extraction methods include PCA-based transformation and the k-means algorithm. Experts at home and abroad have analyzed these techniques: PCA can handle large volumes of data, preserve the key information of the original data, reduce dimensionality and improve clustering quality; the k-means method is simple, clusters well and scales well. Current research mainly focuses on data analysis of customers' comprehensive energy-use behavior, but the research and development of data analysis models for end users' comprehensive energy-use behavior from the perspective of integration capability is still at an exploratory stage. The present invention is proposed to address the user-side problems of current integrated energy systems and to fill this research gap.
Disclosure of Invention
To overcome the defects of the prior art, the invention discloses a user energy consumption portrait analysis method based on multivariate data driving, with the following technical scheme:
A user energy consumption portrait analysis method based on multivariate data driving, characterized in that the method comprises the following steps:
step 1: reducing the dimension of the load curve and extracting features with the time sequence symbol aggregation approximation (SAX) algorithm;
step 2: converting the optimization of the SAX representation of the load curve into a multi-objective optimization problem based on a simulated annealing particle swarm algorithm;
step 3: clustering the load curves with an improved AP clustering algorithm according to the user energy consumption characteristic indexes;
step 4: analyzing the energy-use behavior of each class of users according to the clustering result.
The invention also discloses a non-volatile storage medium, characterized in that the non-volatile storage medium comprises a stored program which, when run, controls the device in which the non-volatile storage medium is located to execute the method described above.
The invention also discloses an electronic device, characterized by comprising a processor and a memory; the memory stores computer-readable instructions, and the processor is configured to run the computer-readable instructions, which execute the method described above when run.
The invention also discloses a user energy consumption portrait analysis device based on multivariate data driving, characterized in that the device comprises the following modules:
a dimension reduction and feature extraction module, used to reduce the dimension of the load curve and extract features with the time sequence symbol aggregation approximation (SAX) algorithm;
a simulated annealing particle swarm algorithm module, used to convert the optimization of the SAX representation of the load curve into a multi-objective optimization problem based on a simulated annealing particle swarm algorithm;
a cluster analysis module, used to cluster the load curves with an improved AP clustering algorithm according to the user energy consumption characteristic indexes;
an energy-use behavior analysis module, used to analyze the energy-use behavior of each class of users according to the clustering result.
Advantageous effects
Based on the current energy-use state of the user side, a suitable portrait information acquisition algorithm and an improved AP clustering algorithm are used to mine effective information from energy-use data. The information is applied to the multivariate energy-use behavior analysis of users, the energy-use characteristics of users are grasped, and a set of user energy consumption behavior portraits is constructed.
Drawings
FIG. 1 is a flow chart of an improved AP clustering algorithm of the present invention.
FIG. 2 is a graph of a user dataset cluster center of the present invention.
Detailed Description
Example 1
The invention discloses a user energy consumption portrait analysis method based on multivariate data driving, which comprises the following steps:
(1) Time sequence symbol aggregation approximation method based on particle swarm optimization
(1.1) principle of time-series symbol aggregation approximation algorithm
Time sequence symbol aggregation approximation (SAX) represents a continuous time series symbolically by converting it into a character string, and gives a good dimension reduction effect on high-dimensional sequences. The specific steps are as follows:
Step one: the n-dimensional time series is converted into a w-dimensional vector. The original load curve $X = [x_1, x_2, \dots, x_n]$ is approximated by piecewise aggregate approximation (PAA) as $w$ segments $\bar{X} = [\bar{x}_1, \bar{x}_2, \dots, \bar{x}_w]$, where the $i$-th element $\bar{x}_i$ is calculated as:
$$\bar{x}_i = \frac{w}{n}\sum_{j=\frac{n}{w}(i-1)+1}^{\frac{n}{w}i} x_j$$
That is, the n-dimensional original time series vector is divided into w segments to reduce it to w dimensions. Here $x_j$ is the $j$-th element of the original load curve column vector; $\bar{x}_i$ is the mean of the $i$-th segment; $n/w$ is the compression ratio.
Step two: each time series is normalized and converted into its piecewise aggregate approximation (PAA) representation, and the PAA sequence is then symbolized:
$$\hat{x}_i = \alpha_j \iff \beta_{j-1} \le \bar{x}_i < \beta_j$$
where $\hat{x}_i$ is the $i$-th element of the symbol sequence; $\alpha_j$ is the $j$-th element of the alphabet; $\beta_{j-1}$ and $\beta_j$ are the $(j-1)$-th and $j$-th values in the Gaussian distribution breakpoint list.
Step three: after the time series is reduced in dimension, queries in the feature space are prone to false dismissals (missed results). The following definitions are applied to guarantee that no results are missed. The n-dimensional time series $C$ and $Q$ are converted into w-dimensional vectors to obtain their PAA representations, and substituting the dimension reduction formula into the Euclidean distance gives the PAA distance measure:
$$D_{PAA}(\bar{Q}, \bar{C}) = \sqrt{\frac{n}{w}}\sqrt{\sum_{i=1}^{w}(\bar{q}_i - \bar{c}_i)^2}$$
where $\bar{Q}$ and $\bar{C}$ are the time series $Q$ and $C$ after dimension reduction, and $\bar{q}_i$ and $\bar{c}_i$ are the $i$-th elements of $\bar{Q}$ and $\bar{C}$ respectively. The data are further converted into the symbolic representations $\hat{Q}$ and $\hat{C}$, and a MINDIST function, which returns the minimum distance between the original time series of the two words, is defined as:
$$MINDIST(\hat{Q}, \hat{C}) = \sqrt{\frac{n}{w}}\sqrt{\sum_{i=1}^{w}\left(dist(\hat{q}_i, \hat{c}_i)\right)^2}$$
where $dist(\cdot,\cdot)$ is the distance between two symbols, obtained from the breakpoint list.
Step four: a further optimization direction is to improve the tightness of lower bound (TLB), expressed here as:
$$TLB = \frac{MINDIST(\hat{Q}, \hat{C})}{D(Q, C)}$$
where $D(Q, C)$ is the Euclidean distance between the time series $Q$ and $C$. The TLB takes a value between 0 and 1; the closer the value is to 1, the closer the lower bound distance is to the true distance measure, i.e., the smaller the error.
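The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library: the function names (`znorm`, `paa`, `symbolize`, `mindist`, `tlb`) are hypothetical, symbols are represented as integer indexes 0..l-1 rather than letters, the series length is assumed to be divisible by w, and the breakpoints are assumed to cut the standard normal distribution into l equiprobable regions, as in the usual SAX formulation.

```python
import math
from statistics import NormalDist, mean, pstdev

def znorm(x):
    """Z-normalize a series (assumes it is not constant)."""
    mu, sd = mean(x), pstdev(x)
    return [(v - mu) / sd for v in x]

def paa(x, w):
    """Step one: mean of each of w equal segments (assumes len(x) % w == 0)."""
    k = len(x) // w
    return [mean(x[i * k:(i + 1) * k]) for i in range(w)]

def breakpoints(l):
    """l - 1 breakpoints splitting N(0, 1) into l equiprobable regions."""
    return [NormalDist().inv_cdf(j / l) for j in range(1, l)]

def symbolize(p, l):
    """Step two: symbol index j such that beta_{j-1} <= value < beta_j."""
    bps = breakpoints(l)
    return [sum(v >= b for b in bps) for v in p]

def paa_dist(pq, pc, n):
    """Step three: PAA distance, a lower bound of the Euclidean distance."""
    w = len(pq)
    return math.sqrt(n / w) * math.sqrt(sum((a - b) ** 2 for a, b in zip(pq, pc)))

def mindist(sq, sc, n, l):
    """Step three: MINDIST between two symbol sequences."""
    bps = breakpoints(l)

    def cell(r, c):
        # Adjacent or equal symbols are at distance 0; otherwise use the
        # gap between the breakpoints that separate the two regions.
        if abs(r - c) <= 1:
            return 0.0
        return bps[max(r, c) - 1] - bps[min(r, c)]

    w = len(sq)
    return math.sqrt(n / w) * math.sqrt(sum(cell(a, b) ** 2 for a, b in zip(sq, sc)))

def tlb(q, c, w, l):
    """Step four: tightness of lower bound (assumes q and c differ)."""
    qn, cn = znorm(q), znorm(c)
    sq, sc = symbolize(paa(qn, w), l), symbolize(paa(cn, w), l)
    return mindist(sq, sc, len(q), l) / math.dist(qn, cn)
```

For two 24-point daily load curves, `tlb(q, c, w=6, l=4)` returns a value between 0 and 1; larger w and l generally tighten the bound at the cost of less compression, which is the trade-off the optimization in the next subsection addresses.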
(1.2) simulated annealing particle swarm algorithm
Particle swarm optimization is a population-based optimization algorithm with global search capability. It searches for the optimum iteratively: the system is initialized to a group of random solutions, and the particles (potential solutions) search the solution space for the optimum. However, particle swarm optimization can become trapped at local extreme points, so it suffers from slow convergence in the late stage of evolution and poor precision. To address these problems of traditional particle swarm optimization, a particle swarm algorithm based on simulated annealing is adopted. It retains the global search capability and simplicity of the traditional particle swarm algorithm while effectively avoiding entrapment at local extreme points.
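The text does not specify how simulated annealing is combined with the particle swarm, so the following is only one possible sketch under stated assumptions: a standard particle swarm in which the swarm's guiding position is updated by the Metropolis acceptance rule with a geometrically cooled temperature. The function name `sa_pso`, all parameter values, and the continuous toy objective are illustrative assumptions; in the method described here the decision variables would be the integer segment number w and character number l.

```python
import math
import random

def sa_pso(f, dim, lo, hi, n_particles=20, iters=200, seed=0,
           w=0.7, c1=1.5, c2=1.5, t0=1.0, cooling=0.95):
    """Minimize f over the box [lo, hi]^dim (hypothetical SA-PSO variant)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    best, best_val = pbest[g][:], pbest_val[g]   # best solution ever seen
    guide, guide_val = best[:], best_val         # annealed guide for the swarm
    temp = t0
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (guide[d] - pos[i][d]))
                # Keep the particle inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
            if val < best_val:
                best, best_val = pos[i][:], val
            # Metropolis rule: always accept an improvement, and accept a
            # worse guide with probability exp(-delta / temp).
            delta = val - guide_val
            if delta < 0 or rng.random() < math.exp(-delta / temp):
                guide, guide_val = pos[i][:], val
        temp *= cooling  # geometric cooling schedule (assumed)
    return best, best_val

def sphere(p):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in p)

best, best_val = sa_pso(sphere, dim=2, lo=-5.0, hi=5.0)
```

Early on, the high temperature lets the guide move to worse positions, which helps the swarm leave local extreme points; as the temperature falls the rule becomes greedy and the swarm behaves like an ordinary particle swarm.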
Based on a simulated annealing particle swarm algorithm, the optimization problem of the load curve time sequence symbol aggregation approximation (SAX) expression is converted into a multi-objective optimization problem, and the objective function is as follows:
wherein:
$$2 \le l \le l_m \quad (10)$$
$$2 \le w \le w_m \quad (11)$$
where $A$ is the accuracy, which represents how well the segmented load curve characterizes the original load curve; $E$ is the amount of information, measured by information entropy (the smaller the information entropy, the greater the accuracy when the existing signal is used for prediction, and the more information is contained); and $R$ is the reduction rate, which reflects the degree of compression of the original load curve. The accuracy is based on the correlation coefficient between the PAA piecewise approximation $\bar{X}_i$ of the load curve and the original load curve $X_i$. Because the two have different dimensions, $\bar{X}_i$ is spline-interpolated into a sequence with the same dimension as $X_i$ before the correlation coefficient is calculated. $p_i$ is the probability that character $i$ occurs in $X_i$; $l_m$ is the maximum number of characters and $w_m$ is the set maximum number of segments, with $l_m = w_m$ taken here; $\mu$ is the weight coefficient between the two parameters, taken here as $\mu = 0.5$.
The effect of the algorithm is evaluated through the three indexes $A$, $R$ and $E$; the optimal load curve representation is obtained when the combined effect is best.
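The three indexes can be illustrated with a small sketch. Several choices below are assumptions rather than the method's own definitions: the PAA means are expanded back to the original length by simple repetition instead of the spline interpolation described above, the entropy uses base-2 logarithms, and the reduction rate is taken as 1 - w/n. The function names are hypothetical, and the way the three indexes are combined into one objective is not shown.

```python
import math
from collections import Counter
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def accuracy(x, paa_means):
    """A: correlation between the original curve and its expanded PAA.

    Assumption: step expansion (each mean repeated n/w times) stands in
    for the spline interpolation described in the text.
    """
    k = len(x) // len(paa_means)
    expanded = [m for m in paa_means for _ in range(k)]
    return pearson(x, expanded)

def entropy(symbols):
    """E: Shannon entropy (base 2, assumed) of the symbol distribution."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())

def reduction_rate(n, w):
    """R: assumed here to be the fraction of dimensions removed."""
    return 1 - w / n
```

For example, a 24-point curve reduced to w = 3 segments gives `reduction_rate(24, 3) == 0.875`, while a symbol sequence that uses two characters equally often has an entropy of 1 bit.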
(2) AP clustering algorithm based on the optimized time sequence symbol aggregation approximation algorithm and user energy consumption characteristic indexes
(2.1) Description of energy consumption characteristic indexes
When processing user energy-use data, a suitable feature extraction technique ensures valid results and reduces the amount of calculation. In data mining, giving the acquired data a clear physical meaning helps power enterprises study and process the data, and early warning, abnormal data analysis, demand-side management and the like can be realized by analyzing energy consumption data. In addition, the key data features acquired from the demand side are combined with the discrete features and time-domain features obtained by the time sequence symbol aggregation approximation technique to reduce the dimension of the load curve, so that the underlying meaning of the load curve can be analyzed more efficiently and intuitively and the load curve can be evaluated more completely.
The user energy consumption characteristic indexes reflect the internal regularity of the load curve and allow useful information to be extracted quickly and efficiently from the high-dimensional load curve. Three typical categories of energy-use characteristic are introduced: energy-use load level, energy-use stability, and energy-use interaction capacity. Specific indexes, including daily average load, daily load rate, peak-period energy consumption rate and valley electricity coefficient, are selected as feature vectors for clustering the load curves. These indexes serve as the main data feature vector and, together with the discrete features from the optimized SAX, comprehensively reflect the time-domain and state characteristics of the load curve and form the basis for clustering. The selected indexes are shown in Table 1.
Table 1 Energy-use characteristic indexes of integrated energy system users
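As an illustration of how such indexes might be computed from an hourly load curve, the sketch below uses commonly seen definitions, which are assumptions here since the table's formulas are not reproduced: daily load rate as average load divided by maximum load, and the peak-period rate and valley coefficient as the shares of daily energy consumed in the peak and valley hours. The function name and the choice of peak and valley hours are hypothetical.

```python
def energy_indexes(load, peak_hours, valley_hours):
    """Compute example characteristic indexes from one daily load curve.

    load: hourly load values; peak_hours / valley_hours: iterables of
    hour indexes (assumed tariff periods, chosen by the caller).
    """
    total = sum(load)
    avg = total / len(load)
    return {
        "daily_average_load": avg,
        # Assumed definition: average load over maximum load.
        "daily_load_rate": avg / max(load),
        # Assumed definition: share of daily energy used in peak hours.
        "peak_energy_rate": sum(load[h] for h in peak_hours) / total,
        # Assumed definition: share of daily energy used in valley hours.
        "valley_coefficient": sum(load[h] for h in valley_hours) / total,
    }

# Hypothetical curve: low at night, medium by day, high in the evening.
load = [1.0] * 8 + [2.0] * 8 + [3.0] * 8
idx = energy_indexes(load, peak_hours=range(16, 24), valley_hours=range(0, 8))
```

The resulting dictionary can be flattened into the feature vector that is passed, together with the SAX discrete features, to the clustering step.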
(2.2) CRITIC weighting method
To avoid subjectivity in setting the user energy-use characteristic indexes, the CRITIC weighting method is used to evaluate the contribution of each characteristic index to the clustering result and to determine the index weights objectively. The basic idea is to measure the objective weight of an index from both the contrast intensity of the evaluation index and the conflict between indexes. The contrast intensity is expressed by the standard deviation and characterizes the variability of the evaluation index: the larger the standard deviation, the more information the index contains. The conflict reflects the correlation between different indexes: the larger the correlation coefficient of two indexes, the stronger their correlation and the lower the corresponding conflict.
The CRITIC weighting method comprises the following specific steps:
1) Index normalization. Let there be m evaluation objects and n evaluation indexes. Because different indexes act on the final evaluation result in different directions, they are normalized with the forward or reverse normalization method accordingly.
The forward index is normalized as shown in formula (12):
$$b_{ij} = \frac{a_{ij} - \min_i a_{ij}}{\max_i a_{ij} - \min_i a_{ij}} \quad (12)$$
The reverse index is normalized as shown in formula (13):
$$b_{ij} = \frac{\max_i a_{ij} - a_{ij}}{\max_i a_{ij} - \min_i a_{ij}} \quad (13)$$
where $i = 1, 2, \dots, m$; $j = 1, 2, \dots, n$; $a_{ij}$ is the actual value of the $j$-th index of the $i$-th user; $b_{ij}$ is the value of the $j$-th index of the $i$-th user after normalization.
2) Calculate the correlation coefficients of the evaluation index matrix. The correlation coefficient describes the conflict between indexes: the stronger the positive correlation between two indexes, the smaller the conflict and the lower the weight. The correlation coefficient is calculated as shown in formula (14):
$$r_{ij} = \frac{\sum_{k=1}^{m}(b_{ki} - \bar{b}_i)(b_{kj} - \bar{b}_j)}{\sqrt{\sum_{k=1}^{m}(b_{ki} - \bar{b}_i)^2}\sqrt{\sum_{k=1}^{m}(b_{kj} - \bar{b}_j)^2}} \quad (14)$$
where $i = 1, 2, \dots, n$; $j = 1, 2, \dots, n$; $r_{ij}$ is the correlation coefficient between the $i$-th index and the $j$-th index; $\bar{b}_j$ is the mean of the $j$-th normalized index over the m users.
3) Calculate the weights. The contrast intensity and the conflict of each evaluation index are calculated from the obtained correlation coefficient matrix, as shown in formula (15):
$$\sigma_j = \sqrt{\frac{1}{m}\sum_{i=1}^{m}(b_{ij} - \bar{b}_j)^2}, \qquad \sum_{i=1}^{n}(1 - r_{ij}) \quad (15)$$
where $j = 1, 2, \dots, n$; $\bar{b}_j$ is the mean of the $j$-th normalized index; $\sigma_j$ is the standard deviation of the $j$-th index and represents its contrast intensity; $\sum_{i=1}^{n}(1 - r_{ij})$ is the quantified conflict between the $j$-th index and the other indexes. Based on the contrast intensity and the conflict of the indexes, the amount of information contained in each index is calculated as shown in formula (16):
$$G_j = \sigma_j \sum_{i=1}^{n}(1 - r_{ij}) \quad (16)$$
where the larger the value of $G_j$, the more information the $j$-th index contains and the larger its weight.
The final objective weight $W_j$ of the $j$-th index is:
$$W_j = \frac{G_j}{\sum_{k=1}^{n} G_k} \quad (17)$$
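The three CRITIC steps can be sketched as follows. This is a minimal illustration under assumptions: min-max normalization is used for the forward and reverse cases, the standard deviation is the population form (divisor m), no index column is constant, and the function name `critic_weights` and its `reverse` argument are hypothetical.

```python
import math
from statistics import mean

def critic_weights(a, reverse=()):
    """Return CRITIC weights for an m x n matrix a (rows = users).

    reverse: column indexes treated as reverse (smaller-is-better) indexes.
    """
    m, n = len(a), len(a[0])
    cols = [[row[j] for row in a] for j in range(n)]
    # Step 1: forward / reverse min-max normalization, column by column.
    b = []
    for j, col in enumerate(cols):
        lo, hi = min(col), max(col)
        if j in reverse:
            b.append([(hi - v) / (hi - lo) for v in col])
        else:
            b.append([(v - lo) / (hi - lo) for v in col])
    means = [mean(c) for c in b]
    # Contrast intensity: standard deviation of each normalized index.
    sigma = [math.sqrt(sum((v - mu) ** 2 for v in c) / m)
             for c, mu in zip(b, means)]

    # Step 2: Pearson correlation between indexes i and j.
    def r(i, j):
        cov = sum((x - means[i]) * (y - means[j])
                  for x, y in zip(b[i], b[j])) / m
        return cov / (sigma[i] * sigma[j])

    # Step 3: information amount G_j and normalized weights W_j.
    g = [sigma[j] * sum(1 - r(i, j) for i in range(n)) for j in range(n)]
    total = sum(g)
    return [v / total for v in g]

# Hypothetical data: indexes 0 and 1 are identical, index 2 differs.
weights = critic_weights([[1, 1, 4], [2, 2, 1], [3, 3, 3], [4, 4, 2]])
```

In this toy matrix the two identical indexes carry the same information, so they share a lower weight (0.25 each), while the third index, which conflicts with both, receives 0.5.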
(2.3) improving the AP clustering Algorithm
The AP (affinity propagation) clustering algorithm has advantages such as not requiring the number of clusters to be specified in advance and yielding a low squared error in the clustering result, but its complexity is high, and processing multidimensional data often requires a long computation time. Therefore, the method reduces the dimension of the load curve by using its discrete state quantities and the energy-use characteristic indexes, which speeds up the calculation of the AP similarity matrix, and adjusts the bias parameter to improve clustering efficiency.
1) Improving similarity matrix
$$s(i,j) = -[\alpha d_{dij} + (1-\alpha) d_{tij}], \quad i \ne j \quad (18)$$
where $s(i,j)$ is an element of the improved similarity matrix; $d_{dij}$ and $d_{tij}$ are the distances between load curve $i$ and load curve $j$ in terms of the discrete state feature $d$ obtained by SAX and the energy-use characteristic index $t$ respectively, both measured by the Euclidean distance; $\alpha$ is the feature weight coefficient.
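Formula (18) translates directly into code. In this sketch the function name, the argument names and the default alpha = 0.5 are illustrative assumptions; each load curve is assumed to be described by one SAX feature vector and one index feature vector, and the diagonal is left at zero because the bias parameter s(i, i) is set separately.

```python
import math

def improved_similarity(sax_feats, index_feats, alpha=0.5):
    """Build the off-diagonal part of the improved similarity matrix.

    sax_feats[i]: discrete state features of curve i (from SAX);
    index_feats[i]: energy-use characteristic indexes of curve i.
    """
    n = len(sax_feats)
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d_d = math.dist(sax_feats[i], sax_feats[j])      # d_dij
                d_t = math.dist(index_feats[i], index_feats[j])  # d_tij
                s[i][j] = -(alpha * d_d + (1 - alpha) * d_t)
    return s

# Two hypothetical curves: SAX distance 5, index distance 2.
s = improved_similarity([[0, 0], [3, 4]], [[0], [2]], alpha=0.5)
```

Because both feature sets are low-dimensional, each pairwise distance is cheap to evaluate compared with a distance over the full load curve.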
2) Improving the bias parameter
The element $s(i,i)$ on the main diagonal of the similarity matrix is the bias parameter, and its value is related to the number of clusters obtained. Selecting a reasonable bias parameter value with a clustering evaluation index can effectively reduce the number of iterations of the algorithm and improve clustering precision.
Over multiple iterations of the AP clustering algorithm, the Davies-Bouldin (DB) clustering evaluation index is stable and varies within a small range. The DB index is therefore used as the criterion for bias parameter selection and convergence of the AP clustering algorithm, as shown in formula (20).
$$s(i,i) = p_m + \delta \, DB_{min} \quad (20)$$
where $p_m$ is the initial value, the median of all elements off the main diagonal; $DB_{min}$ is the minimum DB value obtained so far by the algorithm; $\delta$ is the search threshold, with $\delta > 0$ indicating a forward search and $\delta < 0$ a backward search. The DB index is calculated as shown in formula (21); the smaller its value, the lower the similarity between classes and the better the clustering effect.
$$DB = \frac{1}{n}\sum_{i=1}^{n}\max_{j \ne i}\frac{W_i + W_j}{C_{ij}} \quad (21)$$
where $n$ is the number of clusters; $W_i$ and $W_j$ are the average distances from the data points of classes $i$ and $j$ to their respective cluster centers; $C_{ij}$ is the distance between cluster centers $i$ and $j$.
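The DB index and the bias parameter update can be sketched as below. The sketch assumes the standard Davies-Bouldin definition with Euclidean distances, at least two non-empty clusters, and labels given as integers 0..n-1; the function names are hypothetical.

```python
import math
from statistics import median

def db_index(points, labels, centers):
    """Davies-Bouldin index: lower values mean better-separated clusters."""
    n = len(centers)
    # W_k: average distance from the members of class k to its center.
    w = []
    for k in range(n):
        members = [p for p, lab in zip(points, labels) if lab == k]
        w.append(sum(math.dist(p, centers[k]) for p in members) / len(members))
    total = 0.0
    for i in range(n):
        total += max((w[i] + w[j]) / math.dist(centers[i], centers[j])
                     for j in range(n) if j != i)
    return total / n

def preference(s, delta, db_min):
    """Bias parameter s(i, i) = p_m + delta * DB_min, as in formula (20)."""
    size = len(s)
    off_diag = [s[i][j] for i in range(size) for j in range(size) if i != j]
    return median(off_diag) + delta * db_min

# Hypothetical 1-D example with two tight, well-separated clusters.
points = [[0], [2], [10], [12]]
db = db_index(points, labels=[0, 0, 1, 1], centers=[[1], [11]])
```

In an iterative search, the clustering would be rerun with the updated bias parameter and the smallest DB value seen so far kept as `db_min`, stopping when the DB index no longer improves.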
The flow of the improved AP clustering algorithm is shown in fig. 1.
(3) Calculation case analysis
This section uses user data from an integrated energy system park, from which 2000 load curves are randomly selected; the initial energy-use characteristic index weights are set equal. After solving with the simulated-annealing-based particle swarm algorithm, the optimal number of segments w = 3 and the optimal number of characters l = 6 are obtained. Clustering with the improved AP clustering algorithm yields 4 classes of final cluster centers, as shown in FIG. 2:
As can be seen from FIG. 2, the load curves differ considerably, the energy use of the typical user classes varies clearly, and each cluster center represents the energy use of one class of users. Class A users have high energy use in the morning and evening with a clear drop at noon, and may be office workers; the energy use of class B and class C users rises after 7:00 and falls after 20:00, which matches the daily routine of most residents; class B users have an average energy-use level, slightly higher in the morning and evening, showing continuous energy use without obvious peak-valley characteristics; class C users have higher daytime energy use than class B users and show two peaks, at midday and in the evening, which is a bimodal load; class D users have a low energy-use level, largely due to equipment losses, and may be residents who use no electricity all day, such as vacant-home customers and business travelers. Based on the extracted load characteristics, user energy-use behavior can be analyzed in depth.
The energy-use level of class D users is too low, so they are not analyzed. The energy-use levels of class A, B and C users are evaluated separately. Class A users have a large daily peak-valley difference, need peak clipping and valley filling, and are a potential group for demand response. Class B and class C users have a high daily load rate and can serve as representatives of residential demand response; higher peak-period electricity prices can be set for class C users to guide them toward peak clipping and valley filling and to promote the optimal allocation of power resources. In addition, class B users have larger peak regulation capacity and more stable daily energy use, and can be scheduled together with class D users to fill load valleys.
The characteristic indexes of the cluster centers are shown in Table 2, and the corresponding initial weights and the updated final weights are shown in Table 3. To simplify the analysis, each cluster center is used as the representative load curve of its class. As shown in Table 3, the daily average load has the highest weight and is the main consideration in the analysis.
Table 2 Characteristic indexes of the cluster centers
Table 3 Initial weights and updated results
Meanwhile, based on the discrete state features of each representative load, the CRITIC weighting method can be used to analyze the energy consumption characteristics and, combined with a qualitative analysis of the energy consumption characteristic indexes, to further analyze the demand response potential of the users. According to formula (16) of the CRITIC weighting method, the larger the amount of information an index contains, the larger its weight. Conflict reflects the correlation between different indexes and is expressed through the correlation coefficient: the more strongly an index is correlated with the other indexes, the smaller its conflict with them and the more it repeats the same information and evaluation content, so its evaluation strength is weakened to some extent and the weight assigned to it is reduced. Therefore, users with a large amount of information can be considered suitable for price-type demand response, and users with a small amount of information suitable for incentive-type demand response. Assuming the correlation coefficient is unchanged, the larger the conflict, i.e., the standard deviation, the larger the amount of information contained. The index conflict calculated for each user class is shown in Table 4.
Class B users contain a larger amount of information and have an average overall energy consumption level; they are suitable as price-type demand response customers, for whom flexible electricity prices can be set to guide changes in energy consumption behavior. Class A and class C users have higher energy consumption and a smaller amount of information; they can serve as incentive-type demand response customers who, taking the satisfaction of different users into account, reduce their power demand when the system requires it or when power supply is tight.
Table 4 Index conflict of each user class
User class | Index conflict |
---|---|
A | 59.91 |
B | 89.81 |
C | 46.16 |
Example II
Based on the same inventive concept, the present application further provides a non-volatile storage medium that includes a stored program; when run, the program controls the device in which the non-volatile storage medium is located to execute the method of the first embodiment.
Example III
Based on the same inventive concept, the application also provides an electronic device comprising a processor and a memory; the memory stores computer readable instructions, the processor is configured to execute the computer readable instructions, and the computer readable instructions, when executed, perform the method of the first embodiment.
Example IV
Based on the same inventive concept, the application also provides a user energy consumption portrait analysis device based on multivariate data driving, which comprises the following modules:
a dimension reduction and feature extraction module, configured to reduce the dimension of the load curve and extract features using the time series symbolic aggregate approximation (SAX) algorithm;
a simulated annealing particle swarm algorithm module, configured to convert the optimization of the SAX representation of the load curve into a multi-objective optimization problem based on the simulated annealing particle swarm algorithm;
a cluster analysis module, configured to perform cluster analysis on the load curves using an improved AP clustering algorithm according to the user energy consumption characteristic indexes;
a user energy consumption analysis module, configured to analyze the energy consumption behavior of each class of users according to the clustering result.
In conclusion, the algorithm not only clusters load curves efficiently and accurately, but also extracts important features of the load curves, which facilitates the analysis of user behavior. An improved AP clustering algorithm based on SAX discrete state features and weighted energy consumption characteristic indexes is proposed, and the objective weights of the energy consumption characteristic indexes are determined by the CRITIC weighting method. The calculation example shows that the extracted features both ensure clustering accuracy and help analyze user energy consumption behavior. The method can be applied to various scenarios such as demand response.
The foregoing has shown and described the basic principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which, together with the description, merely illustrate the principles of the invention; various changes and modifications may be made without departing from the spirit and scope of the invention, which is defined by the appended claims and their equivalents.
Claims (10)
1. A user energy consumption portrait analysis method based on multivariate data driving, characterized in that the method comprises the following steps:
step 1: reducing the dimension of the load curve and extracting features using the time series symbolic aggregate approximation (SAX) algorithm;
step 2: converting the optimization of the SAX representation of the load curve into a multi-objective optimization problem based on a simulated annealing particle swarm algorithm;
step 3: performing cluster analysis on the load curves using an improved AP clustering algorithm according to the user energy consumption characteristic indexes;
step 4: analyzing the energy consumption behavior of each class of users according to the clustering result.
2. The user energy consumption portrait analysis method based on multivariate data driving as claimed in claim 1, wherein step 1 further comprises the following steps:
step one: converting the n-dimensional time series into a w-dimensional vector; using piecewise aggregate approximation (PAA), the original load curve $X = [x_1, x_2, \ldots, x_n]$ is approximated as w segments $\bar{X} = [\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_w]$, wherein the i-th element $\bar{x}_i$ is calculated as:
$$\bar{x}_i = \frac{w}{n} \sum_{j=\frac{n}{w}(i-1)+1}^{\frac{n}{w} i} x_j$$
that is, the n-dimensional original time series vector is divided into w segments to reduce it to w dimensions; $x_j$ is an element of the original load curve column vector; $\bar{x}_i$ is the mean of the i-th segment; $\frac{n}{w}$ is the compression ratio.
step two: symbolizing the sequence data obtained through piecewise aggregate approximation (PAA): each time series is normalized, the normalized time series is converted into its PAA representation, and each segment mean is then mapped to a character:
$$\hat{x}_i = \alpha_j, \quad \beta_{j-1} \le \bar{x}_i < \beta_j$$
wherein $\hat{x}_i$ is the character representing the i-th subsequence of the length-n series; $\alpha_j$ is the j-th element of the alphabet; $\beta_{j-1}$ and $\beta_j$ are the breakpoints corresponding to the (j-1)-th and j-th probability values in the Gaussian distribution breakpoint list;
step three: after the dimension of a time series is reduced, queries in the feature space are prone to false dismissals (missed matches); the following lower-bounding definition is applied to guarantee that no match is missed: the n-dimensional time series C and Q are converted into w-dimensional vectors in SAX to obtain their PAA representations $\bar{C}$ and $\bar{Q}$, and the dimension reduction formula is substituted into the Euclidean distance to obtain the PAA distance measure:
$$D_{PAA}(\bar{C}, \bar{Q}) = \sqrt{\frac{n}{w}} \sqrt{\sum_{i=1}^{w} (\bar{c}_i - \bar{q}_i)^2}$$
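The three steps of claim 2 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the function names `paa`, `sax` and `paa_distance` are hypothetical, the alphabet is assumed to be the lowercase letters, the breakpoints are assumed to be equiprobable quantiles of the standard normal distribution, and n is assumed to be divisible by w.

```python
import numpy as np
from statistics import NormalDist


def paa(x, w):
    """Piecewise aggregate approximation: mean of each of w equal-length segments."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if n % w != 0:
        raise ValueError("this sketch assumes n is divisible by w")
    return x.reshape(w, n // w).mean(axis=1)


def sax(x, w, alphabet_size):
    """Normalize the series, apply PAA, then map each segment mean to a letter."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    z = (x - x.mean()) / std if std > 0 else np.zeros_like(x)
    means = paa(z, w)
    # Assumed breakpoints: quantiles that cut N(0, 1) into equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(k / alphabet_size)
                   for k in range(1, alphabet_size)]
    # side="right" selects j such that beta_{j-1} <= mean < beta_j.
    idx = np.searchsorted(breakpoints, means, side="right")
    return "".join(chr(ord("a") + int(k)) for k in idx)


def paa_distance(c, q, w):
    """PAA distance, which lower-bounds the Euclidean distance between c and q."""
    c = np.asarray(c, dtype=float)
    q = np.asarray(q, dtype=float)
    n = c.size
    return float(np.sqrt(n / w) * np.sqrt(np.sum((paa(c, w) - paa(q, w)) ** 2)))


if __name__ == "__main__":
    ramp = np.arange(8)
    print(paa([1, 2, 3, 4, 5, 6], 3))
    print(sax(ramp, w=4, alphabet_size=4))
```

Because the PAA distance never exceeds the Euclidean distance of the original series, a query that prunes candidates by the PAA distance cannot discard a true match, which is the "no missed match" property the claim refers to.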
3. The user energy consumption portrait analysis method based on multivariate data driving as claimed in claim 1, wherein step 2 further comprises the following:
the objective function is as follows:
wherein:
$2 \le l \le l_m$ (10)
$2 \le w \le w_m$ (11)
wherein A is the accuracy, representing how well the segmented load curve characterizes the original load curve; E is the amount of information, measured by information entropy: the smaller the information entropy, the greater the accuracy when the existing signal is used for prediction and the more information is contained; R is the reduction rate, reflecting the degree of compression of the original load curve. The correlation coefficient is taken between the PAA piecewise approximation $\bar{X}_i$ of the load curve and the original load curve $X_i$; because their dimensions differ, $\bar{X}_i$ is spline-interpolated into a sequence with the same dimension as $X_i$ before the correlation coefficient is calculated. $p_i$ is the probability of occurrence of character i in $X_i$; $l_m$ is the maximum number of characters and $w_m$ is the set maximum number of segments, with $l_m = w_m = 10$ taken here; $\mu$ is the weight coefficient of the two parameters.
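Two of the quantities named in claim 3 can be sketched in Python under explicit assumptions. The objective function itself and the formula for the reduction rate R are not recoverable from the text, so neither is reproduced here. The names `paa_accuracy` and `symbol_entropy` are hypothetical, the entropy is assumed to be the Shannon entropy in bits, and piecewise-constant upsampling is swapped in for the spline interpolation mentioned in the claim to keep the example dependency-free.

```python
import numpy as np
from collections import Counter


def paa_accuracy(x, w):
    """Correlation between a load curve and its PAA approximation.

    Assumption: the PAA means are stretched back to length n by simple
    repetition, not by the spline interpolation described in the claim.
    Assumes n is divisible by w; a constant curve yields NaN.
    """
    x = np.asarray(x, dtype=float)
    seg = x.size // w
    means = x.reshape(w, seg).mean(axis=1)
    upsampled = np.repeat(means, seg)
    return float(np.corrcoef(x, upsampled)[0, 1])


def symbol_entropy(word):
    """Shannon entropy (bits) of the character frequencies in a SAX word."""
    counts = Counter(word)
    p = np.array([c / len(word) for c in counts.values()])
    return float(-(p * np.log2(p)).sum())


if __name__ == "__main__":
    curve = np.arange(8)
    # Candidate segment counts within the bound 2 <= w <= w_m that divide n.
    for w in (2, 4, 8):
        print(w, round(paa_accuracy(curve, w), 4))
    print(symbol_entropy("abab"), symbol_entropy("aaaa"))
```

A search over l and w within the bounds (10) and (11) would evaluate quantities like these for each candidate pair; how they are combined with the weight coefficient $\mu$ is left to the missing objective function.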
4. The user energy consumption portrait analysis method based on multivariate data driving as claimed in claim 1, wherein step 3 further comprises the following: three types of typical energy consumption characteristic indexes are introduced, namely energy consumption load level, energy consumption stability and energy consumption interaction capability; specific indexes, including daily average load, daily load rate, peak-period energy consumption rate and valley electricity coefficient, are selected as feature vectors for clustering the load curves; these indexes serve as the main data feature vectors and, together with the SAX-optimized discrete features, comprehensively reflect the time-domain and state characteristics of the load curves and serve as the basis for clustering them.
5. The user energy consumption portrait analysis method based on multivariate data driving according to claim 4, wherein the contribution of each characteristic index to the clustering result is evaluated using the CRITIC weighting method and the weights of the energy consumption characteristic indexes are determined objectively; the objective weight of an index is measured comprehensively from the contrast intensity of the evaluation index and the conflict between indexes; the contrast intensity characterizes the variability of an evaluation index: the larger the mean square deviation (standard deviation), the more information the index contains; the conflict reflects the correlation between different indexes: the larger the correlation coefficient of two indexes, the stronger their correlation and the lower the corresponding conflict.
6. The user energy consumption portrait analysis method based on multivariate data driving according to claim 5, wherein the CRITIC weighting method comprises the following specific steps:
1) Index normalization: given m evaluation objects and n evaluation indexes, the indexes are normalized by forward or reverse normalization, since different indexes affect the final evaluation result in different directions;
2) Calculating the correlation coefficients of the evaluation index matrix: the correlation coefficient describes the conflict between indexes; the stronger the positive correlation between two indexes, the smaller their conflict and the lower the weight;
3) Calculating the weights: the contrast intensity and conflict of each evaluation index are calculated using the obtained correlation coefficient matrix.
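The three steps of claim 6 correspond to the commonly published form of the CRITIC method, which can be sketched in Python as follows. The sketch assumes min-max normalization, the sample standard deviation as contrast intensity, and a conflict of $\sum_k (1 - r_{jk})$; the text's formula (16) is not shown, so these choices, the function name `critic_weights` and the `benefit` flag are assumptions. Columns must not be constant, otherwise the correlation is undefined.

```python
import numpy as np


def critic_weights(X, benefit=None):
    """CRITIC objective weights for an (m objects x n indexes) matrix.

    benefit[j] is True for a forward ("larger is better") index and False
    for a reverse index; by default all indexes are treated as forward.
    """
    X = np.asarray(X, dtype=float)
    m, n = X.shape
    benefit = np.ones(n, dtype=bool) if benefit is None else np.asarray(benefit, dtype=bool)

    # Step 1: forward / reverse min-max normalization.
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    Z = np.where(benefit, (X - lo) / span, (hi - X) / span)

    # Step 2: correlation coefficients between the indexes.
    R = np.corrcoef(Z, rowvar=False)

    # Step 3: contrast intensity (standard deviation) times conflict.
    sigma = Z.std(axis=0, ddof=1)
    conflict = (1.0 - R).sum(axis=0)
    info = sigma * conflict
    return info / info.sum()


if __name__ == "__main__":
    # Illustrative values only: indexes 0 and 1 are perfectly correlated,
    # so each receives a lower weight than the less correlated index 2.
    X = [[1, 1, 4], [2, 2, 1], [3, 3, 3], [4, 4, 2]]
    print(critic_weights(X))
```

In the illustrative matrix the two redundant indexes share information and are down-weighted, which is the behavior step 2) describes.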
7. The user energy consumption portrait analysis method based on multivariate data driving as claimed in claim 1, wherein the improved AP clustering algorithm further comprises the following:
1) The discrete state quantities of the load curve and the energy consumption characteristic indexes are selected to reduce the dimension of the load curve, which speeds up the calculation of the AP clustering similarity matrix, and the bias parameter is adjusted to improve clustering efficiency:
$s(i,j) = -[\alpha d_{dij} + (1-\alpha) d_{tij}], \quad i \ne j$ (18)
wherein $d_{dij}$ and $d_{tij}$ are the distances between load curve i and load curve j in terms of the discrete state features d obtained by SAX and the energy consumption characteristic indexes t, respectively, each measured by the Euclidean distance; $\alpha$ is the feature weight coefficient;
2) Improvement of the bias parameter: the element s(i,i) on the main diagonal of the similarity matrix is the bias parameter, and its value affects the number of clusters obtained;
the clustering effect evaluation (DB) index is used as the bias parameter selection and convergence criterion of the AP clustering algorithm, as shown in the formula:
$s(i,i) = p_m + \delta DB_{min}$ (20)
wherein $p_m$ is the median of all values off the main diagonal, used as the initial value; $DB_{min}$ is the minimum DB value obtained by the algorithm so far; $\delta$ is the search threshold, with $\delta > 0$ to search forward and $\delta < 0$ otherwise; the DB index is calculated as in formula (21):
$DB = \frac{1}{n} \sum_{i=1}^{n} \max_{j \ne i} \frac{W_i + W_j}{C_{ij}}$ (21)
wherein n is the number of clusters; $W_i$ is the average distance from the data points in class i to its cluster center $C_i$; $C_{ij}$ is the distance between cluster centers i and j.
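Formulas (18), (20) and (21) can be sketched in Python as below. This is an illustration under assumptions, not the claimed implementation: the SAX characters are assumed to be encoded as numbers before the Euclidean distance is taken, cluster centers in the DB index are taken as class means (AP itself uses exemplar points), and the names `pairwise_euclidean`, `similarity_matrix` and `davies_bouldin` are hypothetical. The AP message-passing iterations themselves are not shown.

```python
import numpy as np


def pairwise_euclidean(F):
    """Matrix of Euclidean distances between the rows of F."""
    F = np.asarray(F, dtype=float)
    diff = F[:, None, :] - F[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def similarity_matrix(d_feat, t_feat, alpha, delta=0.0, db_min=0.0):
    """Similarity per formula (18) with the diagonal set per formula (20)."""
    S = -(alpha * pairwise_euclidean(d_feat)
          + (1.0 - alpha) * pairwise_euclidean(t_feat))
    off_diag = S[~np.eye(len(S), dtype=bool)]
    p_m = np.median(off_diag)              # initial bias: off-diagonal median
    np.fill_diagonal(S, p_m + delta * db_min)
    return S


def davies_bouldin(points, labels):
    """DB index per formula (21); requires at least two clusters."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    ids = np.unique(labels)
    centers = np.array([points[labels == k].mean(axis=0) for k in ids])
    W = np.array([np.linalg.norm(points[labels == k] - c, axis=1).mean()
                  for k, c in zip(ids, centers)])
    C = pairwise_euclidean(centers)
    n = len(ids)
    return sum(max((W[i] + W[j]) / C[i, j] for j in range(n) if j != i)
               for i in range(n)) / n


if __name__ == "__main__":
    S = similarity_matrix([[0], [1], [3]], [[0], [0], [0]], alpha=0.5)
    print(S)
    pts = [[0, 0], [0, 2], [10, 0], [10, 2]]
    print(davies_bouldin(pts, [0, 0, 1, 1]))
```

A lower DB value indicates compact, well separated clusters, so an outer loop could rerun AP with the diagonal shifted by $\delta DB_{min}$ and keep the bias parameter that minimizes the index.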
8. A non-volatile storage medium, characterized in that the non-volatile storage medium comprises a stored program, wherein the program, when run, controls the device in which the non-volatile storage medium is located to perform the method of any one of claims 1 to 7.
9. An electronic device comprising a processor and a memory, wherein the memory stores computer readable instructions, the processor is configured to execute the computer readable instructions, and the computer readable instructions, when executed, perform the method of any one of claims 1 to 7.
10. A user energy consumption portrait analysis device based on multivariate data driving, characterized in that the device comprises the following modules:
a dimension reduction and feature extraction module, configured to reduce the dimension of the load curve and extract features using the time series symbolic aggregate approximation (SAX) algorithm;
a simulated annealing particle swarm algorithm module, configured to convert the optimization of the SAX representation of the load curve into a multi-objective optimization problem based on a simulated annealing particle swarm algorithm;
a cluster analysis module, configured to perform cluster analysis on the load curves using an improved AP clustering algorithm according to the user energy consumption characteristic indexes;
a user energy consumption analysis module, configured to analyze the energy consumption behavior of each class of users according to the clustering result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211630066.XA CN116304295A (en) | 2022-12-19 | 2022-12-19 | User energy consumption portrait analysis method based on multivariate data driving |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116304295A (en) | 2023-06-23 |
Family
ID=86834805
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211630066.XA Pending CN116304295A (en) | 2022-12-19 | 2022-12-19 | User energy consumption portrait analysis method based on multivariate data driving |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116304295A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117076990A (en) * | 2023-10-13 | 2023-11-17 | 国网浙江省电力有限公司 | Load curve identification method, device and medium based on curve dimension reduction and clustering |
CN117076990B (en) * | 2023-10-13 | 2024-02-27 | 国网浙江省电力有限公司 | Load curve identification method, device and medium based on curve dimension reduction and clustering |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||