CN116304295A

CN116304295A - User energy consumption portrait analysis method based on multivariate data driving

Info

Publication number: CN116304295A
Application number: CN202211630066.XA
Authority: CN
Inventors: 窦真兰; 张春雁; 孙沛; 王永利; 滕越; 周含芷; 袁博; 王一诺
Original assignee: North China Electric Power University; State Grid Shanghai Electric Power Co Ltd
Current assignee: North China Electric Power University; State Grid Shanghai Electric Power Co Ltd
Priority date: 2022-12-19
Filing date: 2022-12-19
Publication date: 2023-06-23

Abstract

A user energy consumption portrait analysis method based on multivariate data driving comprises the following steps: reducing the dimension of the load curve and extracting features with the time-series symbolic aggregate approximation (SAX) algorithm; converting the optimization of the SAX representation of the load curve into a multi-objective optimization problem based on a simulated annealing particle swarm algorithm; clustering the load curves with an improved affinity propagation (AP) clustering algorithm according to the user energy-use characteristic indexes; and analyzing the energy-use behavior of each class of users according to the clustering result. Based on the current energy-use state on the user side, a suitable portrait information acquisition algorithm and the improved AP clustering algorithm are used to mine the effective information in energy-use data and to construct a set of user energy consumption behavior portraits, which is applied to the analysis of users' multi-energy use behavior so that their energy-use characteristics can be understood.

Description

User energy consumption portrait analysis method based on multivariate data driving

Technical Field

The invention relates to an analysis method, in particular to a user energy consumption portrait analysis method based on multivariate data driving and its application.

Background

User-side resources are generally utilized in three modes: peak clipping, valley filling, and precise real-time load control. Using them can defer investment in the power system, maintain source-grid-load balance, promote the consumption of new energy, and guard against the risk of environmental accidents. Smart energy built around big data technology can better grasp user demand, allocate energy reasonably, guarantee users' daily production and living needs, pay more attention to user experience, and realize complementary advantages among individual users. It establishes a new model that turns energy data into public social value, adjusts energy supply and demand reasonably, and supports industrial upgrading and the improvement of people's livelihood.

Regarding user-side demand response, many experts in China and abroad have studied the topic and made important contributions to the user-side demand optimization problem, identifying issues such as users' weak willingness to participate, the poor economics of user-side projects, and immature business models. Regarding user behavior feature analysis, common data extraction methods include the principal component analysis (PCA) transformation and the k-means algorithm. These techniques have been analyzed by experts in China and abroad: PCA can handle large volumes of data, retain the key information of the original data, reduce dimensionality, and improve clustering quality, while k-means is simple, clusters well, and scales well. Current research focuses mainly on data analysis of customers' comprehensive energy-use behavior, but data analysis models for the comprehensive energy-use behavior of end users from the perspective of integration capability are still at an exploratory stage. The present invention is intended to fill this research gap and to address the current user-side problems of integrated energy systems.

Disclosure of Invention

To overcome the defects of the prior art, the invention discloses a user energy consumption portrait analysis method based on multivariate data driving, with the following technical scheme:

A user energy consumption portrait analysis method based on multivariate data driving, characterized in that the method comprises the following steps:

Step 1: reducing the dimension of the load curve and extracting features with the time-series symbolic aggregate approximation (SAX) algorithm;

Step 2: optimizing the extracted features with a simulated annealing particle swarm algorithm;

Step 3: clustering the load curves with an improved AP clustering algorithm according to the user energy-use characteristic indexes;

Step 4: analyzing the energy-use behavior of each class of users according to the clustering result.

The invention also discloses a non-volatile storage medium, characterized in that the non-volatile storage medium comprises a stored program which, when running, controls the device on which the non-volatile storage medium is located to execute the above method.

The invention also discloses an electronic device, characterized by comprising a processor and a memory; the memory stores computer-readable instructions, and the processor is configured to run the computer-readable instructions, which execute the above method when run.

The invention also discloses a user energy consumption portrait analysis device based on multivariate data driving, characterized in that the device comprises the following modules:

a dimension reduction and feature extraction module, configured to reduce the dimension of the load curve and extract features with the time-series symbolic aggregate approximation (SAX) algorithm;

a simulated annealing particle swarm algorithm module, configured to convert the optimization of the SAX representation of the load curve into a multi-objective optimization problem based on a simulated annealing particle swarm algorithm;

a cluster analysis module, configured to cluster the load curves with an improved AP clustering algorithm according to the user energy-use characteristic indexes;

a user energy-use analysis module, configured to analyze the energy-use behavior of each class of users according to the clustering result.

Advantageous effects

Based on the current energy-use state on the user side, a suitable portrait information acquisition algorithm and an improved AP clustering algorithm are used to mine the effective information in energy-use data. This information is applied to the analysis of users' multi-energy use behavior, so that their energy-use characteristics are understood and a set of user energy consumption behavior portraits is constructed.

Drawings

FIG. 1 is a flow chart of an improved AP clustering algorithm of the present invention.

FIG. 2 is a graph of a user dataset cluster center of the present invention.

Detailed Description

Example 1

The invention discloses a user energy consumption portrait analysis method based on multivariate data driving, which comprises the following steps:

(1) Time-series symbolic aggregate approximation method based on particle swarm optimization

(1.1) Principle of the time-series symbolic aggregate approximation algorithm

Time-series symbolic aggregate approximation (SAX) represents a continuous time series symbolically by converting it into a character string, and it reduces the dimension of high-dimensional sequences effectively. The specific steps are as follows:

Step one: the $n$-dimensional time series is converted into a $w$-dimensional vector. Using piecewise aggregate approximation (PAA), the original load curve $X = [x_1, x_2, \ldots, x_n]$ is approximated by $w$ segments, $\bar{X} = [\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_w]$, where the $i$-th element $\bar{x}_i$ is calculated as:

$$\bar{x}_i = \frac{w}{n} \sum_{j=\frac{n}{w}(i-1)+1}^{\frac{n}{w} i} x_j$$

Here the $n$-dimensional original time-series vector is divided into $w$ segments and thereby reduced to $w$ dimensions; $x_j$ is an element of the original load curve vector; $\bar{x}_i$ is the mean of the $i$-th segment; and $n/w$ is the compression ratio.
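Step one can be sketched in a few lines of Python. This is only an illustration under the assumption that $n$ is an exact multiple of $w$; the function name `paa` and the sample load values are hypothetical and do not come from the patent.

```python
def paa(series, w):
    """Piecewise aggregate approximation: mean of each of w equal-length segments."""
    n = len(series)
    if w <= 0 or n % w != 0:
        raise ValueError("this sketch assumes n is a positive multiple of w")
    seg = n // w  # compression ratio n / w
    return [sum(series[i * seg:(i + 1) * seg]) / seg for i in range(w)]


# A 24-point daily load curve (hypothetical values) reduced to 3 segments.
load = [2, 2, 2, 2, 2, 2, 3, 5, 6, 6, 5, 5, 4, 4, 5, 5, 6, 8, 9, 9, 8, 6, 4, 3]
print(paa(load, 3))  # [2.5, 5.0, 6.625]
```

Each output value is the mean of eight consecutive hourly readings, so the 24-dimensional curve is represented by three numbers.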

Step two: each time series is normalized and converted into its piecewise aggregate approximation (PAA) representation, and the resulting PAA sequence is then symbolized:

$$\hat{x}_i = \alpha_j, \quad \text{if } \beta_{j-1} \le \bar{x}_i < \beta_j$$

where $\hat{x}_i$ is the $i$-th element of the symbol sequence of length $w$; $\alpha_j$ is the $j$-th element of the alphabet; and $\beta_{j-1}$ and $\beta_j$ are the $(j-1)$-th and $j$-th values in the Gaussian distribution breakpoint list.
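The symbolization in step two can be illustrated as follows. This is a minimal sketch that assumes the breakpoints split the standard normal distribution into equiprobable regions, as in the usual SAX formulation, and that normalization uses the population standard deviation. The helper names `znormalize`, `gaussian_breakpoints`, and `symbolize` are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def znormalize(series):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def gaussian_breakpoints(n_symbols):
    """n_symbols - 1 breakpoints cutting N(0, 1) into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(j / n_symbols) for j in range(1, n_symbols)]


def symbolize(paa_values, n_symbols, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Map each PAA value to the symbol whose breakpoint interval contains it."""
    bps = gaussian_breakpoints(n_symbols)
    # bisect_right counts the breakpoints <= v, i.e. beta_{j-1} <= v < beta_j.
    return "".join(alphabet[bisect_right(bps, v)] for v in paa_values)


print(symbolize([-1.0, -0.3, 0.2, 1.5], 4))  # abcd
```

With four symbols the breakpoints are roughly -0.67, 0, and 0.67, so the four sample values fall into four different regions.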

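Step three: after the dimension of a time series is reduced, queries in the feature space are prone to false dismissals (missed matches). The following definitions guarantee that none occur. The $n$-dimensional time series $Q$ and $C$ are converted into $w$-dimensional vectors to obtain their PAA representations, and substituting the dimension reduction formula into the Euclidean distance gives the PAA distance measure:

$$D_{PAA}(\bar{Q}, \bar{C}) = \sqrt{\frac{n}{w}} \sqrt{\sum_{i=1}^{w} (\bar{q}_i - \bar{c}_i)^2}$$

where $\bar{Q}$ and $\bar{C}$ are the time series $Q$ and $C$ after dimension reduction, and $\bar{q}_i$ and $\bar{c}_i$ are the $i$-th elements of $\bar{Q}$ and $\bar{C}$, respectively. The data are further converted into a symbolic representation, and a MINDIST function that returns the minimum distance between the original time series of the two words $\hat{Q}$ and $\hat{C}$ is defined as:

$$\mathrm{MINDIST}(\hat{Q}, \hat{C}) = \sqrt{\frac{n}{w}} \sqrt{\sum_{i=1}^{w} \mathrm{dist}(\hat{q}_i, \hat{c}_i)^2}$$

where $\mathrm{dist}(\hat{q}_i, \hat{c}_i)$ is the distance between two symbols, obtained from the breakpoint list.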

Step four: one direction of optimization is to improve the tightness of lower bound (TLB), expressed here as:

$$\mathrm{TLB} = \frac{\mathrm{MINDIST}(\hat{Q}, \hat{C})}{D(Q, C)}$$

where $D(Q, C)$ is the Euclidean distance between the time series $Q$ and $C$. The TLB takes a value between 0 and 1; the closer it is to 1, the closer the lower-bound distance is to the true distance, that is, the smaller the error.
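The MINDIST function and the TLB can be sketched as below. The symbol distance follows the usual SAX lookup rule (adjacent symbols have distance zero, otherwise the gap between the enclosing breakpoints), which is assumed here rather than quoted from the patent. Words are given as lists of zero-based symbol indices, and all function names are hypothetical.

```python
import math
from statistics import NormalDist


def breakpoints(n_symbols):
    return [NormalDist().inv_cdf(j / n_symbols) for j in range(1, n_symbols)]


def symbol_dist(a, b, bps):
    """Distance between symbol indices a and b; adjacent symbols count as 0."""
    if abs(a - b) <= 1:
        return 0.0
    return bps[max(a, b) - 1] - bps[min(a, b)]


def mindist(word_q, word_c, n, n_symbols):
    """Lower bound on the Euclidean distance between the original series."""
    bps = breakpoints(n_symbols)
    w = len(word_q)
    total = sum(symbol_dist(a, b, bps) ** 2 for a, b in zip(word_q, word_c))
    return math.sqrt(n / w) * math.sqrt(total)


def euclidean(q, c):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(q, c)))


def tlb(word_q, word_c, q, c, n_symbols):
    """Tightness of lower bound: MINDIST divided by the true distance."""
    d = euclidean(q, c)
    return mindist(word_q, word_c, len(q), n_symbols) / d if d > 0 else 1.0


# Two z-normalized series whose PAA values (w = 2) are [-1, 1] and [1, -1].
q = [-1, -1, -1, -1, 1, 1, 1, 1]
c = [1, 1, 1, 1, -1, -1, -1, -1]
print(round(tlb([0, 3], [3, 0], q, c, 4), 4))  # 0.6745
```

In this example the symbolic lower bound recovers about two thirds of the true distance; a larger alphabet or more segments would normally tighten it at the cost of less compression.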

(1.2) Simulated annealing particle swarm algorithm

Particle swarm optimization is a population-based algorithm with global optimization capability. It searches for the optimum iteratively: the system is initialized with a set of random solutions, and the particles (candidate solutions) move through the solution space in search of the best one. However, particle swarm optimization can become trapped at local extreme points, which leads to slow convergence and poor precision in the later stages of evolution. To address these problems, a particle swarm algorithm based on simulated annealing is adopted. It retains the global search capability and simplicity of the traditional particle swarm algorithm while effectively avoiding entrapment at local extreme points.
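The patent does not spell out how simulated annealing is combined with the particle swarm, so the following is only one plausible sketch. It assumes a common variant in which the swarm leader is drawn by a Boltzmann-weighted roulette over the personal bests while a temperature cools geometrically. The function name `sa_pso`, all parameter values, and the toy objective are hypothetical.

```python
import math
import random


def sa_pso(objective, bounds, n_particles=20, iters=100, t0=1.0, cooling=0.95,
           inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective`; the swarm leader is drawn by a Boltzmann roulette."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [objective(p) for p in pos]
    temp, history = t0, []
    for _ in range(iters):
        f_min = min(pbest_f)
        # Annealing step: worse personal bests can still lead while temp is high.
        weights = [math.exp(-(f - f_min) / temp) for f in pbest_f]
        leader = pbest[rng.choices(range(n_particles), weights=weights)[0]]
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (leader[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            f = objective(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
        temp *= cooling
        history.append(min(pbest_f))
    best = min(range(n_particles), key=pbest_f.__getitem__)
    return pbest[best], pbest_f[best], history


# Toy usage on a continuous sphere function (not the patent's objective).
best, best_f, history = sa_pso(lambda p: sum(x * x for x in p), [(-5, 5), (-5, 5)])
```

Early on, the high temperature lets mediocre particles act as leader, which spreads the search; as the temperature falls the roulette concentrates on the best personal best and the method behaves like ordinary particle swarm optimization. In the patent's setting the decision variables would be the integer pair $(w, l)$ rather than continuous coordinates.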

Based on the simulated annealing particle swarm algorithm, the optimization of the SAX representation of the load curve is converted into a multi-objective optimization problem with the following objective function:

where:

$$2 \le l \le l_m \tag{10}$$

$$2 \le w \le w_m \tag{11}$$

Here $A$ is the accuracy, which represents how well the segmented load curve characterizes the original load curve; $E$ is the information content, measured by information entropy: the smaller the entropy, the more accurate a prediction made from the existing signal and the more information it contains; $R$ is the reduction rate, which reflects the degree of compression of the original load curve.

The correlation coefficient is computed between the PAA approximation $\bar{X}_i$ of the load curve and the original load curve $X_i$. Because the two have different dimensions, $\bar{X}_i$ is spline-interpolated into a sequence with the same dimension as $X_i$ before the correlation coefficient is calculated. $p_i$ is the probability of character $i$ occurring in $X_i$; $l_m$ is the maximum number of characters and $w_m$ is the maximum number of segments, here $l_m = w_m = 10$; $\mu$ is the weight coefficient of the two parameters, here $\mu = 0.5$.

The effect of the algorithm is evaluated through the three indexes $A$, $R$, and $E$; the optimal load curve representation is the one for which the combined effect is best.
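Two of the three indexes can be illustrated with a short sketch. It assumes that $E$ is the Shannon entropy (base 2) of the symbol frequencies and that $A$ is a Pearson correlation coefficient. For simplicity it stretches the PAA values back to length $n$ by repeating each value, which is a piecewise-constant stand-in for the spline interpolation named in the text. The function names are hypothetical, and the reduction rate $R$ is not sketched because its formula is not given here.

```python
import math
from collections import Counter


def symbol_entropy(word):
    """Shannon entropy (in bits) of the symbol frequencies in a SAX word."""
    counts = Counter(word)
    total = len(word)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def pearson(x, y):
    """Pearson correlation coefficient; undefined for constant sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def accuracy(series, paa_values):
    """Correlation between the curve and its PAA values stretched to length n."""
    seg = len(series) // len(paa_values)
    stretched = [v for v in paa_values for _ in range(seg)]  # not a spline
    return pearson(series, stretched)


print(symbol_entropy("aabb"))  # 1.0
print(accuracy([1, 2, 3, 4, 5, 6], [1.5, 3.5, 5.5]))
```

A word that uses two symbols equally often carries one bit per symbol, while a word made of a single repeated symbol has zero entropy.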

(2) AP clustering algorithm based on the optimized time-series symbolic aggregate approximation algorithm and user energy-use characteristic indexes

(2.1) Description of the energy-use characteristic indexes

When processing user energy-use data, an appropriate feature extraction technique both ensures valid results and reduces the amount of computation. Features with a clear physical meaning help power enterprises study and process the data, so that early warning, abnormal data analysis, demand-side management, and similar functions can be realized by analyzing energy consumption data. In addition, the key data features obtained from the demand side are combined with the discrete and time-domain features obtained by time-series symbolic aggregate approximation to reduce the dimension of the load curve, so that its underlying meaning can be analyzed more efficiently and intuitively and the curve can be evaluated more completely.

User energy-use characteristic indexes reflect the internal regularities of the load curve and allow useful information to be extracted from a high-dimensional load curve quickly and efficiently. Three typical categories of energy-use characteristic indexes are introduced, namely energy-use load level, energy-use stability, and energy-use interaction capability. Specific indexes such as daily average load, daily load rate, peak-period energy consumption rate, and valley electricity coefficient are selected as feature vectors for clustering the load curves. These indexes serve as the main data feature vector and, together with the discrete features from the optimized SAX, comprehensively reflect the time-domain and state characteristics of the load curve and form the basis for clustering it. The selected indexes are shown in Table 1.

Table 1 Energy-use characteristic indexes of integrated energy system users
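As an illustration of how such indexes might be computed from one hourly load curve, consider the sketch below. The index definitions (load rate as mean over maximum, peak and valley shares as fractions of daily energy) and the peak and valley hours are assumptions made for this example, since the body of Table 1 is not reproduced here; the function name is hypothetical.

```python
def energy_use_indexes(load, peak_hours=range(8, 22), valley_hours=range(0, 8)):
    """Indexes for one 24-point hourly load curve (period split is hypothetical)."""
    total = sum(load)
    mean_load = total / len(load)
    return {
        "daily_average_load": mean_load,
        "daily_load_rate": mean_load / max(load),
        "peak_energy_rate": sum(load[h] for h in peak_hours) / total,
        "valley_coefficient": sum(load[h] for h in valley_hours) / total,
    }


# Hypothetical curve: low at night, flat during the day, moderate late evening.
curve = [1] * 8 + [3] * 14 + [2] * 2
print(energy_use_indexes(curve))
```

The resulting dictionary can be turned into the index feature vector that is clustered together with the SAX discrete features.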

(2.2) CRITIC weighting method

To avoid subjectivity in setting the user energy-use characteristic indexes, the CRITIC weighting method is used to evaluate the contribution of each characteristic index to the clustering result and to determine the index weights objectively. The basic idea is to measure the objective weight of an index from both its contrast intensity and its conflict with the other indexes. Contrast intensity follows the idea of the standard deviation and characterizes the variability of an evaluation index: the larger the standard deviation, the more information the index contains. Conflict reflects the correlation between different indexes: the larger the correlation coefficient between two indexes, the stronger their correlation and the lower the corresponding conflict.

The specific steps of the CRITIC weighting method are as follows:

1) Index normalization. Let there be $m$ evaluation objects and $n$ evaluation indexes. Because different indexes influence the final evaluation result in different directions, they are normalized with a forward or reverse normalization method.

Forward indexes are normalized as shown in formula (12):

$$b_{ij} = \frac{a_{ij} - \min_i a_{ij}}{\max_i a_{ij} - \min_i a_{ij}} \tag{12}$$

Reverse indexes are normalized as shown in formula (13):

$$b_{ij} = \frac{\max_i a_{ij} - a_{ij}}{\max_i a_{ij} - \min_i a_{ij}} \tag{13}$$

where $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$; $a_{ij}$ is the actual value of the $j$-th index of the $i$-th user; and $b_{ij}$ is the normalized value of the $j$-th index of the $i$-th user.

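2) Calculating the correlation coefficients of the evaluation index matrix. The correlation coefficient describes the conflict between indexes: if two indexes are strongly positively correlated, their conflict is small and their weights are low. The correlation coefficient is calculated as shown in formula (14):

$$r_{ij} = \frac{\sum_{k=1}^{m} (b_{ki} - \bar{b}_i)(b_{kj} - \bar{b}_j)}{\sqrt{\sum_{k=1}^{m} (b_{ki} - \bar{b}_i)^2 \sum_{k=1}^{m} (b_{kj} - \bar{b}_j)^2}} \tag{14}$$

where $i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, n$; $\bar{b}_j$ is the mean of the $j$-th normalized index over the $m$ users; and $r_{ij}$ is the correlation coefficient between the $i$-th index and the $j$-th index.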

3) Calculating the weights. The contrast intensity and the conflict of each evaluation index are calculated from the correlation coefficient matrix obtained above, as shown in formula (15):

$$\sigma_j = \sqrt{\frac{1}{m-1} \sum_{i=1}^{m} (b_{ij} - \bar{b}_j)^2}, \qquad \sum_{i=1}^{n} (1 - r_{ij}) \tag{15}$$

where $j = 1, 2, \ldots, n$; $\bar{b}_j$ is the mean of the $j$-th normalized index; $\sigma_j$ is the contrast intensity of the $j$-th index; and $\sum_{i=1}^{n} (1 - r_{ij})$ is the quantified conflict between the $j$-th index and the other indexes. From the contrast intensity and the conflict, the amount of information contained in each index is calculated as shown in formula (16):

$$G_j = \sigma_j \sum_{i=1}^{n} (1 - r_{ij}) \tag{16}$$

The larger $G_j$ is, the more information the $j$-th index contains and the larger its weight.

The final objective weight $W_j$ of the $j$-th index is:

$$W_j = \frac{G_j}{\sum_{j=1}^{n} G_j} \tag{17}$$
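The three CRITIC steps can be put together in a short sketch. It assumes min-max normalization, Pearson correlation, and the sample standard deviation as contrast intensity; the choice of divisor in the standard deviation scales every index equally and therefore does not change the final weights. The function name, the `reverse` argument, and the sample matrix are hypothetical.

```python
import math


def critic_weights(matrix, reverse=()):
    """matrix[i][j]: value of index j for object i; `reverse` lists cost-type columns."""
    m, n = len(matrix), len(matrix[0])
    cols = [[row[j] for row in matrix] for j in range(n)]
    norm = []
    for j, col in enumerate(cols):
        lo, hi = min(col), max(col)  # assumes hi > lo for every index
        if j in reverse:
            norm.append([(hi - v) / (hi - lo) for v in col])
        else:
            norm.append([(v - lo) / (hi - lo) for v in col])
    means = [sum(col) / m for col in norm]
    # Contrast intensity: sample standard deviation of each normalized index.
    sigma = [math.sqrt(sum((v - mu) ** 2 for v in col) / (m - 1))
             for col, mu in zip(norm, means)]

    def corr(a, b, ma, mb):
        num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
        den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
        return num / den

    info = []
    for j in range(n):
        conflict = sum(1 - corr(norm[i], norm[j], means[i], means[j]) for i in range(n))
        info.append(sigma[j] * conflict)  # information amount G_j
    total = sum(info)
    return [g / total for g in info]


# Three users, three indexes; the first two indexes are perfectly correlated.
print(critic_weights([[1, 10, 5], [2, 20, 1], [3, 30, 3]]))  # [0.25, 0.25, 0.5]
```

The two redundant indexes share their weight, while the third index, which conflicts with both, receives half of the total.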

(2.3) Improved AP clustering algorithm

The AP clustering algorithm does not require the number of clusters to be specified in advance and yields a clustering result with a small squared error, but its computational complexity is high, and processing multidimensional data often takes a long time. The method therefore reduces the dimension of the load curve by using its discrete state quantities and the energy-use characteristic indexes, which speeds up computation of the AP similarity matrix, and adjusts the preference parameter to improve clustering efficiency.

1) Improved similarity matrix

$$s(i, j) = -\left[\alpha d_{dij} + (1 - \alpha) d_{tij}\right], \quad i \ne j \tag{18}$$

where $s(i, j)$ is an element of the improved similarity matrix; $d_{dij}$ and $d_{tij}$ are the Euclidean distances between load curves $i$ and $j$ in terms of the discrete state features $d$ obtained from SAX and the energy-use characteristic indexes $t$, respectively; and $\alpha$ is the feature weight coefficient.
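Formula (18) can be sketched as follows. The example assumes that the SAX discrete state features are encoded as numeric vectors (for instance symbol indices) so that a Euclidean distance is defined, and it fills the diagonal with the median of the off-diagonal similarities as an initial preference. The function names and the sample features are hypothetical.

```python
import math
from statistics import median


def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity_matrix(discrete_feats, index_feats, alpha=0.5, preference=None):
    """s(i, j) = -[alpha * d_d(i, j) + (1 - alpha) * d_t(i, j)] for i != j."""
    n = len(discrete_feats)
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                s[i][j] = -(alpha * euclid(discrete_feats[i], discrete_feats[j])
                            + (1 - alpha) * euclid(index_feats[i], index_feats[j]))
    if preference is None:
        # Initial preference p_m: median of the off-diagonal similarities.
        preference = median(s[i][j] for i in range(n) for j in range(n) if i != j)
    for i in range(n):
        s[i][i] = preference
    return s


s = similarity_matrix([[0, 0], [3, 4]], [[0], [2]], alpha=0.5)
print(s)  # [[-3.5, -3.5], [-3.5, -3.5]]
```

Because both feature sets are low-dimensional, each entry costs far less to compute than a distance over the full load curve, which is where the speed-up described above comes from.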

2) Improved preference parameter

The values $s(i, i)$ on the main diagonal of the similarity matrix are the preference parameters, and they determine the number of clusters produced. Selecting a reasonable preference value with the help of a clustering evaluation index effectively reduces the number of iterations and improves clustering accuracy.

The Davies-Bouldin (DB) index, a clustering validity index, is stable and varies little over repeated iterations of the AP clustering algorithm. It is therefore used as the criterion for selecting the preference parameter and for judging convergence, as shown in formula (20):

$$s(i, i) = p_m + \delta \, DB_{min} \tag{20}$$

where $p_m$ is the initial value, the median of all elements off the main diagonal; $DB_{min}$ is the minimum DB value obtained so far by the algorithm; and $\delta$ is the search threshold, with $\delta > 0$ indicating a forward search and $\delta < 0$ a backward search. The DB index is calculated as shown in formula (21); the smaller its value, the lower the similarity between classes and the better the clustering.

$$DB = \frac{1}{n} \sum_{i=1}^{n} \max_{j \ne i} \frac{W_i + W_j}{C_{ij}} \tag{21}$$

where $n$ is the number of clusters; $W_i$ and $W_j$ are the average distances from the data points of classes $i$ and $j$ to their respective cluster centers; and $C_{ij}$ is the distance between cluster centers $i$ and $j$.
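The DB index and the preference update of formula (20) can be illustrated with a small sketch. It assumes Euclidean distances, at least two non-empty clusters, and labels given as zero-based cluster numbers; the function names and the sample data are hypothetical.

```python
import math


def davies_bouldin(points, labels, centers):
    """DB = (1/n) * sum_i max_{j != i} (W_i + W_j) / C_ij over n clusters."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    n = len(centers)
    scatter = []  # W_k: mean distance of cluster members to their center
    for k in range(n):
        members = [p for p, lab in zip(points, labels) if lab == k]
        scatter.append(sum(dist(p, centers[k]) for p in members) / len(members))
    total = 0.0
    for i in range(n):
        total += max((scatter[i] + scatter[j]) / dist(centers[i], centers[j])
                     for j in range(n) if j != i)
    return total / n


def preference(p_m, delta, db_min):
    """s(i, i) = p_m + delta * DB_min, as in formula (20)."""
    return p_m + delta * db_min


points = [[0], [2], [10], [12]]
db = davies_bouldin(points, [0, 0, 1, 1], [[1], [11]])
print(db)  # close to 0.2
print(preference(-3.5, 0.5, db))
```

Two tight, well-separated clusters give a small DB value. In an iterative search, the preference would be nudged by `delta * db_min` after each clustering run and the run with the smallest DB index kept.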

The flow of the improved AP clustering algorithm is shown in fig. 1.

(3) Calculation case analysis

This section uses user data from an integrated energy system park, from which 2000 load curves are randomly selected; the initial energy-use characteristic index weights are set equal. Solving with the simulated annealing particle swarm algorithm gives an optimal number of segments $w = 3$ and an optimal number of characters $l = 6$. The optimized AP clustering algorithm yields four final cluster centers, as shown in FIG. 2:

As can be seen from FIG. 2, the load curves differ considerably and the energy use of each typical user class varies markedly; each cluster center represents the energy use of one class of users. Class A users have a high energy consumption level in the morning and evening with an obvious drop at noon, and are probably office workers. The energy consumption level of class B and class C users rises after 7:00 and falls after 20:00, which matches the daily routine of most residents. Class B users have an average energy consumption level that is slightly higher in the morning and evening; their consumption is continuous with no obvious peak-valley characteristics. Class C users have a higher daytime energy consumption level than class B users, with two peaks at midday and in the evening, which is a bimodal load. Class D users have a low energy consumption level that is largely attributable to equipment losses; they are probably residents who use no electricity throughout the day, such as owners of vacant homes or people away on business. Based on the extracted load characteristics, user energy-use behavior can be analyzed in depth.

Class D users have too low an energy consumption level and are therefore not analyzed. The energy consumption levels of class A, B, and C users are evaluated separately. Class A users have a large daily peak-valley difference and need peak clipping and valley filling, making them a potential group for demand response. Class B and C users have a high daily load rate and can serve as representatives of residential demand response; higher peak-period electricity prices can be set for them to guide them toward peak clipping and valley filling and to promote the optimal allocation of power resources. In addition, class B users have a larger peak regulation capability and more stable daily energy consumption, and can be scheduled together with class D users to fill the load valley.

The characteristic indexes of the cluster centers are shown in Table 2, and the corresponding initial weights and improved final weights are shown in Table 3. To simplify the analysis, each cluster center is taken as the representative load of its class of load curves. As shown in Table 3, the daily average load has the highest weight and is therefore given the most consideration in the analysis.

TABLE 2 clustering center characteristic index

Table 3 initial weights and update results

Meanwhile, based on the discrete state features of each representative load, the CRITIC weighting method can be used to analyze the energy-use characteristics and, combined with a qualitative analysis of the energy-use characteristic indexes, to further analyze the users' demand response potential. According to formula (16) of the CRITIC weighting method, the more information an index contains, the larger its weight. Conflict reflects the correlation between indexes, expressed by the correlation coefficient: the more strongly an index is correlated with the others, the smaller its conflict with them, the more it repeats information they already carry, and the weaker its evaluation strength, so the weight assigned to it is reduced. It can therefore be considered that users with a large amount of information are suited to price-based demand response and users with a small amount of information are suited to incentive-based demand response. Assuming the correlation coefficient is unchanged, the larger the conflict value, that is, the standard deviation, the larger the amount of information contained. The index conflict calculated for each user class is shown in Table 4.

Class B users contain more information and have an average overall energy-use level, so they are suited to be price-based demand response customers, for whom flexible electricity prices can be set to guide changes in energy-use behavior. Class A and class C users have higher energy use and less information, and can serve as incentive-based demand response customers who, taking the satisfaction of different users into account, reduce their power demand when the system requires it or when supply is tight.

Table 4 Index conflict of each user class

User class	Index conflict
A	59.91
B	89.81
C	46.16

Example 2

Based on the same inventive concept, the present application further provides a non-volatile storage medium comprising a stored program which, when running, controls the device on which the non-volatile storage medium is located to execute the method of Example 1.

Example 3

Based on the same inventive concept, the present application further provides an electronic device comprising a processor and a memory; the memory stores computer-readable instructions, and the processor is configured to run the computer-readable instructions, which execute the method of Example 1 when run.

Example 4

Based on the same inventive concept, the present application further provides a user energy consumption portrait analysis device based on multivariate data driving, which comprises the following modules:

a user energy-use analysis module, configured to analyze the energy-use behavior of each class of users according to the clustering result.

In conclusion, the algorithm not only clusters load curves efficiently and accurately but also extracts their important features, which aids the analysis of user behavior. An improved AP clustering algorithm based on SAX discrete state features and weighted energy-use characteristic indexes is proposed, and the objective weights of the energy-use characteristic indexes are determined with the CRITIC weighting method. The case study shows that the extracted features both ensure clustering accuracy and help analyze user energy-use behavior. The method can be applied in a variety of settings such as demand response.

The foregoing has shown and described the basic principles, principal features, and advantages of the invention. Those skilled in the art will understand that the invention is not limited to the embodiments described above, which together with the description merely illustrate its principles, and that various changes and modifications may be made without departing from the spirit and scope of the invention. The scope of the invention is defined by the appended claims and their equivalents.

Claims

1. A user energy consumption portrait analysis method based on multivariate data driving, characterized in that the method comprises the following steps:

Step 2: converting the optimization of the time-series symbolic aggregate approximation (SAX) representation of the load curve into a multi-objective optimization problem based on a simulated annealing particle swarm algorithm;

2. The user energy consumption portrait analysis method based on multivariate data driving according to claim 1, characterized in that step 1 further comprises the following steps:

Step one: the $n$-dimensional time series is converted into a $w$-dimensional vector; using piecewise aggregate approximation (PAA), the original load curve $X = [x_1, x_2, \ldots, x_n]$ is approximated by $w$ segments, $\bar{X} = [\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_w]$, where the $i$-th element $\bar{x}_i$ is calculated as:

$$\bar{x}_i = \frac{w}{n} \sum_{j=\frac{n}{w}(i-1)+1}^{\frac{n}{w} i} x_j$$

where $\bar{x}_i$ is the mean of the $i$-th segment and $n/w$ is the compression ratio;

Step two: each time series is normalized and converted into its piecewise aggregate approximation (PAA) representation, and the resulting PAA sequence is then symbolized:

$$\hat{x}_i = \alpha_j, \quad \text{if } \beta_{j-1} \le \bar{x}_i < \beta_j$$

where $\hat{x}_i$ is the $i$-th element of the symbol sequence of length $w$; $\alpha_j$ is the $j$-th element of the alphabet; and $\beta_{j-1}$ and $\beta_j$ are the $(j-1)$-th and $j$-th values in the Gaussian distribution breakpoint list;

Step three: after the dimension of a time series is reduced, queries in the feature space are prone to false dismissals (missed matches); the following definition guarantees that none occur: the $n$-dimensional time series $Q$ and $C$ are converted into $w$-dimensional vectors to obtain their PAA representations, and substituting the dimension reduction formula into the Euclidean distance gives the PAA distance measure:

$$D_{PAA}(\bar{Q}, \bar{C}) = \sqrt{\frac{n}{w}} \sqrt{\sum_{i=1}^{w} (\bar{q}_i - \bar{c}_i)^2}$$

where $\bar{Q}$ and $\bar{C}$ are the time series $Q$ and $C$ after dimension reduction, and $\bar{q}_i$ and $\bar{c}_i$ are the $i$-th elements of $\bar{Q}$ and $\bar{C}$, respectively.

3. The user energy consumption portrait analysis method based on multivariate data driving according to claim 1, characterized in that step 2 further comprises the following:

the objective function is as follows:

where:

$$2 \le l \le l_m \tag{10}$$

$$2 \le w \le w_m \tag{11}$$

where $A$ is the accuracy, which represents how well the segmented load curve characterizes the original load curve; $E$ is the information content, measured by information entropy: the smaller the entropy, the more accurate a prediction made from the existing signal and the more information it contains; $R$ is the reduction rate, which reflects the degree of compression of the original load curve;

the correlation coefficient is computed between the PAA approximation $\bar{X}_i$ of the load curve and the original load curve $X_i$; because the two have different dimensions, $\bar{X}_i$ is spline-interpolated into a sequence with the same dimension as $X_i$ before the correlation coefficient is calculated; $p_i$ is the probability of character $i$ occurring in $X_i$; $l_m$ is the maximum number of characters and $w_m$ is the maximum number of segments, here $l_m = w_m = 10$; and $\mu$ is the weight coefficient of the two parameters.

4. The user energy consumption portrait analysis method based on multivariate data driving according to claim 1, characterized in that step 3 further comprises the following: three typical categories of energy-use characteristic indexes are introduced, namely energy-use load level, energy-use stability, and energy-use interaction capability; specific indexes such as daily average load, daily load rate, peak-period energy consumption rate, and valley electricity coefficient are selected as feature vectors for clustering the load curves; these indexes serve as the main data feature vector and, together with the discrete features from the optimized SAX, comprehensively reflect the time-domain and state characteristics of the load curve and form the basis for clustering it.

5. The user energy consumption portrait analysis method based on multivariate data driving according to claim 4, wherein: the contribution of each characteristic index to the clustering result is evaluated using the CRITIC weighting method, and the index weights of the energy-use characteristics are determined objectively. The objective weight of an index is measured jointly by the contrast intensity of the evaluation index and the conflict between indexes. The contrast intensity characterizes the variation of an evaluation index: the larger its mean square deviation (standard deviation), the more information the index contains. The conflict characterizes the correlation between different indexes: the larger the correlation coefficient of two indexes, the stronger their correlation and the lower the corresponding conflict.

6. The user energy consumption portrait analysis method based on multivariate data driving according to claim 5, wherein the CRITIC weighting method comprises the following specific steps:

1) Index normalization: given m evaluation objects and n evaluation indexes, the indexes are normalized using forward or reverse normalization, since different indexes act on the final evaluation result in different directions;

2) Calculating the correlation coefficients of the evaluation index matrix: the correlation coefficient describes the conflict between indexes; the stronger the positive correlation between two indexes, the smaller their conflict and the lower the weight;

3) Calculating weights: the contrast intensity and the conflict of each evaluation index are calculated using the obtained correlation coefficient matrix.
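The three steps above can be sketched in Python as follows. The formulas of this claim are not reproduced in the text, so the sketch assumes the commonly used CRITIC form: min-max normalization, contrast intensity as the standard deviation of each normalized index, conflict as the sum of (1 - r) over all indexes, and weights proportional to their product. The function names and the sample data are hypothetical.

```python
import math
from statistics import mean, pstdev


def normalise(column, benefit=True):
    """Min-max scaling; reverse (cost-type) indexes are flipped.
    Assumes the column is not constant."""
    lo, hi = min(column), max(column)
    return [((x - lo) if benefit else (hi - x)) / (hi - lo) for x in column]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def critic_weights(columns, benefit_flags):
    """columns: one list per index, each holding the values of the m objects."""
    z = [normalise(c, b) for c, b in zip(columns, benefit_flags)]
    info = []
    for cj in z:
        contrast = pstdev(cj)                             # contrast intensity
        conflict = sum(1 - pearson(cj, ck) for ck in z)   # conflict with other indexes
        info.append(contrast * conflict)
    total = sum(info)
    return [c / total for c in info]


# Hypothetical data: five users, three indexes; the second duplicates the first.
columns = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10],
    [5, 1, 4, 2, 3],
]
weights = critic_weights(columns, benefit_flags=[True, True, True])
print([round(w, 3) for w in weights])
```

In this example the two perfectly correlated indexes share a lower weight each, while the index that conflicts with both receives the largest weight, which matches the behaviour described in step 2.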

7. The user energy consumption portrait analysis method based on multivariate data driving as claimed in claim 1, wherein the improved AP clustering algorithm further includes the following:

1) The discrete state quantities of the load curve and the energy-use characteristic indexes are selected to reduce the dimension of the load curve, which speeds up calculation of the AP clustering similarity matrix, and the bias parameter is adjusted to improve clustering efficiency:

$s(i,j) = -\left[\alpha d_{d,ij} + (1-\alpha) d_{t,ij}\right], \quad i \neq j$ (18)

wherein $d_{d,ij}$ and $d_{t,ij}$ are the distances between load curve i and load curve j in terms of the discrete state feature d obtained from the SAX calculation and the energy-use characteristic index t, respectively, both expressed as Euclidean distances; $\alpha$ is the feature weight coefficient;

2) Improvement of the bias parameter: the element $s(i,i)$ on the main diagonal of the similarity matrix is the bias parameter, and its value is related to the number of clusters obtained;

the clustering effect evaluation (DB) index is used as the criterion for bias parameter selection and convergence of the AP clustering algorithm, as shown in the formula:

$s(i,i) = p_m + \delta \, DB_{min}$ (20)

wherein $p_m$ is the median of all elements off the main diagonal and serves as the initial value; $DB_{min}$ is the minimum DB value obtained so far by the algorithm; $\delta$ is the search threshold, with $\delta > 0$ to search forward and $\delta < 0$ otherwise. The DB index is calculated as shown in (21):

wherein n is the number of clusters; $W_i$ is the average distance from the data points in class i to the cluster center $C_i$; $C_{ij}$ is the distance between cluster centers i and j.
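The similarity of equation (18) and the bias parameter update of equation (20) can be sketched in Python as below. Because formula (21) is not reproduced in the text, the DB index here is assumed to take the standard Davies-Bouldin form, the mean over clusters of the largest ratio (W_i + W_j) / C_ij, which agrees with the symbol definitions given. All function names and sample values are hypothetical, and the AP message-passing iteration itself is omitted.

```python
import math
from statistics import median


def similarity_matrix(discrete, features, alpha):
    """Off-diagonal s(i,j) = -[alpha * d_d + (1 - alpha) * d_t], equation (18)."""
    n = len(discrete)
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d_d = math.dist(discrete[i], discrete[j])   # SAX state distance
                d_t = math.dist(features[i], features[j])   # index distance
                s[i][j] = -(alpha * d_d + (1 - alpha) * d_t)
    return s


def set_bias(s, delta, db_min):
    """Fill the diagonal with s(i,i) = p_m + delta * DB_min, equation (20)."""
    n = len(s)
    p_m = median(s[i][j] for i in range(n) for j in range(n) if i != j)
    for i in range(n):
        s[i][i] = p_m + delta * db_min
    return s


def davies_bouldin(points, labels, centers):
    """Assumed standard form of the DB index (lower is better)."""
    k = len(centers)
    w = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        w.append(sum(math.dist(p, centers[c]) for p in members) / len(members))
    worst = [
        max((w[i] + w[j]) / math.dist(centers[i], centers[j])
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return sum(worst) / k


# Hypothetical two-cluster example.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
labels = [0, 0, 1, 1]
centers = [(0, 1), (10, 1)]
db = davies_bouldin(points, labels, centers)

s = similarity_matrix(discrete=[(0,), (3,)], features=[(0, 0), (0, 4)], alpha=0.5)
s = set_bias(s, delta=0.5, db_min=db)
print(db, s)
```

In a full search loop, AP clustering would be rerun for each candidate bias value and the run with the smallest DB index retained.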

8. A non-volatile storage medium, characterized in that the non-volatile storage medium comprises a stored program, wherein the program, when run, controls a device in which the non-volatile storage medium is located to perform the method of any one of claims 1 to 7.

9. An electronic device comprising a processor and a memory, wherein the memory stores computer-readable instructions, the processor is configured to execute the computer-readable instructions, and the computer-readable instructions, when executed, perform the method of any one of claims 1 to 7.

10. A user energy consumption portrait analysis device based on multivariate data driving, characterized in that the device comprises the following modules: