CN113902304A

CN113902304A - Controllable load screening method based on total load curve similarity

Info

Publication number: CN113902304A
Application number: CN202111187046.5A
Authority: CN
Inventors: 任守东; 于博; 贺欢; 高洋; 李正林; 何耀明; 贾依霖; 左超; 尚尔震; 张秀宇; 祝国强
Original assignee: Anshan Power Supply Co Of State Grid Liaoning Electric Power Co; State Grid Corp of China SGCC; Northeast Dianli University
Current assignee: Anshan Power Supply Co Of State Grid Liaoning Electric Power Co; State Grid Corp of China SGCC; Northeast Electric Power University
Priority date: 2021-10-12
Filing date: 2021-10-12
Publication date: 2022-01-07

Abstract

The invention discloses a controllable load screening method based on total load curve similarity, comprising the following steps: collecting electricity load data of power consumers; calculating the Euclidean distances and curve slope distances between the load data of different users; constructing a similarity matrix; clustering the power users with a spectral clustering algorithm and extracting a typical load curve for each cluster; calculating a similarity index between each typical load curve and the total load curve of the system; determining, based on the similarity index, which clusters of users have controllable potential; collecting information on the factors that influence the electricity-use behavior of those users, by questionnaire survey and similar methods; and setting the controllable load capacity ratio of each cluster with reference to the similarity index and those factors. The beneficial effects of the invention are that the screening method can identify users with controllable potential among power load users and can allocate the controllable load capacity ratio according to the screening result.

Description

Controllable load screening method based on total load curve similarity

Technical Field

The invention relates to the technical field of grid-load interaction under demand response, and in particular to a controllable load screening method based on total load curve similarity.

Background

With the development of the electricity market and demand response technology, and under the pressure of achieving carbon peaking and carbon neutrality, there is an urgent need to improve the efficiency of grid-load interaction, reduce generation losses, and ensure the safe and stable operation of the power system. This new situation places higher demands on the formulation of grid-load interaction strategies. An efficient grid-load interaction strategy generally relies on deep awareness of load-side information, which in turn requires accurate classification of power load users and screening of controllable ones. In recent years, the development of the smart grid and the widespread deployment of smart meters have provided data support for understanding load-side characteristics and identifying user behavior patterns, while advances in computing techniques such as data mining, intelligent algorithms and machine learning have provided practical methods for analyzing the behavior characteristics of power load users. Together these make it possible to apply data mining to users' historical operating data, derive some of the behavior characteristics of power load users from an in-depth reading of historical data and user information, and provide a reference for formulating grid-load interaction strategies.

Identifying the electricity consumption patterns of power load users by clustering load curves is a common approach. Traditional clustering algorithms based on the Euclidean distance are simple in principle and easy to implement, and they can achieve good results when applied to load curve clustering. Their drawback is that they do not measure the similarity of curve shape and trend, so curves are easily misassigned in some cases and a user's consumption pattern is identified incorrectly. In addition, the analysis of user behavior characteristics after load curve clustering is often only a qualitative evaluation, and the controllable load ratio is set simply by experience, without convincing quantitative analysis. Such judgments do not mine the clustering results sufficiently and cannot effectively support subsequent work such as the formulation of grid-load interaction strategies.

Disclosure of Invention

To overcome the shortcomings described in the background, the invention provides a clustering method that takes into account a similarity measure of load curve shape and trend, thereby addressing the inaccurate classification of prior-art clustering algorithms based on the Euclidean distance alone. It also provides a method for screening controllable power load users based on the similarity between users' typical load characteristic curves and the total load curve of the system, so that the clustering results can be analyzed in depth and the controllable load information contained in the load curves can be extracted.

In order to achieve the purpose, the invention adopts the following technical scheme:

a controllable load screening method based on total load curve similarity comprises the following steps:

Step 1: collecting power load data of power consumers;

Step 2: calculating the Euclidean distances and curve slope distances between the electric load data of different users;

Step 3: constructing a similarity matrix between the electric load data of the users;

Step 4: clustering the power users using a spectral clustering algorithm based on the similarity matrix, and extracting a typical load characteristic curve for each cluster of users;

Step 5: calculating a similarity index between each typical load characteristic curve and the total load curve of the system;

Step 6: determining, based on the similarity index, which clusters of users have controllable potential;

Step 7: collecting information on the factors influencing the electricity-use behavior of power load users with controllable potential, using questionnaire surveys and similar methods;

Step 8: setting the controllable load capacity ratio of each cluster with reference to the similarity index and the factors influencing users' electricity-use behavior.

Further, in step 2, the Euclidean distance is calculated as:

$$a_{i,j} = \sqrt{\sum_{l=1}^{m} \left(x_{i,l} - x_{j,l}\right)^2}$$

where the data set X is the collected set of m-dimensional electricity-use feature vectors of n power consumers; $X_i = \{(x_{i,1}, t_1), \ldots, (x_{i,l}, t_l), \ldots, (x_{i,m}, t_m)\}$ $(i = 1, 2, \ldots, n)$ is the time series of an individual consumer's load data; $a_{i,j}$ is the true distance between any two m-dimensional row vectors $X_i$ and $X_j$ in the n × m-dimensional space; and A is the similarity matrix based on the Euclidean distance that these distances form.
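As an illustration of the Euclidean distance described above, the following is a minimal Python sketch that builds the pairwise distance matrix for a handful of load curves. The function name `euclidean_distance_matrix` and the toy values are assumptions made for this example and do not come from the patent text.

```python
import math

def euclidean_distance_matrix(curves):
    """Return the n x n matrix of Euclidean distances between load curves.

    `curves` is a list of n equal-length lists, one m-point load curve per user.
    """
    n = len(curves)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(curves[i], curves[j])))
            dist[i][j] = dist[j][i] = d  # the matrix is symmetric
    return dist

# Three toy users with four sampling points each (illustrative values only).
toy_curves = [
    [0.0, 0.0, 0.0, 0.0],
    [3.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
A_raw = euclidean_distance_matrix(toy_curves)
```

Here `A_raw[0][1]` is 5.0 and `A_raw[0][2]` is 0.0, since users 0 and 2 have identical curves.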

Further, in step 2, the curve slope distance is calculated as follows.

The piecewise linear representation of the time series $X_i = \{(x_{i,1}, t_1), \ldots, (x_{i,l}, t_l), \ldots, (x_{i,m}, t_m)\}$ is defined as:

$$X_i^* = \{(x_{i,1}, x_{i,2}, t_2), \ldots, (x_{i,l-1}, x_{i,l}, t_l), \ldots, (x_{i,m-1}, x_{i,m}, t_m)\}, \quad (i = 1, 2, \ldots, n)$$

where $x_{i,l-1}$ and $x_{i,l}$ $(l = 2, 3, \ldots, m)$ are respectively the start value and the end value of the l-th straight line in the i-th time series, $t_l$ is the time at which the l-th line ends, and m − 1 is the number of straight-line segments into which the time series is divided.

It is expressed in slope form as:

$$X'_i = \{(k_{i,1}, t_{i,2}), \ldots, (k_{i,l-1}, t_{i,l}), \ldots, (k_{i,m-1}, t_{i,m})\}$$

where $k_{i,l-1}$ is the slope of each line segment.

The curve slope distance between the feature vectors of the users is calculated by the following formula:

where $t_l$ denotes the end time of the l-th segment, and the factor $(t_l - t_{l-1})$ acts as a weight: the longer a segment lasts, the more weight it carries; and

$$X'_i = \{(k_{i,1}, t_{i,2}), \ldots, (k_{i,l-1}, t_{i,l}), \ldots, (k_{i,m-1}, t_{i,m})\}$$

$$X'_j = \{(k_{j,1}, t_{j,2}), \ldots, (k_{j,l-1}, t_{j,l}), \ldots, (k_{j,m-1}, t_{j,m})\}$$

are the slope-form time series of two row vectors in the data set X.
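The slope-form representation can be sketched in Python as below. The conversion to slopes follows the definition of a segment slope. The distance function, however, is only an assumption: the text states that each segment is weighted by its duration $(t_l - t_{l-1})$ but the exact formula is not available here, so a duration-weighted mean of absolute slope differences is used as a stand-in. The names `to_slopes` and `slope_distance` are hypothetical.

```python
def to_slopes(values, times):
    """Slopes of the m-1 straight segments joining consecutive samples."""
    return [
        (values[l] - values[l - 1]) / (times[l] - times[l - 1])
        for l in range(1, len(values))
    ]

def slope_distance(values_i, values_j, times):
    """Duration-weighted slope difference between two curves.

    ASSUMPTION: the aggregation (weighted mean of absolute slope
    differences) is illustrative; the source only says that each segment
    is weighted by its duration (t_l - t_{l-1}).
    """
    k_i = to_slopes(values_i, times)
    k_j = to_slopes(values_j, times)
    durations = [times[l] - times[l - 1] for l in range(1, len(times))]
    total = sum(durations)
    return sum(w * abs(a - b) for w, a, b in zip(durations, k_i, k_j)) / total

times = [0.0, 0.5, 1.0, 2.0]        # hours; the last segment is twice as long
rising = [1.0, 2.0, 3.0, 5.0]       # slope +2 on every segment
falling = [5.0, 4.0, 3.0, 1.0]      # slope -2 on every segment
shifted = [11.0, 12.0, 13.0, 15.0]  # same shape as `rising`, higher level

d_opposite = slope_distance(rising, falling, times)
d_same_shape = slope_distance(rising, shifted, times)
```

In this toy case `d_same_shape` is 0.0 even though the two curves are far apart in Euclidean terms, which is the kind of shape and trend information a purely Euclidean measure does not capture.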

The resulting curve slope distance matrix is normalized to obtain the similarity matrix D based on the curve slope distance; the formula is as follows:

Further, in step 3, the similarity matrix between the users' electric load data is constructed as:

$$P = \alpha A + \beta D, \qquad \alpha + \beta = 1$$

where the matrices A and D are obtained by extremum normalization of the similarity matrices based on the Euclidean distance and the curve slope distance from step 2, α and β are the weight coefficients of the two similarity matrices, and P is the final similarity matrix.
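The weighted combination can be sketched as follows. This example assumes that "extremum normalization" means min-max scaling over all entries of a matrix, and the default `alpha=0.5` is an arbitrary illustrative choice; neither detail is specified in the text, and the function names are hypothetical.

```python
def min_max_normalize(matrix):
    """Extremum (min-max) normalization of all entries to the range [0, 1].

    ASSUMPTION: the scaling is applied over the whole matrix at once.
    """
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # constant matrix: avoid division by zero
        return [[0.0 for _ in row] for row in matrix]
    return [[(v - lo) / (hi - lo) for v in row] for row in matrix]

def combine_similarity(euclid, slope, alpha=0.5):
    """Return P = alpha * A + beta * D with beta = 1 - alpha."""
    beta = 1.0 - alpha
    A = min_max_normalize(euclid)
    D = min_max_normalize(slope)
    return [
        [alpha * a + beta * d for a, d in zip(row_a, row_d)]
        for row_a, row_d in zip(A, D)
    ]

# Two users; raw distance matrices with illustrative values only.
euclid_raw = [[0.0, 2.0], [2.0, 0.0]]
slope_raw = [[0.0, 4.0], [4.0, 0.0]]
P = combine_similarity(euclid_raw, slope_raw, alpha=0.5)
```

Because both inputs are rescaled to [0, 1] before weighting, neither distance dominates the combination merely because of its units.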

Further, the step 4 specifically includes:

The elements of each row of the similarity matrix $[P_{i,j}]_{m \times m}$ are sorted in descending order to obtain the matrix $[P'_{i,j}]_{m \times m}$. The differences between adjacent elements in each column, $E_{k,j} = P'_{k+1,j} - P'_{k,j}$, are calculated to obtain the matrix $[E_{k,j}]_{(m-1) \times m}$. The maximum value of each column of the matrix E is then found; the maximum membership degree between a data point and its adjacent data points is ε.

The Gaussian kernel width parameter γ is determined by the following formula:

The Gaussian kernel function matrix of the similarity matrix is then constructed, calculated as follows:

where $sim(x_i, x_j)$ is the element $P_{i,j}$ of the similarity matrix P.

The Laplacian matrix of the similarity matrix is calculated, together with its eigenvalues $\lambda_1, \ldots, \lambda_k, \ldots, \lambda_n$ and eigenvectors $x_1, \ldots, x_k, \ldots, x_n$. The number of eigenvalues greater than 1 is taken as the optimal number of clusters k, and the number of elements in each cluster can be approximated from the associated eigenvalues.

The eigenvectors corresponding to the k largest eigenvalues are obtained, and K-means clustering is applied to the feature matrix formed by the selected eigenvectors, with the optimal number of clusters given by the number of eigenvalues greater than 1. This yields the final clustering result $C = (c_1, c_2, \ldots, c_k)$.
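The final K-means stage can be illustrated with the plain-Python sketch below. It covers only the clustering of the rows of an already computed feature matrix; the Laplacian and its eigen-decomposition are not shown. The function name `kmeans`, the deterministic initialisation from the first k rows, and the toy embedding are assumptions made for this example.

```python
def kmeans(points, k, max_iter=100):
    """Plain Lloyd's K-means; returns one cluster label per point.

    ASSUMPTION: centroids start at the first k points so that the sketch
    is deterministic; practical implementations usually randomise this.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Rows stand in for the spectral embedding of four users (illustrative values).
embedding = [[0.0, 0.0], [0.0, 0.1], [5.0, 5.0], [5.0, 5.1]]
labels = kmeans(embedding, k=2)
```

With this toy embedding the first two users end up in one cluster and the last two in the other.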

The mean of all user load curves in each cluster is calculated and taken as the typical load characteristic curve of that cluster:

$$z_i = \frac{1}{r} \sum_{j=1}^{r} c_{i,j}$$

where $c_{i,j}$ is the load curve of the j-th user in the i-th cluster and r is the number of users in that cluster.

The total load curve y of the system is normalized using extremum normalization.
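A short Python sketch of these two operations follows: the typical curve as the point-wise mean of a cluster's curves, and a min-max rescaling of the total load curve. Interpreting "extremum normalization" as min-max scaling is an assumption, and the function names and values are illustrative only.

```python
def typical_curve(cluster_curves):
    """Point-wise mean of the r load curves in one cluster."""
    r = len(cluster_curves)
    return [sum(samples) / r for samples in zip(*cluster_curves)]

def min_max_normalize_curve(curve):
    """Extremum (min-max) normalization of a single curve to [0, 1]."""
    lo, hi = min(curve), max(curve)
    if hi == lo:  # flat curve: avoid division by zero
        return [0.0 for _ in curve]
    return [(v - lo) / (hi - lo) for v in curve]

# One cluster with two users and three sampling points (toy values).
cluster = [
    [1.0, 2.0, 3.0],
    [3.0, 4.0, 5.0],
]
z = typical_curve(cluster)                          # [2.0, 3.0, 4.0]
y = min_max_normalize_curve([100.0, 150.0, 200.0])  # [0.0, 0.5, 1.0]
```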

Further, in step 5, the similarity index between the typical load characteristic curve and the total load curve of the system is calculated, and is defined as:

where z is the typical load characteristic curve of each cluster, y is the total load curve of the system, and $g_{zy}$ takes values in the range [-1, 1].
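Since the defining formula is not available in this text, the sketch below uses the Pearson correlation coefficient as a stand-in for $g_{zy}$. This is an assumption: it is chosen only because it also ranges over [-1, 1] and distinguishes positive from negative co-movement, and it may differ from the index actually defined. The function name `similarity_index` is hypothetical.

```python
import math

def similarity_index(z, y):
    """Pearson correlation between a typical curve z and the total load curve y.

    ASSUMPTION: Pearson correlation is a stand-in for the index g_zy,
    whose formula is not given in the available text.
    """
    n = len(z)
    mean_z = sum(z) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_z) * (b - mean_y) for a, b in zip(z, y))
    var_z = sum((a - mean_z) ** 2 for a in z)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_z == 0 or var_y == 0:
        raise ValueError("a flat curve has no defined correlation")
    return cov / math.sqrt(var_z * var_y)

total_load = [1.0, 2.0, 3.0, 4.0]
g_following = similarity_index([2.0, 4.0, 6.0, 8.0], total_load)  # tracks the system
g_opposing = similarity_index([4.0, 3.0, 2.0, 1.0], total_load)   # moves against it
```

A cluster whose typical curve rises and falls with the system load scores close to 1, while one that peaks when the system is in a valley scores close to -1.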

Further, the step 6 specifically includes:

1) When $g_{zy}$ lies in [-1, 0], the load trend of the users in the cluster is negatively correlated with the load trend of the whole system. If the users in this cluster were treated as controllable load, the control result would not contribute positively to grid control targets such as peak clipping and valley filling, so no controllable capacity is set for these users in demand response;

2) When the similarity index $g_{zy}$ between the typical load characteristic curve and the total load curve of the system lies in [0, 1], the load trend of the users in the cluster is positively correlated with the load trend of the whole system. If the users in this cluster are treated as controllable load, the control result contributes positively to grid control targets such as peak clipping and valley filling, so these users can be identified as power load users with controllable potential.
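The screening rule can be written as a small filter, sketched below. Both intervals in the text include 0, so the handling of the boundary is an assumption: here a cluster with $g_{zy} = 0$ is treated as not controllable. The cluster identifiers and index values are invented for illustration.

```python
def screen_controllable(similarity_by_cluster):
    """Return the ids of clusters whose similarity index marks controllable potential.

    ASSUMPTION: the boundary value g_zy = 0 is treated as not controllable,
    because the text assigns 0 to both intervals.
    """
    return [cid for cid, g in similarity_by_cluster.items() if g > 0]

g_values = {"c1": 0.93, "c2": -0.41, "c3": 0.35}  # illustrative values only
controllable = screen_controllable(g_values)
```

Only clusters `c1` and `c3` pass on to the survey stage in this example.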

Further, step 7 specifically comprises: using questionnaire surveys and similar methods to collect information on the factors influencing the electricity-use behavior of power load users with controllable potential, including: users' willingness to respond, user behavior preferences, dwelling floor area, information on users' electrical equipment, occupation and income of household users, and business hours and foot-traffic density changes of commercial users.

Further, the step 8 specifically includes:

1) The controllable load capacity ratio of each cluster is set with reference to the similarity index and the factors influencing users' electricity-use behavior. For a user group whose similarity index $g_{zy}$ between the typical load characteristic curve and the total load curve of the system is greater than or equal to a set value a, whose willingness to respond is strong, whose behavior preferences are relatively fixed, whose electrical equipment capacity is large, and whose controllability is strong, the controllable load capacity is set to 10%-20% of the total load of the user group;

2) For a user group whose similarity index $g_{zy}$ between the typical load curve and the total load curve of the system is greater than 0 but less than the set value a, whose willingness to respond is weaker, whose behavior preferences fluctuate widely, whose electrical equipment capacity is small, and whose controllability is weak, the controllable load capacity is set to 0%-10% of the total load of the user group.
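The two capacity bands can be expressed as a simple lookup, sketched below. This is a simplification: it uses only the similarity index and the threshold a, whereas the text also weighs survey factors such as willingness to respond and equipment capacity. The function name `controllable_capacity_band` is hypothetical.

```python
def controllable_capacity_band(g_zy, a):
    """Return the (low, high) share of a cluster's total load set as controllable.

    ASSUMPTION: only the similarity index is considered; the survey factors
    described in the text are left out of this sketch.
    """
    if g_zy >= a:
        return (0.10, 0.20)
    if 0 < g_zy < a:
        return (0.0, 0.10)
    return None  # no controllable capacity is set for this cluster

band_high = controllable_capacity_band(0.95, a=0.9)
band_low = controllable_capacity_band(0.40, a=0.9)
band_none = controllable_capacity_band(-0.30, a=0.9)
```

The function returns a band, not a single ratio, because choosing a value within the band depends on the behavior factors collected in step 7.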

Compared with the prior art, the invention has the beneficial effects that:

the invention provides a clustering method considering load curve form and trend similarity measurement, which aims to solve the defect of inaccurate classification of a clustering algorithm purely based on Euclidean distance in the prior art, and provides a controllable power load user screening method based on the similarity of a typical load characteristic curve of a user and a total load curve of a system, so that the clustering result can be deeply analyzed, and controllable load information contained in the load curve can be extracted.

The screening method provided by the invention can screen users with controllable potential in the power load users, and can distribute the controllable load capacity proportion according to the screening result.

Drawings

FIG. 1 is a block diagram of the general concept of the method of the present invention;

FIG. 2 is a block diagram of the spectral clustering algorithm of the present invention.

Detailed Description

The following detailed description of the present invention will be made with reference to the accompanying drawings.

As shown in FIG. 1, the controllable load screening method based on total load curve similarity comprises the following steps.

Step 1: collecting power load data of power consumers.

Historical electricity load data of power consumers are collected. The active load power at 48 points per day (one sampling point every 0.5 h) over a period of time is taken as each consumer's feature vector. The data are preprocessed: values that differ greatly from the data as a whole are removed, and removed or missing values are filled with the mean of the data at the same time of day over all collection days in the data collection period. The preprocessed data set is denoted $X^*$.

Taking the user's maximum daily load in the preprocessed data as the reference value, the data set $X^*$ is normalized; the formula is as follows:
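The preprocessing of Step 1 can be sketched in Python as follows. Several details are assumptions: missing or rejected samples are marked with `None`, outlier detection itself is not shown, and each daily curve is divided by its own maximum, which is one reading of "maximum daily load". The function names are hypothetical, and the example uses four time slots per day instead of 48 to stay short.

```python
def fill_missing_with_slot_mean(days):
    """Replace None entries by the mean of the same time slot over all days."""
    n_slots = len(days[0])
    slot_means = []
    for s in range(n_slots):
        known = [day[s] for day in days if day[s] is not None]
        slot_means.append(sum(known) / len(known) if known else None)
    return [
        [slot_means[s] if day[s] is None else day[s] for s in range(n_slots)]
        for day in days
    ]

def normalize_by_daily_max(day):
    """Divide one daily curve by its maximum load (ASSUMPTION: per-day peak)."""
    peak = max(day)
    return [v / peak for v in day]

# Three collection days for one user; None marks a rejected or lost sample.
raw_days = [
    [1.0, 2.0, None, 4.0],
    [1.0, 4.0, 3.0, 4.0],
    [1.0, 3.0, 5.0, 8.0],
]
filled = fill_missing_with_slot_mean(raw_days)
normalized = [normalize_by_daily_max(day) for day in filled]
```

The missing third sample of the first day is filled with 4.0, the mean of 3.0 and 5.0 from the other two days, and every normalized curve then peaks at 1.0.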

Step 2:

The Euclidean distance is calculated as:

$$a_{i,j} = \sqrt{\sum_{l=1}^{m} \left(x_{i,l} - x_{j,l}\right)^2}$$

where the data set X is the collected set of m-dimensional electricity-use feature vectors of n power consumers; $X_i = \{(x_{i,1}, t_1), \ldots, (x_{i,l}, t_l), \ldots, (x_{i,m}, t_m)\}$ $(i = 1, 2, \ldots, n)$ is the time series of an individual consumer's load data; $a_{i,j}$ is the true distance between any two m-dimensional row vectors $X_i$ and $X_j$ in the n × m-dimensional space; and A is the similarity matrix based on the Euclidean distance that these distances form.

The curve slope distance is calculated as follows, where m − 1 denotes the number of straight-line segments into which each time series is divided.

Each time series is expressed in slope form as:

$$X'_i = \{(k_{i,1}, t_{i,2}), \ldots, (k_{i,l-1}, t_{i,l}), \ldots, (k_{i,m-1}, t_{i,m})\}$$

where $k_{i,l-1}$ is the slope of each line segment, and

$$X'_i = \{(k_{i,1}, t_{i,2}), \ldots, (k_{i,l-1}, t_{i,l}), \ldots, (k_{i,m-1}, t_{i,m})\}$$

$$X'_j = \{(k_{j,1}, t_{j,2}), \ldots, (k_{j,l-1}, t_{j,l}), \ldots, (k_{j,m-1}, t_{j,m})\}$$

are the slope-form time series of two row vectors in the data set X.

Step 3:

The similarity matrix between the users' electric load data is constructed as:

$$P = \alpha A + \beta D, \qquad \alpha + \beta = 1$$

Step 4:

The power users are clustered using a spectral clustering algorithm, and a typical load curve is extracted for each cluster; the specific steps are shown in FIG. 2.

Step 5:

The similarity index between the typical load characteristic curve and the total load curve of the system is calculated, and is defined as:

where z is the typical load characteristic curve of each cluster, y is the total load curve of the system, and $g_{zy}$ takes values in the range [-1, 1].

Step 6:

Step 7:

Questionnaire surveys and similar methods are used to collect information on the factors influencing the electricity-use behavior of power load users with controllable potential, including: users' willingness to respond, user behavior preferences, dwelling floor area, information on users' electrical equipment, occupation and income of household users, business hours and foot-traffic density changes of commercial users, and the like.

Step 8:

1) The controllable load capacity ratio of each cluster is set with reference to the similarity index and the factors influencing users' electricity-use behavior. For a user group whose similarity index $g_{zy}$ between the typical load characteristic curve and the total load curve of the system is large (greater than the set value a, with a = 0.9), whose willingness to respond is strong, whose behavior preferences are fixed, whose electrical equipment capacity is large, and whose controllability is strong, the controllable load capacity is set to a high proportion of the total load of the user group (10%-20%).

2) The setting of controllable load capacity must not go against users' willingness to respond, must keep changes to user behavior preferences within an acceptable range, and must affect users' satisfaction with the power supply as little as possible. Therefore, for a user group whose similarity index $g_{zy}$ between the typical load curve and the total load curve of the system is small (greater than 0 and less than the set value a, with a = 0.9), whose willingness to respond is weak, whose behavior preferences fluctuate widely, whose electrical equipment capacity is small, and whose controllability is weak, the controllable load capacity is set to a small proportion of the total load of the user group, or no controllable load is set at all (0%-10%).

The above embodiments are implemented on the premise of the technical solution of the present invention, and detailed embodiments and specific operation procedures are given, but the scope of the present invention is not limited to the above embodiments. The methods used in the above examples are conventional methods unless otherwise specified.

Claims

1. A controllable load screening method based on total load curve similarity is characterized by comprising the following steps:

step 1: collecting power load data of a power consumer;

2. The controllable load screening method based on total load curve similarity according to claim 1, wherein in step 2 the Euclidean distance is calculated as:

$$a_{i,j} = \sqrt{\sum_{l=1}^{m} \left(x_{i,l} - x_{j,l}\right)^2}$$

where the data set X is the collected set of m-dimensional electricity-use feature vectors of n power consumers; $X_i = \{(x_{i,1}, t_{i,1}), \ldots, (x_{i,l}, t_{i,l}), \ldots, (x_{i,m}, t_{i,m})\}$ $(i = 1, 2, \ldots, n)$ is the time series of an individual consumer's load data; $a_{i,j}$ is the true distance between any two m-dimensional row vectors $X_i$ and $X_j$ in the n × m-dimensional space; and A is the similarity matrix based on the Euclidean distance that these distances form.

3. The controllable load screening method based on total load curve similarity according to claim 1, wherein in step 2 the curve slope distance is calculated as follows:

the piecewise linear representation of the time series $X_i = \{(x_{i,1}, t_1), \ldots, (x_{i,l}, t_l), \ldots, (x_{i,m}, t_m)\}$ is defined as:

$$X_i^* = \{(x_{i,1}, x_{i,2}, t_2), \ldots, (x_{i,l-1}, x_{i,l}, t_l), \ldots, (x_{i,m-1}, x_{i,m}, t_m)\}, \quad (i = 1, 2, \ldots, n);$$

where $x_{i,l-1}$ and $x_{i,l}$ $(l = 2, 3, \ldots, m)$ are respectively the start value and the end value of the l-th straight line in the i-th time series, $t_l$ is the time at which the l-th line ends, and m − 1 is the number of straight-line segments into which the time series is divided;

it is expressed in slope form as:

$$X'_i = \{(k_{i,1}, t_2), \ldots, (k_{i,l-1}, t_l), \ldots, (k_{i,m-1}, t_m)\}$$

where $k_{i,l-1}$ is the slope of each line segment; and

$$X'_i = \{(k_{i,1}, t_{i,2}), \ldots, (k_{i,l-1}, t_{i,l}), \ldots, (k_{i,m-1}, t_{i,m})\}$$

$$X'_j = \{(k_{j,1}, t_{j,2}), \ldots, (k_{j,l-1}, t_{j,l}), \ldots, (k_{j,m-1}, t_{j,m})\}$$

are the slope-form time series of two row vectors in the data set X;

4. The controllable load screening method based on total load curve similarity according to claim 1, wherein in step 3 the similarity matrix between the users' electric load data is constructed as:

$$P = \alpha A + \beta D, \qquad \alpha + \beta = 1$$

5. The controllable load screening method based on total load curve similarity according to claim 1, wherein the step 4 specifically comprises:

the elements of each row of the similarity matrix $[P_{i,j}]_{m \times m}$ are sorted in descending order to obtain the matrix $[P'_{i,j}]_{m \times m}$; the differences between adjacent elements in each column, $E_{k,j} = P'_{k+1,j} - P'_{k,j}$, are calculated to obtain the matrix $[E_{k,j}]_{(m-1) \times m}$; the maximum value of each column of the matrix E is found, the maximum membership degree between a data point and its adjacent data points being ε;

where $sim(x_i, x_j)$ is the element $P_{i,j}$ of the similarity matrix P;

the eigenvectors corresponding to the k largest eigenvalues are obtained; K-means clustering is applied to the feature matrix formed by the selected eigenvectors, with the optimal number of clusters given by the number of eigenvalues greater than 1; the final clustering result $C = (c_1, c_2, \ldots, c_k)$ is obtained;

where $c_{i,j}$ is the load curve of the j-th user in the i-th cluster and r is the number of users in each cluster.

6. The controllable load screening method based on total load curve similarity as claimed in claim 1, wherein in step 5, the similarity index between the typical load characteristic curve and the system total load curve is calculated and defined as:

7. The controllable load screening method based on total load curve similarity according to claim 1, wherein the step 6 specifically comprises:

8. The controllable load screening method based on total load curve similarity according to claim 1, wherein step 7 specifically comprises: using questionnaire surveys and similar methods to collect information on the factors influencing the electricity-use behavior of power load users with controllable potential, including: users' willingness to respond, user behavior preferences, dwelling floor area, information on users' electrical equipment, occupation and income of household users, and business hours and foot-traffic density changes of commercial users.

9. The controllable load screening method based on total load curve similarity according to claim 1, wherein the step 8 specifically comprises:

1) the controllable load capacity ratio of each cluster is set with reference to the similarity index and the factors influencing users' electricity-use behavior; for a user group whose similarity index $g_{zy}$ between the typical load characteristic curve and the total load curve of the system is greater than or equal to a set value a, whose willingness to respond is strong, whose behavior preferences are relatively fixed, whose electrical equipment capacity is large, and whose controllability is strong, the controllable load capacity is set to 10%-20% of the total load of the user group;

2) for a user group whose similarity index $g_{zy}$ between the typical load curve and the total load curve of the system is greater than 0 and less than the set value a, whose willingness to respond is weaker, whose behavior preferences fluctuate widely, whose electrical equipment capacity is small, and whose controllability is weak, the controllable load capacity is set to 0%-10% of the total load of the user group.