CN114429172A

CN114429172A - Load clustering method, device, equipment and medium based on transformer substation user constitution

Info

Publication number: CN114429172A
Application number: CN202111488944.4A
Authority: CN
Inventors: 霍天跃; 马晓东; 杨威; 唐萁; 周选选; 袁澍阳
Original assignee: State Grid Corp of China SGCC; State Grid Beijing Electric Power Co Ltd
Current assignee: State Grid Corp of China SGCC; State Grid Beijing Electric Power Co Ltd
Priority date: 2021-12-07
Filing date: 2021-12-07
Publication date: 2022-05-03

Abstract

The invention belongs to the technical field of load clustering of power systems, and particularly discloses a load clustering method, device, equipment and medium based on transformer substation user composition. The method comprises the following steps: acquiring the load data of the users subordinate to a substation and the load data of the substation itself as load data, and performing data preprocessing and data dimension reduction on both to obtain preprocessed user load data and preprocessed substation load data; selecting a clustering effectiveness evaluation index; clustering the preprocessed user load data by using a K-means algorithm improved based on the MaxMin principle; analyzing the clustering result of the subordinate users of the substation to obtain the constituent components of the substation; and clustering the preprocessed substation load data by using a K-means algorithm with automatically updated weights. Compared with the traditional clustering algorithm, the modified clustering algorithm has clear advantages in convergence speed and calculation accuracy, and it improves the reliability of clustering by taking multiple kinds of data objects into account.

Description

Load clustering method, device, equipment and medium based on transformer substation user constitution

Technical Field

The invention belongs to the technical field of load clustering of power systems, and particularly relates to a load clustering method, device, equipment and medium based on transformer substation user composition.

Background

The power system is a unified whole consisting of a power plant, a power transmission network, a transformer substation and a power load. The accuracy of the mathematical model of each element in the system is directly related to the reliability of the simulation analysis of the system. The model of the power load as an important component of the system is still relatively rough, which directly hinders further improvement of the simulation accuracy of the system.

The transformer substation load refers to the integration of all user loads of different voltage levels under the transformer substation. The load quantity of the power system is large, and accurate transformer substation load characteristic clustering can help to revise the common characteristics of the transformer substations and extract the power utilization mode and the power utilization characteristics of the transformer substations, so that the load modeling of the transformer substations is guided, and the overall modeling accuracy is improved.

However, the problem of transformer substation load clustering at the present stage mostly has two problems to be solved, wherein firstly, the result obtained by clustering is not matched with the actual load characteristic, and secondly, the traditional clustering algorithm has no pertinence, so that the clustering effect is relatively poor.

Disclosure of Invention

In order to overcome the defects of the prior art, the invention aims to provide a load clustering method, a device, equipment and a medium based on transformer substation user composition, so as to solve the problems that the result obtained by clustering is not matched with the actual load characteristic and the clustering effect is relatively poor during transformer substation load clustering.

In order to achieve the purpose, the invention adopts the following technical scheme:

in a first aspect, the invention provides a load clustering method based on transformer substation user composition, which comprises the following steps:

step 1: acquiring the load data of the users subordinate to a substation and the load data of the substation itself as load data, and performing data preprocessing and data dimension reduction on both to obtain preprocessed user load data and preprocessed substation load data;

step 2: selecting a clustering effectiveness evaluation index;

step 3: clustering the preprocessed user load data obtained in step 1 by using a K-means algorithm improved based on the MaxMin principle; determining the number of categories and obtaining a clustering result of the subordinate users of the substation; analyzing the clustering result of the subordinate users of the substation according to the clustering effectiveness evaluation index to obtain the constituent components of the substation;

step 4: clustering the upper-layer substation load data by using a K-means algorithm with automatically updated weights, according to the constituent components of the substation and the preprocessed substation load data, determining the number of categories, and obtaining a final clustering result.

Further, the load data of the users subordinate to the substation in step 1 is the daily load characteristic curve of the low-voltage terminal users subordinate to each substation; the curve is formed by real-time load data acquired at intervals of 15 min, so that the daily load characteristic curve has 96 points;

the substation load data is either the daily load curve of each substation load, which is formed by real-time load data acquired at intervals of 15 min and has 96 points, or the industry composition proportion of each substation.

Further, the data preprocessing of the subordinate user load data and the substation load data comprises:

step 121: removing users whose data loss exceeds thirty percent and removing power generation users, a user whose recorded active power is negative being regarded as a power generation user; then identifying load data that increase or decrease sharply in the daily load curves of the low-voltage terminal users under the substation and in the daily load curve of the substation load according to the following formula:

$$\rho = \left| \frac{p_d - p_{d-1}}{p_d} \right|$$

in the formula: $p_d$ is the data at a point in the load curve, $p_{d-1}$ is the data at the previous point in the load curve, and $\rho$ is the change rate of the load;

when the load change rate is larger than a preset threshold value, the data is considered to be abnormal data; the preset threshold value is 30%; abnormal data whose load change rate does not meet the requirement, as well as missing load data, are filled by using a smoothing formula,

wherein a represents the number of reference points taken forward, b represents the number of reference points taken backward, and the values of a and b are between 4 and 6;

step 122: performing data normalization on user load data and transformer substation load data of the subordinate transformer substations by adopting a maximum value normalization method;

for the j-th user under the i-th substation, the maximum value normalization of the user load data is expressed as:

$$x_{i,j} = \frac{p_{i,j}}{\max(p_{i,j})}$$

where $x_{i,j}$ is the normalized load curve vector of the j-th user under the i-th substation; $p_{i,j}$ is the data to be normalized in the daily load curve of the j-th user under the i-th substation; and $\max(p_{i,j})$ is the maximum value in the daily load curve of that low-voltage terminal user;

for the i-th substation, the maximum value normalization of the substation load data is expressed as:

$$X_i = \frac{P_i}{\max(P_i)}$$

where $X_i$ is the normalized value for the i-th substation; $P_i$ is the data to be normalized in the daily load curve of the i-th substation; and $\max(P_i)$ is the maximum value in the daily load curve of the substation load;

the data preprocessing is completed through steps 121 to 122.
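The preprocessing in steps 121 to 122 can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the function names are hypothetical, missing points are assumed to be encoded as NaN, and the smoothing fill is assumed to be the plain mean of up to a valid points before and b valid points after the abnormal point, since the smoothing formula itself is not given in the text.

```python
import numpy as np

def flag_abnormal(curve, threshold=0.30):
    """Flag missing points and points whose change rate
    rho = |(p_d - p_{d-1}) / p_d| exceeds the threshold (30% in the text)."""
    p = np.asarray(curve, dtype=float)
    abnormal = np.isnan(p)                      # assumption: missing data are NaN
    prev, cur = p[:-1], p[1:]
    with np.errstate(divide="ignore", invalid="ignore"):
        rho = np.abs((cur - prev) / cur)
    # NaN rates (a missing neighbour) are not flagged here; p_d == 0 gives inf
    abnormal[1:] |= np.nan_to_num(rho, nan=0.0, posinf=np.inf) > threshold
    return abnormal

def smooth_fill(curve, abnormal, a=4, b=4):
    """Assumed smoothing: replace each flagged point by the mean of up to
    a valid points before it and b valid points after it."""
    p = np.asarray(curve, dtype=float).copy()
    valid = ~abnormal
    for d in np.flatnonzero(abnormal):
        before = [p[k] for k in range(d - 1, -1, -1) if valid[k]][:a]
        after = [p[k] for k in range(d + 1, len(p)) if valid[k]][:b]
        neighbours = before + after
        if neighbours:
            p[d] = np.mean(neighbours)
    return p

def max_normalize(curve):
    """Maximum value normalization: x = p / max(p)."""
    p = np.asarray(curve, dtype=float)
    return p / p.max()

raw = np.array([1.0, 1.1, 1.0, 5.0, 1.0, 0.9, np.nan, 1.0])
mask = flag_abnormal(raw)
clean = smooth_fill(raw, mask)
normalized = max_normalize(clean)
```

Note that an isolated spike is flagged twice by the change-rate test, once on the jump up and once on the jump back down, so both points are refilled from their unflagged neighbours.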

Further, performing data dimension reduction on the subordinate user load data and the substation load data in step 1 includes:

step 13: performing dimensionality reduction on the data by principal component analysis (PCA) to obtain dimension-reduced subordinate user load data and dimension-reduced substation load data, which specifically comprises the following steps:

step 131: standardizing the sample data;

step 132: calculating the correlation coefficient matrix of the original data;

step 133: solving for the eigenvalues and eigenvectors of the correlation coefficient matrix;

step 134: selecting the corresponding eigenvectors in descending order of eigenvalue, each eigenvector giving the coefficients of one principal component;

step 135: calculating the cumulative variance contribution rate and determining the number of principal components;

step 136: writing out the expressions of the principal components.

Further, selecting a clustering effectiveness evaluation index in step 2 includes:

taking the Davies-Bouldin index (DBI) as the evaluation index of the power load clustering:

in the formula, $d_k$ and $d_j$ respectively represent the average distance from the data objects in the k-th class and the j-th class to the center of the corresponding class; $d_{k,h}$ represents the Euclidean distance between the class centers of the k-th class and the h-th class; K is the number of clusters; $I_{DBI}$ is the Davies-Bouldin index.

Further, the clustering of the subordinate user load data in step 3 includes the following steps:

step 31: initialization, namely selecting K samples from the dimension-reduced subordinate user load data obtained in step 13 as the initial class centers according to the following formula:

classifying the samples, namely assigning every sample to its nearest class center, the samples assigned to the same class center forming one class:

step 32: class updating, namely updating the class centers according to the division result of step 31:

step 33: judging whether the convergence condition is met, and if not, continuing to perform steps 31 to 32; the clustering effectiveness evaluation index obtained in step 2 is used as the basis for selecting the optimal number of clusters.

Further, the clustering of the load data of the upper-level substation in step 4 includes the following steps:

step 41: initializing;

step 42: classifying the objects, namely assigning each data object to a class center;

step 43: updating the class center, namely updating the class center according to the division result;

step 44: updating the weight;

step 45: judging whether the convergence condition is met; if so, ending, otherwise repeating the above steps.

In a second aspect, a load clustering device based on substation user configuration includes:

the load data selection and processing module is used for acquiring the load data of the users subordinate to a substation and the load data of the substation itself as load data, and performing data preprocessing and data dimension reduction on both to obtain preprocessed user load data and preprocessed substation load data;

the evaluation index selection module is used for selecting the clustering effectiveness evaluation index;

the user load data clustering module is used for clustering the preprocessed user load data obtained in the step 1 by utilizing a K-means algorithm improved based on a MaxMin principle; determining the category number, and obtaining a clustering result of a subordinate user of the transformer substation; analyzing the clustering result of the subordinate users of the transformer substation according to the clustering effectiveness evaluation index to obtain the constituent components of the transformer substation;

and the transformer substation load clustering module is used for clustering the upper-layer transformer substation load data by using a weight automatically-updated K-means algorithm according to the transformer substation composition and the preprocessed transformer substation load data, determining the category number and obtaining a final clustering result.

In a third aspect, a computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements a load clustering method based on substation user configuration when executing the computer program.

In a fourth aspect, a computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, implements the load clustering method based on substation user configuration.

The invention has at least the following beneficial effects:

1. The invention proposes a two-layer substation-user structure, establishes a clustering model of the substation load, and solves the model with a modified K-means clustering algorithm to achieve accurate clustering. Compared with the traditional clustering algorithm, the modified algorithm has clear advantages in convergence speed and calculation accuracy; it also takes multiple kinds of data objects into account, which improves the reliability of the clustering.

2. The lower-layer users are clustered with the K-means algorithm improved based on the MaxMin principle, which allows the initial class centers to be selected accurately; the upper-layer substations are clustered with the K-means algorithm with automatically updated weights, which weighs the influence of the different kinds of data objects on the clustering result.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention; they illustrate embodiments of the invention and, together with the description, serve to explain the invention and not to limit it. In the drawings:

Fig. 1 is a flowchart of a load clustering method based on substation user configuration provided by the present invention.

Fig. 2 is a schematic structural diagram of a load clustering device based on substation user configuration according to the present invention.

Detailed Description

The present invention will be described in detail below with reference to the embodiments with reference to the attached drawings. It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.

The following detailed description is exemplary in nature and is intended to provide further details of the invention. Unless otherwise defined, all technical terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the invention.

Example 1

As shown in fig. 1, the present invention provides a load clustering method based on substation user composition, which specifically includes the following steps:

When the loads of the users subordinate to the substations are subjected to cluster analysis, the selected data is the daily load characteristic curve of the low-voltage terminal users subordinate to each substation. When the substation loads are subjected to cluster analysis, the selected data can be either the daily load curve of each substation load or the industry composition proportion of each substation.

In the process of data acquisition, when adverse conditions such as abnormal acquisition terminals of a power system and blocked data transmission of a communication system are met, deviation between acquired load data and original data can be caused, and the acquired data under the phenomenon is called abnormal data. Compared with the original data, the abnormal data cannot accurately show the overall rule of load change, and sometimes the clustering result is possibly influenced. After the abnormal data of the data set is identified and corrected, the load sequence needs to be normalized, and the normalization operation is to ignore the influence of the magnitude of the load data on the overall change pattern, so as to smoothly extract the complete form of the load.

The characteristic of high dimensionality of the load data causes the obvious problems of extremely long operation time and extremely low operation efficiency when the original load data is directly used for cluster analysis, and in order to improve the overall performance of the cluster analysis, after abnormal data identification correction and load data normalization, appropriate feature selection or feature construction and the like need to be carried out on the high-dimensionality load data.

The step 1 specifically comprises the following steps:

step 11: when the user loads subordinate to the transformer substation are subjected to cluster analysis, the selected data is the daily load characteristic curve of the low-voltage terminal user subordinate to each transformer substation. The curve is composed of real-time load data acquired at intervals of 15min, and the daily load characteristic curve has 96 points.

When the transformer substation loads are subjected to cluster analysis, on one hand, a daily load curve of each transformer substation load can be selected from the selected data, the curve is formed by real-time load data acquired at intervals of 15min, and 96 points are arranged on the daily load characteristic curve; on the other hand, the industry composition proportion of each transformer substation can be selected.

Step 12: combining a curve replacement method with a threshold discrimination method to identify and correct abnormal data;

step 121: removing users with particularly large data loss and removing power generation users; a user whose data loss exceeds thirty percent is regarded as a user with particularly large loss, and a user whose recorded active power is negative is regarded as a power generation user. Then, load data that increase or decrease sharply in the daily load curves of the low-voltage terminal users under the substation and in the daily load curve of the substation load are identified according to the following formula:

$$\rho = \left| \frac{p_d - p_{d-1}}{p_d} \right|$$

in the formula: $p_d$ is the data at a point in the load curve, $p_{d-1}$ is the data at the previous point in the load curve, and $\rho$ is the change rate of the load;

when the load change rate is larger than a preset threshold value, the data is considered to be abnormal data. To select the threshold, the change patterns of loads with different industry attributes are first obtained from historical data; a fluctuation range of the load is then set for each time period, and this range serves as the concrete basis for determining the threshold range of the load data. The preset threshold value chosen here is 30%. Abnormal data whose load change rate does not meet the requirement, as well as missing load data, are filled by using a smoothing formula,

wherein a represents the number of reference points taken forward, b represents the number of reference points taken backward, and the values of a and b are between 4 and 6.
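One plausible form of the smoothing fill, consistent with the roles of a and b described here, is the plain average of the a points before and the b points after the point being repaired. This is an assumed reconstruction for illustration, not the expression stated by the patent:

```latex
p_d = \frac{1}{a+b}\left(\sum_{k=1}^{a} p_{d-k} + \sum_{k=1}^{b} p_{d+k}\right),
\qquad 4 \le a, b \le 6
```

With a = b = 4 and 15-minute sampling, for example, each repaired point would be averaged over the two hours of data surrounding it.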

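Step 122: performing data normalization on the subordinate user load data and the substation load data by the maximum value normalization method.

For the j-th user under the i-th substation, the maximum value normalization of the user load data is expressed as:

$$x_{i,j} = \frac{p_{i,j}}{\max(p_{i,j})}$$

where $x_{i,j}$ is the normalized load curve vector of the j-th user under the i-th substation; $p_{i,j}$ is the data to be normalized in the daily load curve of the j-th user under the i-th substation; and $\max(p_{i,j})$ is the maximum value in the daily load curve of that low-voltage terminal user.

For the i-th substation, the maximum value normalization of the substation load data is expressed as:

$$X_i = \frac{P_i}{\max(P_i)}$$

where $X_i$ is the normalized value for the i-th substation; $P_i$ is the data to be normalized in the daily load curve of the i-th substation; and $\max(P_i)$ is the maximum value in the daily load curve of the substation load;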

the data preprocessing is completed through steps 121 to 122.

Step 13: performing dimensionality reduction on the data by principal component analysis (PCA) to obtain dimension-reduced subordinate user load data and dimension-reduced substation load data;

Principal component analysis (PCA) aims to simplify a data set while losing as little of its information as possible. Specifically, a few variables that can represent the attributes or characteristics of the data set are selected from the original data, the relationships among the variables are found, and linear combinations of the variables are formed, so that the original data set is transformed into a few comprehensive variables; these comprehensive variables are called principal components.

The method comprises the following basic steps:

step 131: standardizing the sample data;

step 132: calculating the correlation coefficient matrix of the original data;

step 133: solving for the eigenvalues and eigenvectors of the correlation coefficient matrix;

step 134: selecting the corresponding eigenvectors in descending order of eigenvalue, each eigenvector giving the coefficients of one principal component;

step 135: calculating the cumulative variance contribution rate and determining the number of principal components;

step 136: writing out the expressions of the principal components.

The relationship between the principal components and the original variables can be summarized as follows: the principal components retain the information of the original variables to the greatest extent; the number of principal components is far smaller than the dimension of the original data; the principal components are mutually uncorrelated; and each selected principal component is a linear combination of the original variables.

Let $x_1, x_2, \ldots, x_H$ be the H-dimensional daily load data and $v_{11}, v_{12}, v_{13}, \ldots, v_{HH}$ be the principal component coefficients obtained in step 134. Principal component analysis converts the H observed quantities to be reduced into H comprehensive indexes by linear combination:

$$\begin{cases} I_1 = v_{11}x_1 + v_{12}x_2 + \cdots + v_{1H}x_H \\ I_2 = v_{21}x_1 + v_{22}x_2 + \cdots + v_{2H}x_H \\ \quad\vdots \\ I_H = v_{H1}x_1 + v_{H2}x_2 + \cdots + v_{HH}x_H \end{cases}$$

The above formula must satisfy the following conditions:

the sum of the squares of the coefficients of each principal component is equal to 1, i.e. $v_{i1}^2 + v_{i2}^2 + \cdots + v_{iH}^2 = 1$;

the principal components are independent of each other, i.e. $\mathrm{Cov}(I_i, I_j) = 0,\ i \neq j,\ i, j = 1, 2, \ldots, H$;

the importance of the principal components decreases in order, i.e. their variances decrease in order.
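Steps 131 to 136 and the three conditions above can be checked numerically with a short NumPy sketch. The function name `pca_correlation` and the 90% cumulative contribution target are illustrative assumptions; the text does not state which contribution rate is used to fix the number of principal components.

```python
import numpy as np

def pca_correlation(data, var_target=0.90):
    """PCA on the correlation matrix; rows are samples, columns are the
    H load points. Assumes no column is constant (non-zero std)."""
    x = np.asarray(data, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)   # step 131: standardize
    r = np.corrcoef(z, rowvar=False)                    # step 132: correlation matrix
    eigval, eigvec = np.linalg.eigh(r)                  # step 133: eigen-decomposition
    order = np.argsort(eigval)[::-1]                    # step 134: sort descending
    eigval, eigvec = eigval[order], eigvec[:, order]
    cum = np.cumsum(eigval) / eigval.sum()              # step 135: cumulative contribution
    n_comp = int(np.searchsorted(cum, var_target) + 1)  # assumed 90% target
    coeffs = eigvec[:, :n_comp]                         # columns are coefficient vectors
    scores = z @ coeffs                                 # step 136: principal components
    return scores, coeffs, cum

rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
mix = rng.normal(size=(2, 4))
data = np.hstack([base, base @ mix + 0.05 * rng.normal(size=(200, 4))])
scores, coeffs, cum = pca_correlation(data)
cov = np.atleast_2d(np.cov(scores, rowvar=False))
```

Each column of `coeffs` has unit length, the off-diagonal entries of `cov` vanish up to rounding, and its diagonal is non-increasing, which corresponds to the three conditions listed above.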

Step 2: selecting a clustering effectiveness evaluation index;

the DBI index is used as an evaluation index of power load clustering;

The Davies-Bouldin index (DBI) estimates the overall within-class compactness from the distances of the sample points in each class to the center of that class, and uses the distances between class centers to represent the dispersion between classes. It is defined as:

in the formula, $d_k$ and $d_j$ respectively represent the average distance from the data objects in the k-th class and the j-th class to the center of the corresponding class; $d_{k,h}$ represents the Euclidean distance between the class centers of the k-th class and the h-th class; K is the number of clusters; $I_{DBI}$ is the Davies-Bouldin index.

A smaller $I_{DBI}$ means a better clustering result.
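For reference, the Davies-Bouldin index in its standard textbook form averages, over all classes, the worst-case ratio of summed within-class scatter to between-center distance. The sketch below implements that standard definition; it is assumed, not confirmed, that the patent's formula matches it exactly, and the function name is illustrative.

```python
import numpy as np

def davies_bouldin(samples, labels):
    """Standard DBI: mean over classes k of max_{j != k} (d_k + d_j) / d_{k,j}.
    Requires at least two non-empty classes."""
    x = np.asarray(samples, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    centers = np.array([x[labels == k].mean(axis=0) for k in classes])
    # d_k: average distance from the members of class k to its center
    scatter = np.array([
        np.linalg.norm(x[labels == k] - c, axis=1).mean()
        for k, c in zip(classes, centers)
    ])
    total = 0.0
    for k in range(len(classes)):
        ratios = [
            (scatter[k] + scatter[j]) / np.linalg.norm(centers[k] - centers[j])
            for j in range(len(classes)) if j != k
        ]
        total += max(ratios)
    return total / len(classes)

points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
good = davies_bouldin(points, [0, 0, 1, 1])   # compact, well separated classes
bad = davies_bouldin(points, [0, 1, 0, 1])    # classes that straddle both groups
```

The compact labelling scores far lower than the straddling one, in line with the rule that a smaller index indicates better clustering.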

Tests on simple data sets show that the K-means clustering algorithm is simple in procedure, converges quickly, is efficient and is easy to extend, so it is selected as the basic clustering algorithm here. However, its main disadvantage is that once the initial class centers are selected improperly, the overall clustering falls into a local optimum.

To address this problem, when clustering the user load data under the substations, where the data set is large, a K-means algorithm modified based on the MaxMin principle is proposed to select the initial class centers accurately. The core idea of the MaxMin principle is: if the data set contains compact classes, the Euclidean distance between objects within each class must be very small, while objects in different classes lie far apart. The specific procedure is as follows: calculate the Euclidean distance between every pair of objects, and select the two objects $x_r$ and $x_s$ with the maximum Euclidean distance as the initial class centers $c_1$ and $c_2$; if the number of selected centers k is less than the predetermined number of clusters K, calculate, for every object not yet selected as an initial class center, its distance to the set c of already determined class centers, defined as

$$d_c(x) = \min_{c_i \in c} d(x, c_i)$$

The object with the largest distance $d_c(x)$ is selected as the new initial class center $c_{t+1}$, and this is repeated until the required number of class centers has been selected.
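The MaxMin selection of initial class centers can be sketched as below. The sketch takes the distance from an object to the center set to be the distance to its nearest already chosen center, which is the usual MaxMin (farthest-first) reading and is an assumption here; `maxmin_init` is a hypothetical name. It stores the full pairwise distance matrix, so it is only suitable for moderately sized data sets.

```python
import numpy as np

def maxmin_init(samples, n_clusters):
    """Pick n_clusters (>= 2) initial centers by the MaxMin principle."""
    x = np.asarray(samples, dtype=float)
    # Euclidean distance between every pair of objects
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    # the two objects that are farthest apart become c_1 and c_2
    r, s = np.unravel_index(np.argmax(dist), dist.shape)
    chosen = [int(r), int(s)]
    while len(chosen) < n_clusters:
        d_c = dist[:, chosen].min(axis=1)   # distance to nearest chosen center
        d_c[chosen] = -np.inf               # never re-select a chosen object
        chosen.append(int(np.argmax(d_c)))  # farthest object becomes a center
    return x[chosen[:n_clusters]]

points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0], [4.0, 20.0]])
centers = maxmin_init(points, 3)
```

Because every new center is the object farthest from all centers chosen so far, the initial centers tend to land in different compact classes, which is what makes the subsequent K-means iterations less likely to stall in a poor local optimum.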

The clustering algorithm for the subordinate user load data takes the mean square error as the objective function, expressed as follows:

where K represents the number of clusters, M represents the total number of samples, and $d(c_m, x_n)$ denotes the Euclidean distance from the center $c_m$ of the m-th class to the n-th sample; $u_{mn}$ is a binary decision variable: when $u_{mn}$ is 1, the n-th sample is considered to belong to the m-th class, and when $u_{mn}$ is 0, the n-th sample is considered not to belong to this class. To ensure that each sample is assigned to exactly one class, $u_{mn}$ must satisfy the following constraint:
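Under the symbol definitions just given, the objective and its constraint would take the standard K-means form shown below. This is an assumed reconstruction from those definitions; the original expression may differ, for example by a normalization factor such as 1/M.

```latex
\min_{u,\,c}\; J = \sum_{m=1}^{K} \sum_{n=1}^{M} u_{mn}\, d^{2}(c_m, x_n)
\qquad \text{s.t.}\quad \sum_{m=1}^{K} u_{mn} = 1,\quad
u_{mn} \in \{0, 1\},\quad n = 1, \ldots, M
```

The constraint states that the indicator values of each sample sum to one over all classes, so every sample is counted in exactly one class.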

the step 3 specifically comprises the following steps:

classifying the samples, namely assigning every sample to its nearest class center, the samples assigned to the same class center forming one class:

step 33: judging whether the convergence condition is met, and if not, continuing to perform steps 31 to 32. The optimal number of clusters is selected according to the clustering effectiveness evaluation index obtained in step 2.

In step 3, the clustering result of the subordinate users of the substation is analyzed to obtain the constituent components of the substation.
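Choosing the number of clusters by the evaluation index can be sketched as a loop over candidate values of K: run K-means from MaxMin-style initial centers, score each result with the Davies-Bouldin index, and keep the K with the smallest score. All function names, the candidate range, the convergence tolerance and the synthetic data are illustrative assumptions.

```python
import numpy as np

def farthest_first(x, k):
    """MaxMin-style initial centers (assumed nearest-center distance)."""
    dist = np.linalg.norm(x[:, None] - x[None, :], axis=2)
    chosen = [int(i) for i in np.unravel_index(np.argmax(dist), dist.shape)]
    while len(chosen) < k:
        chosen.append(int(np.argmax(dist[:, chosen].min(axis=1))))
    return x[chosen]

def kmeans(x, centers, max_iter=100, tol=1e-8):
    """Classify, update, and test convergence (steps 31 to 33)."""
    centers = centers.copy()
    for _ in range(max_iter):
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)                  # assign to nearest center
        new_centers = np.array([
            x[labels == m].mean(axis=0) if np.any(labels == m) else centers[m]
            for m in range(len(centers))
        ])                                            # update class centers
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:                               # centers stopped moving
            break
    return labels, centers

def dbi(x, labels, centers):
    """Davies-Bouldin index in its standard form."""
    k = len(centers)
    scatter = np.array([
        np.linalg.norm(x[labels == m] - centers[m], axis=1).mean() for m in range(k)
    ])
    sep = np.linalg.norm(centers[:, None] - centers[None, :], axis=2)
    np.fill_diagonal(sep, np.inf)                     # exclude j == k from the max
    return float(np.mean(np.max((scatter[:, None] + scatter[None, :]) / sep, axis=1)))

def select_k(x, k_range):
    """Return the K with the smallest DBI and all scores."""
    scores = {}
    for k in k_range:
        labels, centers = kmeans(x, farthest_first(x, k))
        scores[k] = dbi(x, labels, centers)
    return min(scores, key=scores.get), scores

rng = np.random.default_rng(1)
blobs = np.vstack([
    rng.normal(loc=c, scale=0.5, size=(30, 2)) for c in ([0, 0], [10, 0], [0, 10])
])
best_k, scores = select_k(blobs, range(2, 6))
```

On three well separated synthetic groups the smallest index is obtained at K = 3; on real load curves the minimum is usually less pronounced, so it is worth inspecting the whole `scores` dictionary.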

When the upper-layer substations are clustered, the upper-layer clustering model uses the substation composition obtained by analyzing the clustering result of the subordinate users, and jointly considers two factors: the load characteristics of the substation and the composition derived from the load clustering of its subordinate users. Here, the data to be clustered for the i-th substation is denoted $O_i = (X_i, R_i)$, where $X_i$ is the normalized vector of the i-th substation's load curve and $R_i$ is the composition vector of the substation load. $X_i$ and $R_i$ lie in feature spaces of different dimensions and different topologies. To characterize this difference in feature space, the upper-level substation clustering model uses a Euclidean distance with assigned weights $w = [w_1, w_2]$:

In the formula, $w_1$ is the weight of the load characteristic curve of the substation and $w_2$ is the weight of the constituent components of the substation; the two weights must satisfy the condition $w_1 + w_2 = 1$. The weight control parameter in the formula is specified by the user, and the reason for introducing it is as follows. Because the influences of the daily load curve vector and of the composition proportion vector on the substation load clustering are considered together, the values of $w_1$ and $w_2$ must be determined in order to distinguish how strongly the two different spaces, the total load characteristic and the composition of the substation, affect the final clustering result and to balance their influence on the substation clustering; the weight control parameter serves as the basis for choosing the values of $w_1$ and $w_2$.

The optimization objective of the upper-level substation clustering model is constructed from the perspective of density, specifically as follows:

where $c_{up}$ represents the number of clusters of the upper-level substations and $C_m$ represents the class center of the m-th class of substations. Expanding the above expression gives:

To solve the optimization objective of the upper-layer substations, an alternating optimization strategy is adopted. Let U be the classification matrix of size $S \times c_2$ whose element in row n and column m is $u_{m,n}$, and let $T^X$ and $T^R$ respectively denote the sets of class center vectors of the substation load data and of the composition data.

Denoting the objective as $F(w, U, T^X, T^R)$, the optimization model can be solved by cyclically solving the following three optimization problems:

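(1) fixing the weights and the classification matrix, and solving for the class centers;

(2) fixing the weights and the class centers, and solving for the classification matrix;

(3) fixing the classification matrix and the class centers, and solving for the weights.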

The solution to the problem (1) is:

the solution to problem (2) is:

problem (3) is to determine $w_1$ and $w_2$, which balance the influence of the load curve and of the composition on the substation clustering. The Lagrange multiplier method finds the extrema of a multivariate function whose variables are subject to one or more constraints. Here, problem (3) of solving for the weight w is solved by the Lagrange multiplier method, which yields the relationship between the weight w and the weight control parameter:

when the weight control parameter is not equal to 1:

where t = 1, 2, and:

when the weight control parameter is equal to 1:

It can be seen that $w_1$ and $w_2$ depend on the weight control parameter and are updated according to the above three formulas. The weight control parameter should take an integer value not equal to 0, and its final value can be determined through multiple experiments.
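The structure described here (one case for a control parameter different from 1, an auxiliary quantity indexed by t = 1, 2, and a separate case for a control parameter equal to 1) matches the closed-form weight update known from weighted K-means variants. A sketch of that form follows, writing the weight control parameter as $\beta$ and the per-space dispersion as $D_t$; both symbols and the formulas themselves are assumptions for illustration, not the patent's stated expressions.

```latex
% beta != 1: weights from the Lagrange multiplier method
w_t = \frac{1}{\sum_{s=1}^{2} \left( D_t / D_s \right)^{1/(\beta - 1)}},
\qquad t = 1, 2

% dispersion of feature space t (t = 1: load curve, t = 2: composition)
D_t = \sum_{m=1}^{c_{up}} \sum_{n=1}^{S} u_{m,n}\, d_t^{2}(O_n, C_m)

% beta = 1: all weight goes to the space with the smallest dispersion
w_t = \begin{cases} 1, & D_t \le D_s \ \text{for all } s \\ 0, & \text{otherwise} \end{cases}
```

In this form the weights sum to 1 by construction, and for $\beta > 1$ the feature space in which the classes are more compact receives the larger weight.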

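The step 4 specifically comprises the following steps:

step 41: initializing;

step 42: classifying the objects, namely assigning each data object to a class center;

step 43: updating the class centers according to the division result;

step 44: updating the weights;

step 45: judging whether the convergence condition is met; if so, ending, otherwise repeating the above steps.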
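The loop of step 4 can be sketched as a two-space K-means with automatically updated weights. Since the text does not give the distance and weight formulas, the sketch assumes a weighted squared Euclidean distance $w_1^{\beta} d_X^2 + w_2^{\beta} d_R^2$ and the closed-form weight update used by weighted K-means variants; the function name, the random initialization, and the symbol `beta` for the weight control parameter are likewise assumptions.

```python
import numpy as np

def weighted_kmeans(X, R, n_clusters, beta=2.0, max_iter=100, tol=1e-8, seed=0):
    """Cluster substations described by load curves X and composition vectors R."""
    X, R = np.asarray(X, dtype=float), np.asarray(R, dtype=float)
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=n_clusters, replace=False)   # step 41: initialize
    cx, cr = X[idx].copy(), R[idx].copy()
    w = np.array([0.5, 0.5])
    prev = np.inf
    for _ in range(max_iter):
        dx = ((X[:, None, :] - cx[None, :, :]) ** 2).sum(axis=2)
        dr = ((R[:, None, :] - cr[None, :, :]) ** 2).sum(axis=2)
        # step 42: assign each object to the nearest center (weighted distance)
        labels = (w[0] ** beta * dx + w[1] ** beta * dr).argmin(axis=1)
        for m in range(n_clusters):                            # step 43: update centers
            members = labels == m
            if members.any():
                cx[m], cr[m] = X[members].mean(axis=0), R[members].mean(axis=0)
        # step 44: dispersion of each feature space, then closed-form weights
        D = np.array([((X - cx[labels]) ** 2).sum(), ((R - cr[labels]) ** 2).sum()])
        D = np.maximum(D, 1e-12)                               # guard against division by 0
        if beta == 1:
            w = np.zeros(2)
            w[np.argmin(D)] = 1.0
        else:
            w = 1.0 / np.array([((D[t] / D) ** (1.0 / (beta - 1))).sum() for t in range(2)])
        objective = float((w ** beta * D).sum())
        if abs(prev - objective) < tol:                        # step 45: convergence
            break
        prev = objective
    return labels, w, (cx, cr)

X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
R = np.array([[0.9, 0.1], [0.8, 0.2], [0.85, 0.15],
              [0.1, 0.9], [0.2, 0.8], [0.15, 0.85]])
labels, w, centers = weighted_kmeans(X, R, n_clusters=2)
```

In this toy example the composition vectors are much more compact within each class than the load vectors, so the composition space ends up with the larger weight. Because the dispersions depend on the scale of each feature space, normalizing both spaces beforehand, as in step 122, matters for the resulting weights.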

Example 2

As shown in fig. 2, the present invention further provides a load clustering device based on substation users, including:

Example 3

The invention also provides computer equipment which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the load clustering method based on the substation user constitution when executing the computer program.

Example 4

The invention also provides a computer readable storage medium, which stores a computer program, and the computer program realizes the load clustering method based on substation user composition when being executed by a processor.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and although the present invention is described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.

Claims

1. A load clustering method based on substation user composition, characterized by comprising the following steps:

step 1: acquiring subordinate user load data and substation load data of a substation as load data, and performing data preprocessing and data dimension reduction on the subordinate user load data and the substation load data of the substation to obtain preprocessed user load data and substation load data;

step 2: selecting a clustering effectiveness evaluation index;

2. The load clustering method based on substation user composition according to claim 1, wherein the substation subordinate user load data in step 1 is a daily load characteristic curve of each substation subordinate low-voltage terminal user, the curve is composed of real-time load data acquired at intervals of 15min, and there are 96 points on the daily load characteristic curve;

3. The load clustering method based on substation user composition according to claim 2, wherein the step 1 of performing data preprocessing on the subordinate user load data and substation load data of the substation comprises:

step 121: removing records whose missing data exceeds thirty percent and removing power-generation users, a user whose recorded active power is negative being regarded as a power-generation user; then identifying load data with abrupt increases or decreases in the daily load curves of the low-voltage end users under the substation and in the daily load curve of the substation load according to the following formula:

$\rho = \left| (p_d - p_{d-1}) / p_d \right|$

$x_{i,j} = p_{i,j} / \max(p_{i,j})$

where $x_{i,j}$ is the normalized load-curve vector of the jth user under the ith substation; $p_{i,j}$ is the data to be normalized in the daily load curve of the jth user under the ith substation; $\max(p_{i,j})$ is the maximum value in the daily load curve of that low-voltage end user under the substation;

$X_i = P_i / \max(P_i)$

where $X_i$ is the normalized value for the ith substation; $P_i$ is the data to be normalized in the daily load curve of the ith substation; $\max(P_i)$ is the maximum value in the daily load curve of the substation load;

the data preprocessing is completed through steps 121 to 122.
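The preprocessing described above can be sketched as follows. Several details are assumptions made for illustration only: missing readings are marked with NaN, the index d in ρ is taken to run over adjacent sampling points, and the cut-off `jump_threshold` for ρ is hypothetical because no value is stated here. The function name `preprocess_curves` is likewise hypothetical.

```python
import numpy as np

def preprocess_curves(P, max_missing=0.30, jump_threshold=None):
    """Filter and normalize daily load curves.

    P: array of shape (n_users, n_points); NaN marks a missing reading.
    jump_threshold: hypothetical cut-off for rho (no value is given in the text).
    """
    P = np.asarray(P, dtype=float)
    missing_ratio = np.isnan(P).mean(axis=1)
    is_generator = np.nanmin(P, axis=1) < 0           # negative active power
    keep = (missing_ratio <= max_missing) & ~is_generator
    P = P[keep]
    # rho = |(p_d - p_{d-1}) / p_d| for each pair of adjacent points.
    with np.errstate(divide="ignore", invalid="ignore"):
        rho = np.abs(np.diff(P, axis=1) / P[:, 1:])
    if jump_threshold is None:
        abnormal = np.zeros_like(rho, dtype=bool)
    else:
        abnormal = rho > jump_threshold
    # Max normalization: x = p / max(p), computed per curve.
    X = P / np.nanmax(P, axis=1, keepdims=True)
    return X, keep, abnormal
```

The sketch only flags abrupt changes in the returned mask; how flagged points are corrected is left open.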

4. The load clustering method based on substation user composition according to claim 3, wherein the step 1 of performing data dimension reduction on the subordinate user load data and substation load data of the substation comprises:

step 131: sample data is standardized;

step 132: calculating a correlation coefficient matrix of the original data;

step 136: writing out the expressions of the principal components.
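The dimension-reduction steps can be illustrated with a principal component analysis built on the correlation matrix. The intermediate steps are not listed here, so the sketch assumes the usual sequence (eigen-decomposition, cumulative contribution rate, selection of the leading components); the 0.85 cumulative-contribution threshold and the name `pca_reduce` are hypothetical.

```python
import numpy as np

def pca_reduce(data, cum_threshold=0.85):
    """Project samples onto the leading principal components.

    data: array of shape (n_samples, n_features), no constant columns.
    cum_threshold: hypothetical cumulative contribution rate to retain.
    """
    data = np.asarray(data, dtype=float)
    # Standardize the sample data.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # Correlation coefficient matrix.
    r = np.corrcoef(z, rowvar=False)
    # Assumed intermediate steps: eigenvalues sorted in descending order.
    eigvals, eigvecs = np.linalg.eigh(r)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    contrib = np.cumsum(eigvals) / eigvals.sum()
    m = int(np.argmax(contrib >= cum_threshold)) + 1
    # Principal component expressions: linear combinations of standardized features.
    scores = z @ eigvecs[:, :m]
    return scores, eigvecs[:, :m], contrib[:m]
```

Each column of the returned loading matrix gives the coefficients of one principal component expression in terms of the standardized variables.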

5. The load clustering method based on substation user composition according to claim 4, wherein the step 2 comprises:

where $d_k$ and $d_j$ respectively denote the average distance from the data objects in the kth class and in the jth class to the corresponding class center; $d_{k,h}$ denotes the Euclidean distance between the class centers of the kth class and the hth class; K is the number of clusters; and $I_{DBI}$ is the Davies-Bouldin index (DBI).
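As an illustration of this evaluation index, the sketch below computes the Davies-Bouldin index in its standard form, which is assumed to match the definitions above: for each class, take the worst ratio of summed within-class scatter to between-center distance, then average over classes. A smaller value indicates a better clustering. The function name `davies_bouldin` is hypothetical.

```python
import numpy as np

def davies_bouldin(X, labels):
    """Standard Davies-Bouldin index (assumed form); lower is better."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    centers = np.array([X[labels == k].mean(axis=0) for k in classes])
    # d_k: average distance from the members of class k to its center.
    scatter = np.array([
        np.linalg.norm(X[labels == k] - c, axis=1).mean()
        for k, c in zip(classes, centers)
    ])
    # d_{k,h}: Euclidean distance between class centers.
    sep = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    np.fill_diagonal(sep, np.inf)                     # exclude k == h
    ratio = (scatter[:, None] + scatter[None, :]) / sep
    return ratio.max(axis=1).mean()
```

To choose the number of clusters, the index can be evaluated for each candidate K and the K with the smallest value retained.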

6. The load clustering method based on substation user composition according to claim 5, wherein the clustering of the subordinate user load data of the substation in step 3 comprises:

step 31: initialization: randomly selecting K samples from the dimension-reduced substation subordinate user load data obtained in step 13, and selecting the initial class centers according to the following formula:

step 33: judging whether the convergence condition is met, and if not, repeating steps 31 to 32; the clustering effectiveness evaluation index obtained in step 2 is used as the basis for selecting the optimal number of clusters.
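The selection formula for the initial class centers is not reproduced here, so the following sketch assumes the common max-min distance rule: after a first center is drawn at random, each further center is the sample whose minimum distance to the centers already chosen is largest. The names `maxmin_init` and `kmeans` are hypothetical, and the loop is a plain K-means used only to show where the initialization fits.

```python
import numpy as np

def maxmin_init(X, k, seed=0):
    """Pick k initial centers by the max-min distance rule (assumed form)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        # Distance of each sample to its nearest already-chosen center.
        dist = np.linalg.norm(X[:, None, :] - np.array(centers)[None, :, :], axis=2)
        centers.append(X[np.argmax(dist.min(axis=1))])
    return np.array(centers)

def kmeans(X, k, n_iter=100, seed=0):
    """Plain K-means started from max-min centers."""
    X = np.asarray(X, dtype=float)
    centers = maxmin_init(X, k, seed)
    for _ in range(n_iter):
        # Assign each sample to its nearest center.
        labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)
        # Recompute centers; keep the old center if a class is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):         # convergence condition
            break
        centers = new_centers
    return labels, centers
```

Spreading the initial centers apart in this way tends to reduce the number of iterations compared with purely random starts, at the cost of some sensitivity to outliers.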

7. The load clustering method based on substation user composition according to claim 6, wherein the clustering of the upper-level substation load data in step 4 comprises the following steps:

step 41: initializing;

step 44: updating the weight;

step 45: judging whether the convergence condition is met; if so, ending; otherwise, repeating the above steps.

8. A load clustering device based on substation user composition, characterized by comprising:

the load data selection and processing module is used for acquiring subordinate user load data of the transformer substation and transformer substation load data as load data, and performing data preprocessing and data dimensionality reduction on the subordinate user load data of the transformer substation and the transformer substation load data to obtain preprocessed user load data and transformer substation load data;

the user load data clustering module is used for clustering the preprocessed user load data obtained in the step (1) by utilizing a K-means algorithm improved based on a MaxMin principle; determining the category number, and obtaining a clustering result of a subordinate user of the transformer substation; analyzing the clustering result of the subordinate users of the transformer substation according to the clustering effectiveness evaluation index to obtain the constituent components of the transformer substation;

and the substation load clustering module is used for clustering the upper-layer substation load data by using a K-means algorithm with automatically updated weights according to the substation composition and the preprocessed substation load data, determining the number of categories and obtaining a final clustering result.

9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the load clustering method based on substation user composition according to any one of claims 1 to 7.

10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the load clustering method based on substation user composition according to any one of claims 1 to 7.