CN114429172A - Load clustering method, device, equipment and medium based on transformer substation user constitution - Google Patents

Load clustering method, device, equipment and medium based on transformer substation user constitution

Info

Publication number
CN114429172A
Authority
CN
China
Prior art keywords
data
load
substation
clustering
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111488944.4A
Other languages
Chinese (zh)
Inventor
霍天跃
马晓东
杨威
唐萁
周选选
袁澍阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Beijing Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Beijing Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, State Grid Beijing Electric Power Co Ltd
Priority to CN202111488944.4A
Publication of CN114429172A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Educational Administration (AREA)
  • Artificial Intelligence (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The invention belongs to the technical field of load clustering of power systems, and particularly discloses a load clustering method, device, equipment and medium based on transformer substation user composition. The method comprises the following steps: acquiring the load data of the users subordinate to a substation and the load data of the substation itself, and performing data preprocessing and data dimension reduction on both to obtain preprocessed user load data and preprocessed substation load data; selecting a clustering effectiveness evaluation index; clustering the preprocessed user load data by using a K-means algorithm improved based on the MaxMin principle; analyzing the clustering result of the subordinate users of the substation to obtain the constituent components of the substation; and clustering the preprocessed substation load data by using a K-means algorithm with automatically updated weights. Compared with the traditional clustering algorithm, the modified clustering algorithm has clear advantages in convergence speed and calculation accuracy, and it improves the reliability of clustering by considering several kinds of data objects.

Description

Load clustering method, device, equipment and medium based on transformer substation user constitution
Technical Field
The invention belongs to the technical field of load clustering of power systems, and particularly relates to a load clustering method, device, equipment and medium based on transformer substation user composition.
Background
The power system is a unified whole consisting of a power plant, a power transmission network, a transformer substation and a power load. The accuracy of the mathematical model of each element in the system is directly related to the reliability of the simulation analysis of the system. The model of the power load as an important component of the system is still relatively rough, which directly hinders further improvement of the simulation accuracy of the system.
The transformer substation load is the aggregate of all user loads at the different voltage levels under the substation. A power system carries a large number of loads, and accurate clustering of substation load characteristics helps to summarize the characteristics that substations have in common and to extract their power utilization patterns, which guides the load modeling of the substations and improves the overall modeling accuracy.
However, substation load clustering at the present stage has two problems to be solved: first, the clustering result does not match the actual load characteristics; second, the traditional clustering algorithm is not tailored to the task, so the clustering effect is relatively poor.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide a load clustering method, a device, equipment and a medium based on transformer substation user composition, so as to solve the problems that the result obtained by clustering is not matched with the actual load characteristic and the clustering effect is relatively poor during transformer substation load clustering.
In order to achieve the purpose, the invention adopts the following technical scheme:
in a first aspect, the invention provides a load clustering method based on transformer substation user composition, which comprises the following steps:
step 1: acquiring subordinate user load data and substation load data of a substation as load data, and performing data preprocessing and data dimension reduction on the subordinate user load data and the substation load data to obtain preprocessed user load data and preprocessed substation load data;
step 2: selecting a clustering effectiveness evaluation index;
step 3: clustering the preprocessed user load data obtained in step 1 by using a K-means algorithm improved based on the MaxMin principle; determining the number of categories, and obtaining a clustering result of the subordinate users of the substation; analyzing the clustering result of the subordinate users of the substation according to the clustering effectiveness evaluation index to obtain the constituent components of the substation;
step 4: clustering the upper-layer substation load data by using a K-means algorithm with automatically updated weights according to the constituent components of the substations and the preprocessed substation load data, determining the number of categories, and obtaining a final clustering result.
Further, the subordinate user load data in step 1 is the daily load characteristic curve of the low-voltage terminal users subordinate to each substation; the curve is formed by real-time load data acquired at intervals of 15 min, so each daily load characteristic curve has 96 points;
the substation load data is either the daily load curve of each substation load, formed by real-time load data acquired at intervals of 15 min with 96 points per curve, or the industry composition proportion of each substation.
Further, the data preprocessing of the subordinate user load data and the substation load data comprises:
step 121: removing records in which more than thirty percent of the data is missing, and removing power generation users, a user whose recorded active power is negative being regarded as a power generation user; then identifying load data that rises or falls sharply in the daily load curves of the low-voltage terminal users subordinate to the substation and in the daily load curves of the substation loads according to the following formula:
$\rho = \left| (p_d - p_{d-1}) / p_d \right|$
in the formula: $p_d$ is the data at a point in the load curve, $p_{d-1}$ is the data at the previous point in the load curve, and $\rho$ is the load change rate;
when the load change rate is greater than a preset threshold, the data is considered to be abnormal data; the preset threshold is 30%; abnormal data whose load change rate exceeds the threshold, as well as missing load data, are filled in by using a smoothing formula:
Figure BDA0003397682130000031
wherein a represents the number of reference points taken forward, b represents the number of reference points taken backward, and the values of a and b are between 4 and 6;
step 122: normalizing the subordinate user load data and the substation load data by the maximum value normalization method;
for the j-th user under the i-th substation, the maximum value normalization of the user load data is expressed as:
$x_{i,j} = p_{i,j} / \max(p_{i,j})$
where $x_{i,j}$ is the normalized load curve vector of the j-th user under the i-th substation; $p_{i,j}$ is the data to be normalized in the daily load curve of the j-th user under the i-th substation; and $\max(p_{i,j})$ is the maximum value in that user's daily load curve;
for the i-th substation, the maximum value normalization of the substation load data is expressed as:
$X_i = P_i / \max(P_i)$
where $X_i$ is the normalized value for the i-th substation; $P_i$ is the data to be normalized in the daily load curve of the i-th substation; and $\max(P_i)$ is the maximum value in the daily load curve of that substation;
the data preprocessing is completed through steps 121 to 122.
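As an illustration of the threshold test in step 121, the following is a minimal sketch in Python with NumPy, not the patent's implementation. The function name `flag_abnormal_points` is hypothetical, and treating missing readings as NaN values that are flagged together with the abnormal points is an assumption made here for convenience.

```python
import numpy as np

def flag_abnormal_points(curve, threshold=0.30):
    """Return a boolean mask marking points whose load change rate exceeds the threshold."""
    p = np.asarray(curve, dtype=float)
    mask = np.zeros(p.shape, dtype=bool)
    # rho_d = |(p_d - p_{d-1}) / p_d| for d >= 1; the first point has no predecessor.
    with np.errstate(divide="ignore", invalid="ignore"):
        rho = np.abs((p[1:] - p[:-1]) / p[1:])
        mask[1:] = rho > threshold
    # Assumption: missing readings are stored as NaN and are flagged as well.
    mask |= np.isnan(p)
    return mask

curve = [10.0, 10.5, 11.0, 30.0, 11.5, 12.0]
print(flag_abnormal_points(curve))  # [False False False  True  True False]
```

Note that the point immediately after a spike is flagged too, because its change rate relative to the spike is also large; a practical pipeline would need to decide how to treat such pairs.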
Further, performing data dimension reduction on the subordinate user load data and the substation load data in step 1 includes:
step 13: reducing the dimensionality of the data by the PCA (principal component analysis) method to obtain dimension-reduced subordinate user load data and substation load data, which specifically comprises the following steps:
step 131: standardizing the sample data;
step 132: calculating the correlation coefficient matrix of the original data;
step 133: computing the eigenvalues and eigenvectors of the correlation coefficient matrix;
step 134: selecting the corresponding eigenvectors in order of eigenvalue from largest to smallest, each eigenvector giving the coefficients of one principal component;
step 135: calculating the cumulative variance contribution rate and determining the number of principal components;
step 136: writing out the expressions for the principal components.
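Steps 131 to 136 can be sketched as follows. This is a minimal illustration in Python with NumPy under stated assumptions: the function name `pca_reduce` is hypothetical, the 0.85 cumulative contribution target is an assumed value (the text does not give one), and every column is assumed to have non-zero variance.

```python
import numpy as np

def pca_reduce(data, contribution=0.85):
    """Project samples (rows) onto the leading principal components."""
    X = np.asarray(data, dtype=float)
    # Step 131: standardize each column (assumes no constant columns).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 132: correlation coefficient matrix of the data.
    R = np.corrcoef(Z, rowvar=False)
    # Step 133: eigenvalues and eigenvectors of the symmetric matrix R.
    eigvals, eigvecs = np.linalg.eigh(R)
    # Step 134: order the eigenvectors by eigenvalue, largest first.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 135: cumulative variance contribution rate fixes the component count.
    cum = np.cumsum(eigvals) / eigvals.sum()
    n = min(int(np.searchsorted(cum, contribution)) + 1, len(eigvals))
    # Step 136: each principal component is a linear combination of the columns.
    return Z @ eigvecs[:, :n], cum[:n]
```

For 96-point daily load curves, `data` would be a matrix with one row per curve and 96 columns, and the returned score matrix would replace it as the clustering input.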
Further, selecting the clustering effectiveness evaluation index in step 2 includes
taking the Davies-Bouldin index as the evaluation index of the power load clustering:
Figure BDA0003397682130000041
in the formula, $d_k$ and $d_h$ respectively represent the average distance from the data objects in the k-th class and the h-th class to the center of the corresponding class; $d_{k,h}$ represents the Euclidean distance between the class centers of the k-th class and the h-th class; $K$ is the number of clusters; and $I_{DBI}$ is the Davies-Bouldin index.
Further, clustering the subordinate user load data in step 3 includes the following steps:
step 31: initialization, namely randomly selecting K samples from the dimension-reduced subordinate user load data obtained in step 13, and selecting the initial class centers according to the following formula:
Figure BDA0003397682130000042
classifying the samples, namely assigning every sample to the class center closest to it, the samples that share a class center forming one class:
Figure BDA0003397682130000043
step 32: class updating, namely updating a class center according to the division result in the step 31:
Figure BDA0003397682130000051
step 33: judging whether a convergence condition is met, and if not, repeating steps 31 to 32; the clustering effectiveness evaluation index selected in step 2 is used as the basis for choosing the optimal number of clusters.
Further, the clustering of the load data of the upper-level substation in step 4 includes the following steps:
step 41: initializing;
step 42: classifying the objects, namely assigning each data object to a class center;
step 43: updating the class centers according to the division result;
step 44: updating the weights;
step 45: judging whether the convergence condition is met; if so, ending; otherwise, repeating steps 42 to 44.
In a second aspect, a load clustering device based on substation user configuration includes:
the load data selection and processing module is used for acquiring subordinate user load data and substation load data of the substation as load data, and performing data preprocessing and data dimension reduction on the subordinate user load data and the substation load data to obtain preprocessed user load data and preprocessed substation load data;
the evaluation index selection module is used for selecting the clustering effectiveness evaluation index;
the user load data clustering module is used for clustering the preprocessed user load data obtained in the step 1 by utilizing a K-means algorithm improved based on a MaxMin principle; determining the category number, and obtaining a clustering result of a subordinate user of the transformer substation; analyzing the clustering result of the subordinate users of the transformer substation according to the clustering effectiveness evaluation index to obtain the constituent components of the transformer substation;
and the transformer substation load clustering module is used for clustering the upper-layer transformer substation load data by using a weight automatically-updated K-means algorithm according to the transformer substation composition and the preprocessed transformer substation load data, determining the category number and obtaining a final clustering result.
In a third aspect, a computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the load clustering method based on substation user configuration when executing the computer program.
In a fourth aspect, a computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, implements the load clustering method based on substation user configuration.
The invention has at least the following beneficial effects:
1. The invention proposes a two-layer substation-user structure and establishes a clustering model of the substation load, which is solved with a modified k-means clustering algorithm to achieve accurate clustering; compared with the traditional clustering algorithm, the modified algorithm has clear advantages in convergence speed and calculation accuracy, and it improves the clustering reliability by considering several kinds of data objects at the same time.
2. The lower-layer users are clustered with the K-means algorithm improved based on the MaxMin principle, which allows the initial class centers to be selected accurately; the upper-layer substations are clustered with the K-means algorithm with automatically updated weights, which weighs the influence of the various data objects on the clustering result.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention; they illustrate embodiments of the invention and, together with the description, serve to explain the invention and not to limit it. In the drawings:
fig. 1 is a flowchart of a load clustering method based on substation user configuration provided by the present invention.
Fig. 2 is a schematic structural diagram of a load clustering device based on substation user configuration according to the present invention.
Detailed Description
The present invention will be described in detail below with reference to the embodiments with reference to the attached drawings. It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.
The following detailed description is exemplary in nature and is intended to provide further details of the invention. Unless otherwise defined, all technical terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the invention.
Example 1
As shown in fig. 1, the present invention provides a load clustering method based on substation user composition, which specifically includes the following steps:
step 1: acquiring subordinate user load data and substation load data of a substation as load data, and performing data preprocessing and data dimension reduction on the subordinate user load data and the substation load data to obtain preprocessed user load data and preprocessed substation load data;
when the loads of the users subordinate to the substations are clustered, the selected data is the daily load characteristic curve of the low-voltage terminal users subordinate to each substation. When the substation loads are clustered, the selected data can be either the daily load curve of each substation load or the industry composition proportion of each substation.
During data acquisition, adverse conditions such as faulty acquisition terminals in the power system or blocked data transmission in the communication system cause the acquired load data to deviate from the original data; data acquired under such conditions is called abnormal data. Compared with the original data, abnormal data cannot accurately show the overall pattern of load change and may distort the clustering result. After the abnormal data in the data set has been identified and corrected, the load sequence needs to be normalized; normalization removes the influence of the magnitude of the load data on the overall change pattern, so that the shape of the load curve can be extracted.
Because load data is high-dimensional, using the original load data directly for cluster analysis leads to very long run times and very low efficiency. To improve the overall performance of the cluster analysis, appropriate feature selection or feature construction needs to be applied to the high-dimensional load data after abnormal data identification and correction and load data normalization.
The step 1 specifically comprises the following steps:
step 11: when the user loads subordinate to the transformer substation are subjected to cluster analysis, the selected data is the daily load characteristic curve of the low-voltage terminal user subordinate to each transformer substation. The curve is composed of real-time load data acquired at intervals of 15min, and the daily load characteristic curve has 96 points.
When the substation loads are subjected to cluster analysis, the selected data can be either the daily load curve of each substation load, formed by real-time load data acquired at intervals of 15 min with 96 points per curve, or the industry composition proportion of each substation.
Step 12: combining a curve replacement method with a threshold discrimination method to identify and correct abnormal data;
step 121: removing records with particularly large data loss and removing power generation users; a record with more than thirty percent of its data missing is regarded as having particularly large loss, and a user whose recorded active power is negative is regarded as a power generation user; then identifying load data that rises or falls sharply in the daily load curves of the low-voltage terminal users subordinate to the substation and in the daily load curves of the substation loads according to the following formula:
$\rho = \left| (p_d - p_{d-1}) / p_d \right|$
in the formula: $p_d$ is the data at a point in the load curve, $p_{d-1}$ is the data at the previous point in the load curve, and $\rho$ is the load change rate;
when the load change rate is greater than a preset threshold, the data is considered to be abnormal data. The threshold is selected by deriving the change patterns of loads with different industry attributes from historical data, setting a load fluctuation range for each time period, and using that range as the basis for the threshold range of the load data. The preset threshold chosen here is 30%. Abnormal data whose load change rate exceeds the threshold, as well as missing load data, are filled in by using a smoothing formula:
Figure BDA0003397682130000081
wherein a represents the number of reference points taken forward, b represents the number of reference points taken backward, and the values of a and b are between 4 and 6.
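The smoothing formula itself is only available as a figure and is not reproduced in this text. The sketch below therefore assumes one plausible form, namely replacing each flagged point by the mean of up to a valid earlier points and up to b valid later points; the function name `fill_by_neighbors`, the interpretation of "forward" as earlier and "backward" as later, and the averaging rule are all assumptions for illustration only.

```python
import numpy as np

def fill_by_neighbors(curve, bad_mask, a=4, b=4):
    """Replace flagged points by the mean of nearby unflagged points (assumed rule)."""
    p = np.asarray(curve, dtype=float)
    bad = np.asarray(bad_mask, dtype=bool)
    filled = p.copy()
    for d in np.flatnonzero(bad):
        # Up to `a` reference points before d and up to `b` after d, skipping flagged ones.
        earlier = [p[i] for i in range(max(0, d - a), d) if not bad[i]]
        later = [p[i] for i in range(d + 1, min(len(p), d + b + 1)) if not bad[i]]
        refs = earlier + later
        if refs:  # leave the point unchanged when no valid reference exists
            filled[d] = sum(refs) / len(refs)
    return filled

curve = [1.0, 1.0, 1.0, 1.0, 100.0, 1.0, 1.0, 1.0, 1.0]
mask = [False, False, False, False, True, False, False, False, False]
print(fill_by_neighbors(curve, mask)[4])  # 1.0
```

The mask is expected to cover both abnormal and missing points, so that no NaN value is used as a reference.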
Step 122: normalizing the subordinate user load data and the substation load data by the maximum value normalization method;
for the j-th user under the i-th substation, the maximum value normalization of the user load data is expressed as:
$x_{i,j} = p_{i,j} / \max(p_{i,j})$
where $x_{i,j}$ is the normalized load curve vector of the j-th user under the i-th substation; $p_{i,j}$ is the data to be normalized in the daily load curve of the j-th user under the i-th substation; and $\max(p_{i,j})$ is the maximum value in that user's daily load curve.
For the i-th substation, the maximum value normalization of the substation load data is expressed as:
$X_i = P_i / \max(P_i)$
where $X_i$ is the normalized value for the i-th substation; $P_i$ is the data to be normalized in the daily load curve of the i-th substation; and $\max(P_i)$ is the maximum value in the daily load curve of that substation;
the data preprocessing is completed through steps 121 to 122.
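Maximum value normalization as described in step 122 divides every curve by its own peak. A minimal sketch in Python with NumPy follows; the function name `max_normalize` is hypothetical, and each curve is assumed to have a positive peak.

```python
import numpy as np

def max_normalize(curves):
    """Divide each daily load curve (one per row) by its own maximum value."""
    P = np.atleast_2d(np.asarray(curves, dtype=float))
    peak = P.max(axis=1, keepdims=True)  # assumes every curve has a positive peak
    return P / peak

print(max_normalize([[2.0, 4.0, 8.0], [1.0, 5.0, 10.0]]))
# [[0.25 0.5  1.  ]
#  [0.1  0.5  1.  ]]
```

After this step every curve peaks at 1, so curves are compared by shape rather than by magnitude.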
Step 13: reducing the dimensionality of the data by the PCA (principal component analysis) method to obtain dimension-reduced subordinate user load data and substation load data;
Principal component analysis (PCA) aims to simplify a data set with minimal loss of its information content. Specifically, a few variables that can represent the attributes or characteristics of the data set are selected from the original data, the relations among them are found, and linear combinations of these variables are formed, turning the original data set into comprehensive variables represented by a few characteristic quantities, which are called principal components.
The method comprises the following basic steps:
step 131: sample data is standardized;
step 132: calculating a correlation coefficient matrix of the original data;
step 133: computing the eigenvalues and eigenvectors of the correlation coefficient matrix;
step 134: selecting the corresponding eigenvectors in order of eigenvalue from largest to smallest, each eigenvector giving the coefficients of one principal component;
step 135: calculating the cumulative variance contribution rate and determining the number of principal components;
step 136: writing out the expressions for the principal components.
The relationship between the principal component and the original variable can be expressed as: the principal component can retain the data information of the original variable to the maximum extent; the number of the main components is far smaller than the original data volume; the main components are independent from each other and are not related to each other; each selected principal component may be a linear combination of the original variables.
Let $x_1, x_2, \ldots, x_H$ be the H-dimensional daily load data and $v_{11}, v_{12}, v_{13}, \ldots, v_{HH}$ be the principal component coefficients obtained in step 134; principal component analysis converts the H observed quantities to be reduced into H comprehensive indexes by linear combination:
Figure BDA0003397682130000101
The above formula must satisfy the following conditions:
the sum of the squares of the coefficients of each principal component is equal to 1, i.e.
Figure BDA0003397682130000102
The principal components are independent of each other, namely: $\mathrm{Cov}(I_i, I_j) = 0$, $i \neq j$, $i, j = 1, 2, \ldots, H$
The importance of the principal components decreases in order, i.e., their variances decrease in order.
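The two formulas in this passage are only available as figures. For reference, the standard PCA form that matches the surrounding definitions (comprehensive indexes $I_i$, coefficients $v_{ij}$, variables $x_j$) can be written as below; this is a textbook form given as an assumption, not a transcription of the patent's figures.

```latex
\begin{aligned}
I_i &= v_{i1} x_1 + v_{i2} x_2 + \cdots + v_{iH} x_H, \qquad i = 1, 2, \ldots, H \\
\sum_{j=1}^{H} v_{ij}^2 &= 1, \qquad i = 1, 2, \ldots, H \\
\mathrm{Cov}(I_i, I_j) &= 0, \qquad i \neq j \\
\mathrm{Var}(I_1) &\geq \mathrm{Var}(I_2) \geq \cdots \geq \mathrm{Var}(I_H)
\end{aligned}
```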
Step 2: selecting a clustering effectiveness evaluation index;
the DBI index is used as the evaluation index of power load clustering;
The Davies-Bouldin index (DBI) measures the compactness within a class by the distances from the sample points in the class to the center of the class, and the dispersion between classes by the distances between class centers. It is defined as:
Figure BDA0003397682130000103
in the formula, $d_k$ and $d_h$ respectively represent the average distance from the data objects in the k-th class and the h-th class to the center of the corresponding class; $d_{k,h}$ represents the Euclidean distance between the class centers of the k-th class and the h-th class; $K$ is the number of clusters; and $I_{DBI}$ is the Davies-Bouldin index.
A smaller $I_{DBI}$ means better clustering.
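The index can be computed directly from a labelled clustering. The sketch below, in Python with NumPy, uses the standard Davies-Bouldin definition (the average over classes of the worst ratio of summed within-class distances to between-center distance), which is assumed to match the figure that is not reproduced here. The function name `davies_bouldin` is hypothetical, and the code assumes at least two classes, no empty class, and distinct class centers.

```python
import numpy as np

def davies_bouldin(X, labels, centers):
    """Standard Davies-Bouldin index; smaller values indicate better clustering."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    centers = np.asarray(centers, dtype=float)
    K = len(centers)
    # d[k]: average distance from the members of class k to its class center.
    d = np.array([np.linalg.norm(X[labels == k] - centers[k], axis=1).mean()
                  for k in range(K)])
    total = 0.0
    for k in range(K):
        # Worst-case ratio of within-class spread to center separation for class k.
        total += max((d[k] + d[h]) / np.linalg.norm(centers[k] - centers[h])
                     for h in range(K) if h != k)
    return total / K

X = [[0, 0], [0, 2], [10, 0], [10, 2]]
print(davies_bouldin(X, [0, 0, 1, 1], [[0, 1], [10, 1]]))  # 0.2
```

Evaluating this index for several candidate cluster numbers and keeping the one with the smallest value is one way to use it as the selection basis mentioned in the text.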
Step 3: clustering the preprocessed user load data obtained in step 1 by using a K-means algorithm improved based on the MaxMin principle; determining the number of categories, and obtaining a clustering result of the subordinate users of the substation; analyzing the clustering result of the subordinate users of the substation according to the clustering effectiveness evaluation index to obtain the constituent components of the substation;
the k-means clustering algorithm has a simple flow, converges quickly, is efficient and is easy to extend, as tests on a simple data set confirm, and it is selected here as the basic clustering algorithm. Its main disadvantage is that, once the initial cluster centers are selected improperly, the clustering as a whole falls into a local optimal solution.
To address this problem, a k-means algorithm modified based on the MaxMin principle is proposed to select the initial class centers accurately when clustering the large data set of user load data under a substation. The core idea of the MaxMin principle is as follows: if the data set contains compact classes, the Euclidean distances between objects within each class must be very small, while objects in different classes lie far apart. The specific procedure is as follows: calculate the Euclidean distance between every pair of objects, and select the two objects $x_r$, $x_s$ with the largest Euclidean distance as the initial class centers $c_1$, $c_2$; if the number of selected centers k is less than the predetermined number of clusters K, calculate, for every object not yet selected as an initial class center, its distance to the set c of class centers already determined, which is defined as
Figure BDA0003397682130000111
Select the object with the largest distance $d_c(x)$ as the new initial class center $c_{t+1}$, and repeat until the required number of class centers has been selected.
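The selection procedure can be sketched in Python as follows. The sketch assumes that the distance from an object to the set of already-chosen centers is the distance to its nearest chosen center, which is the usual MaxMin definition; the formula itself appears only as an image in the source. The function name `maxmin_initial_centers` is illustrative.

```python
import math
from itertools import combinations

def maxmin_initial_centers(samples, n_clusters):
    """Pick initial class centers by the MaxMin rule."""
    if not 2 <= n_clusters <= len(samples):
        raise ValueError("n_clusters must be between 2 and len(samples)")
    # The two objects with the largest pairwise distance become c1 and c2.
    r, s = max(combinations(range(len(samples)), 2),
               key=lambda p: math.dist(samples[p[0]], samples[p[1]]))
    chosen = [r, s]
    while len(chosen) < n_clusters:
        # Assumed definition of d_c(x): distance from x to its nearest
        # already-chosen center.
        def d_c(i):
            return min(math.dist(samples[i], samples[c]) for c in chosen)
        candidates = [i for i in range(len(samples)) if i not in chosen]
        # The object farthest from the chosen set becomes the next center.
        chosen.append(max(candidates, key=d_c))
    return [samples[i] for i in chosen]

# Hypothetical toy data.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 8.0)]
print(maxmin_initial_centers(points, 3))
```

Computing all pairwise distances costs quadratic time in the number of objects, so for large data sets the first step may need to be run on a subsample.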
The clustering algorithm for the load data of the substation's subordinate users takes the mean square error as its objective function, expressed as follows:
Figure BDA0003397682130000112
where K is the number of clusters, M is the total number of samples, and $d(c_m, x_n)$ is the Euclidean distance from the center $c_m$ of the m-th class to the n-th sample; $u_{mn}$ is a binary decision variable: $u_{mn} = 1$ means that the n-th sample belongs to the m-th class, and $u_{mn} = 0$ means that it does not. To ensure that each sample is assigned to exactly one class, $u_{mn}$ must satisfy the following constraints:
Figure BDA0003397682130000121
Figure BDA0003397682130000122
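The objective and its constraints appear only as images in the source. Under the definitions given above, a conventional way to write a squared-error objective with hard assignments is shown below. This is a reconstruction under that assumption, not a transcription of the original formulas; the symbol $J$ for the objective value is chosen here.

```latex
J = \sum_{m=1}^{K} \sum_{n=1}^{M} u_{mn}\, d(c_m, x_n)^2,
\qquad
u_{mn} \in \{0, 1\},
\qquad
\sum_{m=1}^{K} u_{mn} = 1 \quad (n = 1, \dots, M).
```

The first constraint makes each assignment binary, and the second forces every sample into exactly one class.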
Step 3 specifically comprises the following steps:
Step 31: Initialization: randomly select K samples from the dimension-reduced load data of the substation's subordinate users obtained in step 13, and select the initial class centers according to the following formula:
Figure BDA0003397682130000123
Sample classification: assign every sample to its nearest class center; samples sharing the same class center form one class:
Figure BDA0003397682130000124
Step 32: Class update: update the class centers according to the partition obtained in step 31:
Figure BDA0003397682130000125
Step 33: Judge whether the convergence condition is met; if not, repeat steps 31 to 32. Select the optimal number of clusters according to the clustering effectiveness evaluation index obtained in step 2.
In step 3, the clustering result for the subordinate users of the substation is analyzed to obtain the constituent components of the substation.
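Steps 31 to 33 can be sketched as follows. This is a minimal illustration under stated assumptions: the initial class centers are supplied by the caller (for example, chosen by the MaxMin rule), and the convergence condition, which the text does not spell out, is taken to be that no class center moves by more than a small tolerance `tol`. The function name `kmeans` and its parameters are illustrative.

```python
import math

def kmeans(samples, initial_centers, max_iter=100, tol=1e-9):
    """Plain k-means iterations starting from the given class centers."""
    centers = [list(c) for c in initial_centers]
    labels = []
    for _ in range(max_iter):
        # Step 31 (classification): assign each sample to its nearest center.
        labels = [min(range(len(centers)), key=lambda m: math.dist(x, centers[m]))
                  for x in samples]
        # Step 32 (class update): each center becomes the mean of its members;
        # a center with no members is left where it is.
        new_centers = []
        for m, old in enumerate(centers):
            pts = [x for x, l in zip(samples, labels) if l == m]
            new_centers.append([sum(col) / len(pts) for col in zip(*pts)] if pts else old)
        # Step 33 (assumed convergence test): stop once no center moves
        # by more than tol.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return labels, centers

# Hypothetical toy data with two obvious groups.
data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
print(kmeans(data, [(0.0, 0.0), (10.0, 0.0)]))
```

To choose the number of clusters, this routine would be run for several candidate values of K and the clustering effectiveness evaluation index compared across the runs.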
Step 4: According to the constituent components of the substations and the preprocessed substation load data, cluster the upper-level substation load data using a K-means algorithm with automatic weight updating, determine the number of categories, and obtain the final clustering result.
Tests on simple data sets show that the k-means clustering algorithm has a simple procedure, converges quickly, is efficient, and is easy to extend, so it is selected as the base clustering algorithm here. Its main disadvantage is that, if the initial class centers are chosen poorly, the overall clustering becomes trapped in a local optimum.
When clustering the upper-level substations, the upper-level clustering model builds on the substation constituent components obtained by analyzing the clustering result of the substation's subordinate users, and jointly considers two factors: the load characteristics of the substation and the substation composition derived from the load clustering result of its subordinate users. The data to be clustered for the i-th substation is recorded as $O_i = (X_i, R_i)$, where $X_i$ is the normalized vector of the i-th substation's load curve and $R_i$ is the constituent component vector of the substation's load. $X_i$ and $R_i$ lie in feature spaces of different dimensions and different topologies. To characterize this difference between feature spaces, the upper-level substation clustering model uses a Euclidean distance with assigned weights $w = [w_1, w_2]$:
Figure BDA0003397682130000131
In the formula, $w_1$ is the weight of the substation load characteristic curve and $w_2$ is the weight of the substation constituent components; the two weights must satisfy the condition $w_1 + w_2 = 1$. The weight control parameter in the formula,
Figure BDA0003397682130000132
is specified by the user. The reason for introducing the weight control parameter is as follows. Because the influence of both the daily load curve vector and the constituent proportion vector on substation load clustering is considered jointly, the values of $w_1$ and $w_2$ must be determined in order to distinguish how strongly the two different spaces (the total load characteristic and the constituent components of the substation) affect the final clustering result, and to balance their influence on the substation clustering. The weight control parameter serves as the basis for selecting the values of $w_1$ and $w_2$.
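The weighted distance itself appears only as an image in the source. One form consistent with the description (two weights that sum to 1, each raised to a user-specified control parameter) is the weighted squared distance used in weighted k-means variants. The following is a hedged reconstruction: the symbol $\beta$ for the weight control parameter, and the names $C_m^{X}$ and $C_m^{R}$ for the load-curve part and the composition part of the m-th class center, are assumptions made here.

```latex
d_w(O_i, C_m) = w_1^{\beta}\,\lVert X_i - C_m^{X} \rVert^2
              + w_2^{\beta}\,\lVert R_i - C_m^{R} \rVert^2,
\qquad w_1 + w_2 = 1 .
```

Under this form, with $\beta > 1$, a smaller weight on one feature space reduces that space's contribution to the distance more than proportionally, which is how the control parameter shapes the balance between the load curve and the composition.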
The optimization objective of the upper-level substation clustering model is constructed from the perspective of density, specifically:
Figure BDA0003397682130000133
where $c_{up}$ denotes the number of clusters of upper-level substations and $C_m$ denotes the class center of the m-th class of substations. Expanding the above equation gives:
Figure BDA0003397682130000134
The optimization objective of the upper-level substations is solved with an alternating optimization strategy. Let U be an $S \times c_2$ classification matrix whose element in row n and column m is $u_{m,n}$, and let
Figure BDA0003397682130000135
Figure BDA0003397682130000136
denote the sets of class-center vectors for the substation load data and for the composition data, respectively.
Let
Figure BDA0003397682130000141
be denoted as $F(w, U, T_X, T_R)$. The optimization model can then be solved by cyclically solving the following three optimization subproblems:
(1) Fix the weight
Figure BDA0003397682130000142
and the classification matrix
Figure BDA0003397682130000143
and solve
Figure BDA0003397682130000144
(2) Fix the weight
Figure BDA0003397682130000145
and the class centers
Figure BDA0003397682130000146
and solve
Figure BDA0003397682130000147
(3) Fix the classification matrix
Figure BDA0003397682130000148
and the class centers
Figure BDA0003397682130000149
and solve
Figure BDA00033976821300001410
The solution to problem (1) is:
Figure BDA00033976821300001411
The solution to problem (2) is:
Figure BDA00033976821300001412
Problem (3) determines the values of $w_1$ and $w_2$, which balance the influence of the load curve and of the composition on the substation clustering. The Lagrange multiplier method finds the extrema of a multivariate function whose variables are subject to one or more constraints. Here, problem (3) of solving for the weight w is solved with the Lagrange multiplier method, which gives the relationship between the weight w and the weight control parameter
Figure BDA00033976821300001413
as follows.
When the weight control parameter
Figure BDA00033976821300001414
is not 1, the following holds:
Figure BDA0003397682130000151
where t = 1, 2, and:
Figure BDA0003397682130000152
When the weight control parameter satisfies
Figure BDA0003397682130000153
the following holds:
Figure BDA0003397682130000154
It can be seen that $w_1$ and $w_2$ are functions of the weight control parameter
Figure BDA0003397682130000155
When the weights are updated according to the above three formulas, the weight control parameter
Figure BDA0003397682130000156
should take an integer value not equal to 0, and its final value can be determined through repeated experiments.
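The weight formulas appear only as images in the source, but the Lagrange multiplier step can be illustrated with a short worked derivation. The sketch assumes that, with the classification matrix and class centers fixed, the objective reduces to $\sum_t w_t^{\beta} D_t$, where $D_t$ is the within-class dispersion accumulated in feature space t (t = 1 for the load curve, t = 2 for the composition). The symbols $\beta$, $D_t$ and $\lambda$ are assumptions introduced here.

```latex
\begin{aligned}
&\min_{w}\ \sum_{t=1}^{2} w_t^{\beta} D_t
\quad \text{s.t.}\quad \sum_{t=1}^{2} w_t = 1, \\
&L(w, \lambda) = \sum_{t=1}^{2} w_t^{\beta} D_t
  + \lambda \Bigl( \sum_{t=1}^{2} w_t - 1 \Bigr), \\
&\frac{\partial L}{\partial w_t} = \beta\, w_t^{\beta - 1} D_t + \lambda = 0
\ \Longrightarrow\ w_t \propto D_t^{-1/(\beta - 1)}, \\
&w_t = \frac{1}{\sum_{j=1}^{2} \left( D_t / D_j \right)^{1/(\beta - 1)}}
\quad (\beta \neq 1), \\
&w_t = \begin{cases} 1, & D_t \le D_j \ \text{for all } j \\ 0, & \text{otherwise} \end{cases}
\quad (\beta = 1).
\end{aligned}
```

For $\beta = 1$ the objective is linear in w, so the minimum lies at a vertex of the constraint set and all the weight goes to the feature space with the smaller dispersion. For $\beta > 1$ the feature space with the smaller dispersion receives the larger weight, but both weights stay strictly positive as long as both dispersions are positive.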
Step 4 specifically comprises the following steps:
Step 41: Initialization;
Step 42: Object classification: assign each data object to a class center;
Step 43: Class center update: update the class centers according to the partition result;
Step 44: Weight update;
Step 45: Judge whether the convergence condition is met; if so, end; otherwise, repeat the above steps.
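Steps 41 to 45 can be put together as in the following Python sketch. Several details are not legible in the text and are assumed here: the weighted squared distance `w1**beta * ||X - CX||^2 + w2**beta * ||R - CR||^2`, a closed-form weight update obtained from the Lagrange conditions, equal initial weights, initial centers taken from caller-chosen samples, and "assignments unchanged" as the convergence condition. The name `weighted_kmeans` and all parameters are hypothetical.

```python
import math

def weighted_kmeans(loads, comps, init_idx, beta=2, max_iter=50):
    """Two-feature-space k-means with automatic weight updating (illustrative)."""
    def sq(a, b):
        return math.dist(a, b) ** 2

    def mean(points):
        return [sum(col) / len(points) for col in zip(*points)]

    # Step 41: initialization (equal weights, centers copied from chosen samples).
    w = [0.5, 0.5]
    cx = [list(loads[i]) for i in init_idx]
    cr = [list(comps[i]) for i in init_idx]
    labels = None
    for _ in range(max_iter):
        # Step 42: assign each substation to the nearest center (weighted distance).
        new_labels = [
            min(range(len(cx)),
                key=lambda m: w[0] ** beta * sq(x, cx[m]) + w[1] ** beta * sq(r, cr[m]))
            for x, r in zip(loads, comps)
        ]
        # Step 45 (assumed test): stop when the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 43: recompute both sets of class centers from the partition.
        for m in range(len(cx)):
            idx = [n for n, l in enumerate(labels) if l == m]
            if idx:
                cx[m] = mean([loads[n] for n in idx])
                cr[m] = mean([comps[n] for n in idx])
        # Step 44: update the weights from the dispersions of the two spaces.
        D = [sum(sq(loads[n], cx[l]) for n, l in enumerate(labels)),
             sum(sq(comps[n], cr[l]) for n, l in enumerate(labels))]
        if min(D) > 0:  # keep the old weights if a dispersion is zero
            if beta == 1:
                w = [1.0, 0.0] if D[0] <= D[1] else [0.0, 1.0]
            else:
                e = 1.0 / (beta - 1)
                w = [1.0 / sum((d_t / d_j) ** e for d_j in D) for d_t in D]
    return labels, w
```

In this sketch `loads` holds the normalized load-curve vectors and `comps` the composition vectors, one pair per substation. The feature space with the smaller within-class dispersion ends up with the larger weight when `beta` is greater than 1.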
Example 2
As shown in fig. 2, the present invention further provides a load clustering device based on substation user constitution, comprising:
a load data selection and processing module, configured to acquire the load data of the substation's subordinate users and the substation load data as load data, and to perform data preprocessing and data dimension reduction on both to obtain preprocessed user load data and preprocessed substation load data;
an evaluation index selection module, configured to select the clustering effectiveness evaluation index;
a user load data clustering module, configured to cluster the preprocessed user load data obtained in step 1 using a K-means algorithm improved on the basis of the MaxMin principle, determine the number of categories, obtain the clustering result for the subordinate users of the substation, and analyze this clustering result according to the clustering effectiveness evaluation index to obtain the constituent components of the substation;
and a substation load clustering module, configured to cluster the upper-level substation load data using a K-means algorithm with automatic weight updating, according to the substation constituent components and the preprocessed substation load data, determine the number of categories, and obtain the final clustering result.
Example 3
The invention also provides computer equipment which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the load clustering method based on the substation user constitution when executing the computer program.
Example 4
The invention also provides a computer readable storage medium, which stores a computer program, and the computer program realizes the load clustering method based on substation user composition when being executed by a processor.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and although the present invention is described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.

Claims (10)

1. A load clustering method based on substation user constitution, characterized by comprising the following steps:
step 1: acquiring the load data of the substation's subordinate users and the substation load data as load data, and performing data preprocessing and data dimension reduction on both to obtain preprocessed user load data and preprocessed substation load data;
step 2: selecting a clustering effectiveness evaluation index;
step 3: clustering the preprocessed user load data obtained in step 1 using a K-means algorithm improved on the basis of the MaxMin principle; determining the number of categories and obtaining the clustering result for the subordinate users of the substation; analyzing this clustering result according to the clustering effectiveness evaluation index to obtain the constituent components of the substation;
step 4: clustering the upper-level substation load data using a K-means algorithm with automatic weight updating, according to the constituent components of the substations and the preprocessed substation load data, determining the number of categories, and obtaining the final clustering result.
2. The load clustering method based on substation user constitution according to claim 1, wherein the substation subordinate user load data in step 1 is the daily load characteristic curve of each low-voltage end user under the substation; the curve consists of real-time load data acquired at 15-minute intervals, giving 96 points per daily load characteristic curve;
the substation load data is the daily load curve of each substation; the curve consists of real-time load data acquired at 15-minute intervals, giving 96 points per daily load curve; alternatively, the industry composition proportion of each substation is selected.
3. The load clustering method based on substation user constitution according to claim 2, wherein performing data preprocessing on the substation subordinate user load data and the substation load data in step 1 comprises:
step 121: removing data records in which more than thirty percent of the data is missing, and removing power-generation users, where a user whose recorded active power is negative is regarded as a power-generation user; then identifying load data that shows an abrupt increase or decrease in the daily load curves of the low-voltage end users under the substation and in the daily load curves of the substation load, according to the following formula:
$\rho = \left| (p_d - p_{d-1}) / p_d \right|$
where $p_d$ is the data at a given point of the load curve, $p_{d-1}$ is the data at the preceding point of the load curve, and $\rho$ is the load change rate;
when the load change rate is larger than a preset threshold, the data is regarded as abnormal; the preset threshold is 30%; abnormal data whose load change rate does not meet the requirement, as well as missing load data, are filled in using a smoothing formula:
Figure FDA0003397682120000021
wherein a represents the number of reference points taken forward, b represents the number of reference points taken backward, and the values of a and b are between 4 and 6;
step 122: normalizing the load data of the substation's subordinate users and the substation load data using the maximum-value normalization method;
for the j-th user under the i-th substation, the maximum-value normalization of the user load data is expressed as:
$x_{i,j} = p_{i,j} / \max(p_{i,j})$
where $x_{i,j}$ is the normalized load-curve vector of the j-th user under the i-th substation; $p_{i,j}$ is the data to be normalized in the daily load curve of the j-th user under the i-th substation; and $\max(p_{i,j})$ is the maximum value in the daily load curve of that low-voltage end user;
for the i-th substation, the maximum-value normalization of the substation load data is expressed as:
$X_i = P_i / \max(P_i)$
where $X_i$ is the normalized value for the i-th substation; $P_i$ is the data to be normalized in the daily load curve of the i-th substation; and $\max(P_i)$ is the maximum value in the daily load curve of the substation load;
the data preprocessing is completed through steps 121 to 122.
4. The load clustering method based on substation user constitution according to claim 3, wherein performing data dimension reduction on the substation subordinate user load data and the substation load data in step 1 comprises:
step 13: reducing the dimensionality of the data using the PCA (principal component analysis) method to obtain the dimension-reduced load data of the substation's subordinate users and the dimension-reduced substation load data, specifically comprising the following steps:
step 131: standardizing the sample data;
step 132: calculating the correlation coefficient matrix of the original data;
step 133: solving for the eigenvalues and eigenvectors of the correlation coefficient matrix;
step 134: selecting the corresponding eigenvectors in descending order of eigenvalue, each eigenvector giving the coefficients of one principal component;
step 135: calculating the cumulative variance contribution rate and determining the number of principal components;
step 136: writing out the expressions of the principal components.
5. The load clustering method based on substation user constitution according to claim 4, wherein step 2 comprises:
taking the Davies-Bouldin index as the evaluation index for power load clustering:
Figure FDA0003397682120000031
in the formula, $d_k$ and $d_j$ denote the average distance from the data objects in the k-th and j-th classes, respectively, to the center of their own class; $d_{k,h}$ denotes the Euclidean distance between the class centers of the k-th and h-th classes; K is the number of clusters; and $I_{DBI}$ is the Davies-Bouldin index.
6. The load clustering method based on substation user constitution according to claim 5, wherein clustering the load data of the substation's subordinate users in step 3 comprises:
step 31: initialization: randomly selecting K samples from the dimension-reduced load data of the substation's subordinate users obtained in step 13, and selecting the initial class centers according to the following formula:
Figure FDA0003397682120000032
sample classification: assigning every sample to its nearest class center, samples sharing the same class center forming one class:
Figure FDA0003397682120000041
step 32: class update: updating the class centers according to the partition obtained in step 31:
Figure FDA0003397682120000042
step 33: judging whether the convergence condition is met, and if not, repeating steps 31 to 32; the clustering effectiveness evaluation index obtained in step 2 is used as the basis for selecting the optimal number of clusters.
7. The load clustering method based on substation user constitution according to claim 6, wherein clustering the upper-level substation load data in step 4 comprises the following steps:
step 41: initialization;
step 42: object classification: assigning each data object to a class center;
step 43: class center update: updating the class centers according to the partition result;
step 44: weight update;
step 45: judging whether the convergence condition is met; if so, ending; otherwise, repeating the above steps.
8. A load clustering device based on substation user constitution, characterized by comprising:
a load data selection and processing module, configured to acquire the load data of the substation's subordinate users and the substation load data as load data, and to perform data preprocessing and data dimension reduction on both to obtain preprocessed user load data and preprocessed substation load data;
an evaluation index selection module, configured to select the clustering effectiveness evaluation index;
a user load data clustering module, configured to cluster the preprocessed user load data obtained in step 1 using a K-means algorithm improved on the basis of the MaxMin principle, determine the number of categories, obtain the clustering result for the subordinate users of the substation, and analyze this clustering result according to the clustering effectiveness evaluation index to obtain the constituent components of the substation;
and a substation load clustering module, configured to cluster the upper-level substation load data using a K-means algorithm with automatic weight updating, according to the substation constituent components and the preprocessed substation load data, determine the number of categories, and obtain the final clustering result.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing a method of substation user composition based load clustering according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements a method of substation user composition-based load clustering according to any one of claims 1 to 7.
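The curve-level part of the preprocessing recited in claim 3 (steps 121 to 122) can be sketched as follows. This is an illustration under stated assumptions rather than the claimed implementation: the smoothing formula appears only as an image, so it is assumed here to be the mean of up to `a` earlier and `b` later points that are not themselves flagged; missing points are represented by `None`; and the removal of users with too much missing data and of power-generation users is left out. The function name `preprocess_curve` is hypothetical.

```python
def preprocess_curve(curve, threshold=0.30, a=4, b=4):
    """Repair and normalize one daily load curve (None marks a missing point)."""
    n = len(curve)
    # A point is flagged if it is missing or if its change rate
    # rho = |(p_d - p_{d-1}) / p_d| exceeds the threshold.
    flagged = [p is None for p in curve]
    for d in range(1, n):
        p, prev = curve[d], curve[d - 1]
        if p is not None and prev is not None and p != 0:
            if abs((p - prev) / p) > threshold:
                flagged[d] = True
    repaired = list(curve)
    for d in range(n):
        if flagged[d]:
            # Assumed smoothing rule: mean of the unflagged points among
            # the a points before d and the b points after d.
            lo, hi = max(0, d - a), min(n, d + b + 1)
            refs = [curve[i] for i in range(lo, hi) if i != d and not flagged[i]]
            if refs:
                repaired[d] = sum(refs) / len(refs)
    # Maximum-value normalization: x = p / max(p); assumes a positive peak.
    peak = max(p for p in repaired if p is not None)
    return [None if p is None else p / peak for p in repaired]

# Hypothetical five-point curve with one spike.
print(preprocess_curve([10.0, 10.0, 20.0, 10.0, 10.0]))
```

Applied literally, the change-rate formula flags both a spike and the point immediately after it, because the drop back to the normal level is itself a large relative change; the sketch repairs both points from their unflagged neighbours.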
CN202111488944.4A 2021-12-07 2021-12-07 Load clustering method, device, equipment and medium based on transformer substation user constitution Pending CN114429172A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111488944.4A CN114429172A (en) 2021-12-07 2021-12-07 Load clustering method, device, equipment and medium based on transformer substation user constitution

Publications (1)

Publication Number Publication Date
CN114429172A true CN114429172A (en) 2022-05-03

Family

ID=81311562

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111488944.4A Pending CN114429172A (en) 2021-12-07 2021-12-07 Load clustering method, device, equipment and medium based on transformer substation user constitution

Country Status (1)

Country Link
CN (1) CN114429172A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115859452A (en) * 2023-02-20 2023-03-28 湖南大学 Transferable load modeling method, device, equipment and medium based on data driving

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106651020A (en) * 2016-12-16 2017-05-10 燕山大学 Short-term power load prediction method based on big data reduction
CN107463738A (en) * 2017-07-26 2017-12-12 浙江大学 A kind of two layers of clustering method of transformer station's load for considering to form
CN110796173A (en) * 2019-09-27 2020-02-14 昆明电力交易中心有限责任公司 Load curve form clustering algorithm based on improved kmeans

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
程祥: "基于负荷量测数据的电力负荷聚类方法研究", 中国优秀硕士学位论文数据库 工程科技II辑, no. 7, pages 2 - 4 *
蒋正邦;吴浩;程祥;孙维真;商佳宜;: "基于多元聚类模型与两阶段聚类修正算法的变电站特性分析", 电力系统自动化, no. 15 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination