CN117407737A - Cloud load library establishment method based on clustering - Google Patents

Cloud load library establishment method based on clustering

Info

Publication number
CN117407737A
Authority
CN
China
Prior art keywords
distribution transformer
clustering
data
distribution
distance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310599457.8A
Other languages
Chinese (zh)
Inventor
陆洋
高久国
徐峰
钱卫杰
刘承宗
杨超
赵健
陈子靖
徐斌
鲍雨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Zhejiang Electric Power Co Ltd Anji County Power Supply Co
Shanghai University of Electric Power
Huzhou Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Original Assignee
State Grid Zhejiang Electric Power Co Ltd Anji County Power Supply Co
Shanghai University of Electric Power
Huzhou Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Zhejiang Electric Power Co Ltd Anji County Power Supply Co, Shanghai University of Electric Power, Huzhou Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Priority to CN202310599457.8A
Publication of CN117407737A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Probability & Statistics with Applications (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a clustering-based method for establishing a cloud load library, comprising the following steps: acquiring distribution transformer load data as clustering samples; preprocessing the distribution transformer load data; establishing photovoltaic-storage-charging and industrial, agricultural, and commercial characteristic indexes for each distribution transformer; reducing the dimension of the load data to obtain low-dimensional electricity-consumption feature vectors; clustering the distribution transformer feature vectors; and analyzing the composition of each distribution transformer cluster to establish a typical distribution transformer cloud load library. In this technical scheme, a large volume of distribution transformer load data containing different proportions of distributed photovoltaic, storage, and charging facilities and of industrial, agricultural, and commercial users is selected as samples; clustering is performed with a dual-scale distance metric clustering algorithm based on the distribution transformer feature set; finally, the composition of each cluster center is analyzed by combining the clustering result with the distribution transformer information, and each cluster center is taken as a typical distribution transformer type to form the typical distribution transformer cloud load library.

Description

Cloud load library establishment method based on clustering
Technical Field
The invention relates to the technical field of power systems, in particular to a cloud load library establishment method based on clustering.
Background
In a power distribution network, a distribution transformer usually refers to a power transformer with a voltage class of 10-35 kV that supplies power directly to end users, so it plays an important role in power supply. Because the mix of consumers served by each distribution transformer differs, and with it the load characteristics and power demand, an accurate and efficient method of analysis and management is lacking. A distribution transformer cloud load library is a database, extracted from massive distribution transformer load data, of load curves that represent typical electricity-consumption characteristics; establishing it is the basis for load forecasting, demand-side management, and other research on distribution transformers.
Available data show that, with the spread of intelligent metering equipment, the dimension of distribution transformer load data keeps growing; traditional clustering methods perform poorly on high-dimensional data, which makes it difficult to establish an accurate distribution transformer cloud load library. A clustering-based cloud load library establishment method is therefore proposed.
Chinese patent document CN103533011A discloses a cloud-based intelligent terminal data configuration method and system. The system architecture consists mainly of AP equipment and a cloud radio-frequency optimization module. The AP equipment comprises a wireless scanning module, a data collection module, a data transmitting module, a data receiving module, and a radio-frequency configuration module; the cloud radio-frequency optimization module comprises a cloud data receiving module, a cloud data caching module, a cloud data loading module, a cloud data calculation module, a cloud data issuing module, a cloud timing module, and a log storage module. That technical scheme does not address the technical problems that distribution transformers have diverse load characteristics and that a distribution transformer load library is difficult to build.
Disclosure of Invention
The invention mainly addresses the technical problems that existing schemes cannot cope with the diverse load characteristics of distribution transformers and that a distribution transformer load library is therefore difficult to establish. It provides a clustering-based cloud load library establishment method that selects a large volume of distribution transformer load data containing different proportions of distributed photovoltaic, storage, and charging facilities and of industrial, agricultural, and commercial users as samples, and establishes the distribution transformer load library with a dual-scale distance metric clustering algorithm based on the distribution transformer feature set. After the raw data are preprocessed, photovoltaic-storage-charging and industrial, agricultural, and commercial characteristic indexes are established for each distribution transformer, and a deep convolutional autoencoder reduces the dimension of the load data to obtain low-dimensional electricity-consumption feature vectors. Clustering is then performed with the dual-scale distance metric clustering algorithm based on the distribution transformer feature set. Finally, the composition of each cluster center is analyzed by combining the clustering result with the distribution transformer information, and each cluster center is taken as a typical distribution transformer type to form the typical distribution transformer cloud load library.
The technical problems of the invention are mainly solved by the following technical scheme. The invention comprises the following steps:
S1, acquiring distribution transformer load data as clustering samples;
S2, preprocessing the distribution transformer load data;
S3, establishing photovoltaic-storage-charging and industrial, agricultural, and commercial characteristic indexes for each distribution transformer;
S4, reducing the dimension of the load data to obtain low-dimensional electricity-consumption feature vectors;
S5, clustering the distribution transformer feature vectors;
S6, analyzing the composition of each distribution transformer cluster and establishing a typical distribution transformer cloud load library.
A large volume of distribution transformer load data containing different proportions of distributed photovoltaic, storage, and charging facilities and of industrial, agricultural, and commercial users is selected as samples, and the distribution transformer load library is established with a dual-scale distance metric clustering algorithm based on the distribution transformer feature set. After the raw data are preprocessed, photovoltaic-storage-charging and industrial, agricultural, and commercial characteristic indexes are established for each distribution transformer, and a deep convolutional autoencoder reduces the dimension of the load data to obtain low-dimensional electricity-consumption feature vectors. Clustering is then performed with the dual-scale distance metric clustering algorithm based on the distribution transformer feature set. Finally, the composition of each cluster center is analyzed by combining the clustering result with the distribution transformer information, and each cluster center is taken as a typical distribution transformer type to form the typical distribution transformer cloud load library.
Preferably, the distribution transformer load data comprise load data of distribution transformers with different proportions of distributed photovoltaic, storage, and charging facilities and of industrial, agricultural, and commercial users.
Preferably, the preprocessing in step S2 specifically includes cleaning the data, filling missing values, checking and correcting abnormal data, and forming the distribution transformer load matrix $P$ from the preprocessed data.
Preferably, step S3 specifically includes, according to the distribution transformer type, formulating photovoltaic-storage-charging characteristic indexes for distribution transformers that include photovoltaic, storage, and charging facilities, and formulating industrial, agricultural, and commercial characteristic indexes for distribution transformers that serve industrial, agricultural, and commercial users.
Preferably, step S4 specifically includes establishing a convolutional autoencoder, reconstructing the data with the distribution transformer load data as input, taking the dimension-reduced hidden-layer vector $h$ of each distribution transformer as the extracted feature, and combining it with the distribution transformer characteristic indexes to obtain the distribution transformer feature set $U$.
Preferably, clustering the distribution transformer feature vectors in step S5 specifically includes clustering the photovoltaic-storage-charging and industrial, agricultural, and commercial characteristic indexes together with the low-dimensional electricity-consumption feature vectors using the dual-scale distance metric clustering algorithm based on the distribution transformer feature set, as follows:
S3.1, calculating the density $\rho$ of each point in the distribution transformer feature set $U$, taking the point with the maximum density as the first cluster center $C_1$, and removing the data near $C_1$;
S3.2, repeating step S3.1 on the remaining data of the distribution transformer feature set $U$ to obtain further cluster centers until no data remain, and taking the resulting cluster centers $C$ as the initial cluster centers of the K-means algorithm;
S3.3, introducing a dual-scale distance metric to calculate the distance between samples in the K-means algorithm;
S3.4, clustering the data set $U$ based on the dual-scale distance metric to obtain the clustering result.
Preferably, step S3.1 specifically includes the following. Let the number of samples in the distribution transformer feature set $U$ be $n$ and the $i$-th sample be $u_i$, an $m$-dimensional vector $u_i = [u_{i,1}, \ldots, u_{i,m}]$. The density $\rho_i$ of each point in $U$ is calculated from the following quantities: $d_{wd}(u_i, u_j)$ is the weighted Euclidean distance between sample points $u_i$ and $u_j$; $w_k$ is the weight of the $k$-th feature; and $\mathrm{MeanDis}(U)$ is the average weighted distance over all samples in the data set $U$.
The point with the maximum density in $U$ is taken as the first cluster center $C_1$, so that the set of cluster centers becomes $C = \{C_1\}$; at the same time, the points in $U$ whose distance from $C_1$ is smaller than $\mathrm{MeanDis}(U)$ are removed.
Preferably, step S3.2 specifically includes calculating the density $\rho_i$ of the remaining data in $U$ and selecting the sample point with the largest $\rho_i$ as the second cluster center $C_2$, so that the set of cluster centers becomes $C = \{C_1, C_2\}$; at the same time, the points in $U$ whose distance from $C_2$ is smaller than $\mathrm{MeanDis}(U)$ are removed. These steps are repeated until no data remain in $U$, and the resulting cluster centers $C$ are selected as the initial cluster centers of the K-means algorithm.
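The formulas for the weighted distance, the average distance, and the density do not survive in this text. One common formulation that is consistent with the definitions above, offered here only as an assumption and not as the patent's own equations, is the following. The indicator function $f$ and the pairwise normalization $2/(n(n-1))$ are choices made for this sketch.

```latex
\begin{aligned}
d_{wd}(u_i,u_j) &= \sqrt{\sum_{k=1}^{m} w_k\,(u_{i,k}-u_{j,k})^{2}} \\
\mathrm{MeanDis}(U) &= \frac{2}{n(n-1)}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n} d_{wd}(u_i,u_j) \\
\rho_i &= \sum_{j=1}^{n} f\bigl(d_{wd}(u_i,u_j)-\mathrm{MeanDis}(U)\bigr),
\qquad f(z)=\begin{cases}1, & z<0\\ 0, & z\ge 0\end{cases}
\end{aligned}
```

Under this reading, $\rho_i$ counts the samples that lie closer to $u_i$ than the average pairwise distance, so the first cluster center sits in the most crowded region of $U$.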
Preferably, step S3.3 specifically includes introducing a dual-scale distance metric to calculate the distance between samples in the K-means algorithm. The distance based on the dual-scale distance metric is calculated as:
$$d_{tsd}(u_i, u_j) = \alpha\, d_{wd}(u_i, u_j) + \beta\, d_{fd}(u_i, u_j)$$
where $u_i$ and $u_j$ are the samples whose distance is to be calculated; $d_{wd}(u_i, u_j)$ is the weighted Euclidean distance between $u_i$ and $u_j$; $\alpha$ and $\beta$ are the weight coefficients of the two distance measures; and $d_{fd}(u_i, u_j)$ is the Fréchet distance between $u_i$ and $u_j$.
In the Fréchet distance, $\gamma$ and $\eta$ are reparameterization functions of the unit interval, the distance is the infimum taken over these reparameterizations, $d(\cdot)$ is a metric function, and $k$ denotes the $k$-th feature of the sample.
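The Fréchet distance formula itself is missing from this text. The standard textbook definition, which matches the symbols $\gamma$, $\eta$, and $d(\cdot)$ named above, is shown below; treating the feature vectors $u_i$ and $u_j$ as curves parameterized over $[0,1]$ is an assumption of this sketch.

```latex
d_{fd}(u_i,u_j) \;=\; \inf_{\gamma,\eta}\;\max_{t\in[0,1]}\;
d\bigl(u_i(\gamma(t)),\,u_j(\eta(t))\bigr)
```

Here $\gamma$ and $\eta$ range over continuous, non-decreasing maps of $[0,1]$ onto itself, so the distance compares the shapes of two curves rather than only their point-by-point differences.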
Preferably, step S6 specifically includes analyzing the composition of each cluster center by combining the clustering result with the acquired distribution transformer information, that is, analyzing the shares of distributed photovoltaic, storage, and charging facilities and of industrial, agricultural, and commercial users at each cluster center, and taking each cluster center as a typical distribution transformer type to form the typical distribution transformer cloud load library.
The beneficial effects of the invention are as follows. A large volume of distribution transformer load data containing different proportions of distributed photovoltaic, storage, and charging facilities and of industrial, agricultural, and commercial users is selected as samples, and the distribution transformer load library is established with a dual-scale distance metric clustering algorithm based on the distribution transformer feature set. After the raw data are preprocessed, photovoltaic-storage-charging and industrial, agricultural, and commercial characteristic indexes are established for each distribution transformer, and a deep convolutional autoencoder reduces the dimension of the load data to obtain low-dimensional electricity-consumption feature vectors. Clustering is then performed with the dual-scale distance metric clustering algorithm based on the distribution transformer feature set. Finally, the composition of each cluster center is analyzed by combining the clustering result with the distribution transformer information, and each cluster center is taken as a typical distribution transformer type to form the typical distribution transformer cloud load library.
Drawings
Fig. 1 is a flow chart of the present invention.
Fig. 2 is a diagram of the results of a simulation verification of the present invention.
Detailed Description
The technical scheme of the invention is further specifically described below through examples and with reference to the accompanying drawings.
Example: the clustering-based cloud load library establishment method of this embodiment, as shown in Fig. 1, includes the following steps:
Step 1: acquire distribution transformer load data containing different proportions of distributed photovoltaic, storage, and charging facilities and of industrial, agricultural, and commercial users as clustering samples, and clean the data, filling missing values and checking and correcting abnormal data;
Step 2: establish the photovoltaic-storage-charging and industrial, agricultural, and commercial characteristic indexes of each distribution transformer, and reduce the dimension of the load data with a deep convolutional autoencoder to obtain low-dimensional electricity-consumption feature vectors;
Step 3: cluster the characteristic indexes together with the low-dimensional electricity-consumption feature vectors using the dual-scale distance metric clustering algorithm based on the distribution transformer feature set, analyze the composition of each distribution transformer cluster, and thereby establish an accurate typical distribution transformer cloud load library.
Step 1 cleans the acquired distribution transformer load data, filling missing values and checking and correcting abnormal data. The specific steps are as follows:
(1.1) Acquire load data of distribution transformers with different proportions of distributed photovoltaic, storage, and charging facilities and of industrial, agricultural, and commercial users.
Using intelligent metering devices, 96-point daily load data of $n$ distribution transformers over $m$ days are acquired to form an $l \times 96$ raw load data matrix $P$, where $l = n \times m$; the distribution transformers are represented by the set $N = \{n_1, n_2, n_3, \ldots, n_n\}$ and the dates by the set $M = \{1, 2, 3, \ldots, m\}$. With $P_{n_i}$ denoting the load matrix of each distribution transformer, the resulting total distribution transformer load matrix $P$ is:
$$P = [P_{n_1}, P_{n_2}, \ldots, P_{n_i}]^{T}_{l \times 96}$$
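As a rough illustration of how the raw matrix $P$ might be assembled, the sketch below stacks 96-point daily curves from $n$ transformers over $m$ days into an $l \times 96$ array with $l = n \times m$. The function name `build_load_matrix`, the input layout `(n, m, 96)`, and the random data are assumptions made for illustration only.

```python
import numpy as np

def build_load_matrix(daily_curves: np.ndarray) -> np.ndarray:
    """Stack per-transformer daily curves into an (n*m) x 96 matrix.

    daily_curves has shape (n, m, 96): n transformers, m days,
    96 samples per day (one every 15 minutes).
    """
    n, m, points = daily_curves.shape
    if points != 96:
        raise ValueError("expected 96 samples per day")
    # Rows are ordered transformer by transformer, then day by day.
    return daily_curves.reshape(n * m, points)

rng = np.random.default_rng(0)
curves = rng.random((3, 5, 96))   # hypothetical: 3 transformers, 5 days
P = build_load_matrix(curves)
print(P.shape)  # (15, 96)
```

Keeping one row per transformer-day means every later step (cleaning, standardization, encoding) can operate row by row.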
(1.2) Data cleaning: fill missing values and check and correct abnormal data.
First, load records with a large amount of missing data are discarded, and records with less severe gaps are filled by Lagrange interpolation. In the filling formula, the corrected value of an abnormal sample point $x_{k,t}$ is computed from $x_{k,t-a}$ and $x_{k,t-b}$, the sample points taken forward and backward respectively, where $a$ and $b$ are the numbers of sample points taken in each direction, typically 4 to 6.
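The filling formula is not reproduced in this text, so the following is only a minimal sketch of Lagrange interpolation over neighboring valid points, not the patent's exact correction rule. The helper names `lagrange_fill` and `fill_missing`, the use of NaN to mark missing samples, and the default of 4 neighbors on each side are assumptions.

```python
import numpy as np

def lagrange_fill(x: np.ndarray, t: int, a: int = 4, b: int = 4) -> float:
    """Estimate x[t] from up to `a` valid points before and `b` after it."""
    before = [i for i in range(t - 1, -1, -1) if not np.isnan(x[i])][:a]
    after = [i for i in range(t + 1, len(x)) if not np.isnan(x[i])][:b]
    idx = np.array(sorted(before + after), dtype=float)
    vals = x[idx.astype(int)]
    est = 0.0
    for j, (xj, yj) in enumerate(zip(idx, vals)):
        others = np.delete(idx, j)
        # Lagrange basis polynomial for node xj, evaluated at position t.
        est += yj * np.prod((t - others) / (xj - others))
    return float(est)

def fill_missing(x: np.ndarray, a: int = 4, b: int = 4) -> np.ndarray:
    """Return a copy of x with every NaN replaced by its interpolated value."""
    filled = x.copy()
    for t in np.flatnonzero(np.isnan(x)):
        filled[t] = lagrange_fill(x, int(t), a, b)
    return filled

series = 2.0 * np.arange(12) + 1.0   # hypothetical linear load trend
series[5] = np.nan                   # simulate one missing sample
print(fill_missing(series)[5])
```

High-degree Lagrange polynomials can oscillate on noisy data, which may be one reason the text limits the window to a handful of points on each side.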
(1.3) Data standardization: eliminate the influence of differing load scales.
After the missing and abnormal data have been handled, the data are standardized with Z-score standardization (StandardScaler) to eliminate the influence of differing load scales on the subsequent neural network training and deep clustering. The standardization formula is:
$$x' = \frac{x - \mu}{\sigma}$$
where $x$ is the cleaned load data, $x'$ is the standardized load data, and $\mu$ and $\sigma$ are the mean and standard deviation of the sample data, respectively.
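The standardization step can be sketched in a few lines. The text does not say whether $\mu$ and $\sigma$ are taken per curve or over the whole data set; the version below standardizes each row (each daily curve) separately, which is an assumption, and the name `z_score_rows` is hypothetical.

```python
import numpy as np

def z_score_rows(P: np.ndarray) -> np.ndarray:
    """Standardize each row to zero mean and unit standard deviation."""
    mu = P.mean(axis=1, keepdims=True)
    sigma = P.std(axis=1, keepdims=True)
    # A constant row has sigma == 0; divide by 1 so it maps to all zeros.
    sigma = np.where(sigma == 0, 1.0, sigma)
    return (P - mu) / sigma

P_demo = np.array([[1.0, 2.0, 3.0],
                   [10.0, 10.0, 10.0]])   # hypothetical cleaned loads
Z = z_score_rows(P_demo)
print(Z)
```

Row-wise scaling keeps the shape of each daily curve while discarding its absolute magnitude, so transformers of different capacity become comparable.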
Step 2 establishes the photovoltaic-storage-charging and industrial, agricultural, and commercial characteristic indexes of each distribution transformer and reduces the dimension of the load data with a deep convolutional autoencoder to obtain low-dimensional electricity-consumption feature vectors. The specific steps are as follows:
(2.1) Establish distribution transformer characteristic indexes according to the distribution transformer type.
First, the distribution transformers in a region are grouped by type. For distribution transformers that include photovoltaic, storage, and charging facilities, the photovoltaic-storage-charging power curves are extracted and the following characteristic indexes are formulated: maximum photovoltaic generation power $P_{pmax}$, photovoltaic generation peak time $T_{peak}$, number of energy storage charge-discharge cycles $T_{sto}$, energy storage capacity $E_{cap}$, and number of charging piles $N_{cha}$. For distribution transformers that serve industrial, agricultural, and commercial users, the following characteristic indexes are formulated: industrial user proportion $L_{ind}$, agricultural user proportion $L_{agr}$, commercial user proportion $L_{bus}$, peak-period load rate $L_{peak}$, valley-period load rate $L_{val}$, maximum load occurrence time $T_{max}$, and minimum load occurrence time $T_{min}$.
The peak-period load rate $L_{peak}$ and the valley-period load rate $L_{val}$ are calculated from $P_{av}$, $P_{av,peak}$, and $P_{av,val}$, which denote the daily average load power, the average load in the peak period, and the average load in the valley period, respectively. The peak period is taken as 8:00-11:00 and 18:00-21:00, and the valley period as 22:00-24:00 and 0:00-6:00.
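The formulas for $L_{peak}$ and $L_{val}$ are missing from this text. A natural reading, adopted here purely as an assumption, is that each rate is the period average divided by the daily average: $L_{peak} = P_{av,peak} / P_{av}$ and $L_{val} = P_{av,val} / P_{av}$. The sketch below applies that reading to a 96-point curve; the helper names are hypothetical.

```python
import numpy as np

def slots(start_hour: int, end_hour: int) -> np.ndarray:
    """Indices of the 15-minute samples in [start_hour, end_hour)."""
    return np.arange(start_hour * 4, end_hour * 4)

# Periods stated in the text: peak 8-11 and 18-21, valley 22-24 and 0-6.
PEAK = np.concatenate([slots(8, 11), slots(18, 21)])
VALLEY = np.concatenate([slots(22, 24), slots(0, 6)])

def peak_valley_rates(curve: np.ndarray) -> tuple[float, float]:
    """Return (L_peak, L_val) under the assumed ratio definition."""
    p_av = curve.mean()                       # daily average load P_av
    l_peak = curve[PEAK].mean() / p_av        # assumed: P_av,peak / P_av
    l_val = curve[VALLEY].mean() / p_av       # assumed: P_av,val / P_av
    return float(l_peak), float(l_val)

curve = np.ones(96)       # hypothetical load curve in arbitrary units
curve[PEAK] = 2.0         # load doubles during the peak periods
print(peak_valley_rates(curve))
```

With these ratios, a value above 1 means the period runs hotter than the daily average and a value below 1 means it runs cooler, which makes the two indexes easy to compare across transformers.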
(2.2) Establish a deep convolutional autoencoder.
Because the distribution transformer load data are high-dimensional and voluminous, a convolutional autoencoder is used for dimension reduction and feature learning. The convolutional autoencoder comprises an encoding process and a decoding process. Let $x = [x_1, x_2, \ldots, x_i]$ be the standardized distribution transformer load data; the encoding process is:
$$h = \sigma(x * \omega + b)$$
where $x \in R^{i \times j}$ represents a time series of length $i$; $*$ denotes the convolution operation; $\omega$ and $b$ are the weights and biases of the convolutional autoencoder network in the encoding stage; $h$ is the convolved data; and $\sigma$ is the activation function, for which ReLU is used.
The decoding process is:
$$x' = \sigma'(h * \omega' + b')$$
where $\omega'$ and $b'$ are the weights and biases of the convolutional autoencoder network in the decoding stage, $x'$ is the reconstruction of the input $x$, and $\sigma'$ is the activation function in the decoder, for which the sigmoid function is used.
The goal of autoencoder training is to minimize the reconstruction error, with the mean squared error between the input $x$ and its reconstruction $x'$ taken as the loss function $L_r$.
$L_r$ is minimized by gradient descent to obtain the optimal network parameters, which completes the construction of the deep convolutional autoencoder. The dimension-reduced hidden-layer vector $h$ of each distribution transformer is taken as the extracted feature and combined with the distribution transformer characteristic indexes to obtain the distribution transformer feature set $U$, which serves as the input to the clustering algorithm in step 3.
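To make the encode, decode, and loss equations concrete, the sketch below runs one forward pass of a single-layer 1-D convolutional autoencoder in NumPy. It is not the patent's network: the kernel length, the stride of 2 used to shorten $h$, the upsampling by repetition in the decoder, and the random untrained weights are all assumptions, and the gradient-descent training loop is omitted.

```python
import numpy as np

def relu(z: np.ndarray) -> np.ndarray:
    return np.maximum(z, 0.0)

def sigmoid(z: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-z))

def encode(x, w, b, stride=2):
    # h = relu(x * w + b); keeping every `stride`-th output shortens h.
    return relu(np.convolve(x, w, mode="same")[::stride] + b)

def decode(h, w_dec, b_dec, stride=2):
    # Upsample by repetition, then x' = sigmoid(h * w' + b').
    up = np.repeat(h, stride)
    return sigmoid(np.convolve(up, w_dec, mode="same") + b_dec)

def reconstruction_loss(x, x_rec):
    # Mean squared error between the input and its reconstruction.
    return float(np.mean((x - x_rec) ** 2))

rng = np.random.default_rng(0)
x = rng.random(96)                       # hypothetical normalized daily curve
w, w_dec = rng.normal(size=5), rng.normal(size=5)   # untrained kernels
h = encode(x, w, b=0.1)
x_rec = decode(h, w_dec, b_dec=0.0)
print(h.shape, x_rec.shape, reconstruction_loss(x, x_rec))
```

In a real implementation the kernels would be learned by minimizing `reconstruction_loss` with a deep learning framework, and `h` would then be concatenated with the characteristic indexes to form one row of $U$.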
Step 3 clusters the distribution transformer feature vectors with the dual-scale distance metric clustering algorithm based on the distribution transformer feature set and analyzes the composition of each distribution transformer cluster, so as to establish an accurate typical distribution transformer load library. The specific steps are as follows:
(3.1) Calculate the density $\rho$ of each point in the input data set.
Let the number of samples in the distribution transformer feature set $U$ be $n$ and the $i$-th sample be $u_i$, an $m$-dimensional vector $u_i = [u_{i,1}, \ldots, u_{i,m}]$. The density $\rho_i$ of each point in $U$ is calculated from the following quantities: $d_{wd}(u_i, u_j)$ is the weighted Euclidean distance between sample points $u_i$ and $u_j$; $w_k$ is the weight of the $k$-th feature; and $\mathrm{MeanDis}(U)$ is the average weighted distance over all samples in the data set $U$.
The point with the maximum density in $U$ is taken as the first cluster center $C_1$, so that the set of cluster centers becomes $C = \{C_1\}$; at the same time, the points in $U$ whose distance from $C_1$ is smaller than $\mathrm{MeanDis}(U)$ are removed.
(3.2) Update the cluster centers and repeat step (3.1).
Calculate the density $\rho_i$ of the remaining data in $U$ and select the sample point with the largest $\rho_i$ as the second cluster center $C_2$, so that the set of cluster centers becomes $C = \{C_1, C_2\}$; at the same time, the points in $U$ whose distance from $C_2$ is smaller than $\mathrm{MeanDis}(U)$ are removed. Step (3.1) is repeated until no data remain in $U$, and the resulting cluster centers $C$ are selected as the initial cluster centers of the K-means algorithm.
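Steps (3.1) and (3.2) can be sketched as a loop that repeatedly picks the densest remaining point and strips its neighborhood. Because the density formula is missing from this text, the sketch assumes that $\rho_i$ counts the remaining points closer to $u_i$ than $\mathrm{MeanDis}(U)$, that $d_{wd}$ is $\sqrt{\sum_k w_k (u_{i,k}-u_{j,k})^2}$, and that $\mathrm{MeanDis}(U)$ is computed once over all distinct pairs. The function names are hypothetical.

```python
import numpy as np

def weighted_distances(U: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Pairwise weighted Euclidean distances (assumed form of d_wd)."""
    diff = U[:, None, :] - U[None, :, :]
    return np.sqrt((w * diff ** 2).sum(axis=-1))

def density_initial_centers(U: np.ndarray, w: np.ndarray):
    """Pick initial K-means centers by repeated density peaks."""
    n = len(U)
    dist = weighted_distances(U, w)
    mean_dis = dist[np.triu_indices(n, k=1)].mean()   # assumed MeanDis(U)
    remaining = np.arange(n)
    centers = []
    while remaining.size > 0:
        sub = dist[np.ix_(remaining, remaining)]
        rho = (sub < mean_dis).sum(axis=1)            # assumed density
        c = remaining[np.argmax(rho)]
        centers.append(c)
        # Drop the center and every point closer to it than MeanDis(U).
        keep = (dist[c, remaining] >= mean_dis) & (remaining != c)
        remaining = remaining[keep]
    return U[centers], mean_dis

U_demo = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]])
centers, mean_dis = density_initial_centers(U_demo, w=np.ones(2))
print(centers)
```

A side effect of this scheme is that the number of clusters is not chosen in advance: it falls out of how many density peaks survive the removal step.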
(3.3) Introduce the dual-scale distance metric.
The distance between samples in the K-means algorithm is calculated with a dual-scale distance metric, defined as:
$$d_{tsd}(u_i, u_j) = \alpha\, d_{wd}(u_i, u_j) + \beta\, d_{fd}(u_i, u_j)$$
where $u_i$ and $u_j$ are the samples whose distance is to be calculated; $d_{wd}(u_i, u_j)$ is the weighted Euclidean distance between $u_i$ and $u_j$; $\alpha$ and $\beta$ are the weight coefficients of the two distance measures; and $d_{fd}(u_i, u_j)$ is the Fréchet distance between $u_i$ and $u_j$.
In the Fréchet distance, $\gamma$ and $\eta$ are reparameterization functions of the unit interval, the distance is the infimum taken over these reparameterizations, $d(\cdot)$ is a metric function, and $k$ denotes the $k$-th feature of the sample.
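The combined distance can be sketched as follows. The continuous Fréchet distance is replaced here by the discrete Fréchet distance computed with dynamic programming, which is a swapped-in approximation and not necessarily what the patent uses. Treating each feature vector as a 1-D sequence with $d(\cdot)$ as the absolute difference, and the default values of `alpha` and `beta`, are likewise assumptions.

```python
import numpy as np

def weighted_euclidean(u, v, w) -> float:
    """Assumed form of d_wd: sqrt(sum_k w_k * (u_k - v_k)^2)."""
    return float(np.sqrt(np.sum(w * (u - v) ** 2)))

def discrete_frechet(u, v) -> float:
    """Discrete Frechet distance between two sequences, with d = |.|."""
    n, m = len(u), len(v)
    ca = np.empty((n, m))
    for i in range(n):
        for j in range(m):
            d = abs(u[i] - v[j])
            if i == 0 and j == 0:
                ca[i, j] = d
            elif i == 0:
                ca[i, j] = max(ca[0, j - 1], d)
            elif j == 0:
                ca[i, j] = max(ca[i - 1, 0], d)
            else:
                # Cheapest way to arrive here, then account for this pair.
                best_prev = min(ca[i - 1, j], ca[i, j - 1], ca[i - 1, j - 1])
                ca[i, j] = max(best_prev, d)
    return float(ca[-1, -1])

def dual_scale_distance(u, v, w, alpha=0.5, beta=0.5) -> float:
    """d_tsd = alpha * d_wd + beta * d_fd (alpha, beta are placeholders)."""
    return alpha * weighted_euclidean(u, v, w) + beta * discrete_frechet(u, v)

u = np.array([0.0, 0.0, 0.0])
v = np.array([1.0, 1.0, 1.0])
print(dual_scale_distance(u, v, w=np.ones(3)))
```

The weighted Euclidean term captures differences in magnitude, while the Fréchet term is sensitive to the ordering and shape of the values, which is presumably why the two are blended.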
(3.4) Analyze the composition of each cluster center by combining the clustering result with the acquired distribution transformer information, that is, analyze the shares of distributed photovoltaic, storage, and charging facilities and of industrial, agricultural, and commercial users at each cluster center, and take each cluster center as a typical distribution transformer type to form the typical distribution transformer cloud load library.
To verify that the method yields more accurate clustering results than traditional methods, 3500 typical daily load curves of distribution transformers were selected to test the technical effect of the method; the experimental results are shown in Fig. 2. In this embodiment, comparison tests were run with several different methods and with the present method, and the test results were compared to verify the actual effect of the method. Three clustering indices were selected for quantitative analysis: DBI (Davies-Bouldin index), CHI (Calinski-Harabasz index), and SC (silhouette coefficient).
The higher the SC value, the better the clustering result; the smaller the DBI value, the better the clustering effect; and the larger the CHI value, the better the clustering effect. The method is compared with three conventional clustering methods: K-means, principal component analysis (PCA) + K-means, and improved deep embedded clustering with local structure preservation (IDEC). The comparison results are shown in the following table:
Table 1. Comparison of clustering results.
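Of the three indices named above, DBI is the simplest to compute directly, so a minimal NumPy sketch is given here to show what "smaller is better" measures. It uses plain Euclidean distance and cluster means, which is the usual textbook definition and may differ from the exact setup of the patent's experiment; the toy data are hypothetical.

```python
import numpy as np

def davies_bouldin(X: np.ndarray, labels: np.ndarray) -> float:
    """Davies-Bouldin index: mean over clusters of the worst
    (scatter_i + scatter_j) / centroid_distance_ij ratio."""
    ids = np.unique(labels)
    cents = np.array([X[labels == k].mean(axis=0) for k in ids])
    # Scatter: mean distance of a cluster's points to its centroid.
    scatter = np.array([
        np.linalg.norm(X[labels == k] - c, axis=1).mean()
        for k, c in zip(ids, cents)
    ])
    k = len(ids)
    worst = np.zeros(k)
    for i in range(k):
        for j in range(k):
            if i != j:
                sep = np.linalg.norm(cents[i] - cents[j])
                worst[i] = max(worst[i], (scatter[i] + scatter[j]) / sep)
    return float(worst.mean())

X_demo = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
good = np.array([0, 0, 1, 1])   # groups points that are close together
bad = np.array([0, 1, 0, 1])    # groups points that are far apart
print(davies_bouldin(X_demo, good), davies_bouldin(X_demo, bad))
```

Compact, well-separated clusters give a small ratio of within-cluster scatter to between-cluster distance, so the sensible labeling scores lower than the mixed one.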

Claims (10)

1. A clustering-based cloud load library establishment method, characterized by comprising the following steps:
S1, acquiring distribution transformer load data as clustering samples;
S2, preprocessing the distribution transformer load data;
S3, establishing photovoltaic-storage-charging and industrial, agricultural, and commercial characteristic indexes for each distribution transformer;
S4, reducing the dimension of the load data to obtain low-dimensional electricity-consumption feature vectors;
S5, clustering the distribution transformer feature vectors;
S6, analyzing the composition of each distribution transformer cluster and establishing a typical distribution transformer cloud load library.
2. The clustering-based cloud load library establishment method according to claim 1, wherein the distribution transformer load data in step S1 include load data of distribution transformers with different proportions of distributed photovoltaic, storage, and charging facilities and of industrial, agricultural, and commercial users.
3. The clustering-based cloud load library establishment method according to claim 1, wherein the preprocessing in step S2 specifically includes cleaning the data, filling missing values, checking and correcting abnormal data, and forming the distribution transformer load matrix $P$ from the preprocessed data.
4. The clustering-based cloud load library establishment method according to claim 1, wherein step S3 specifically includes, according to the distribution transformer type, formulating photovoltaic-storage-charging characteristic indexes for distribution transformers that include photovoltaic, storage, and charging facilities, and formulating industrial, agricultural, and commercial characteristic indexes for distribution transformers that serve industrial, agricultural, and commercial users.
5. The clustering-based cloud load library establishment method according to claim 1, wherein step S4 specifically includes establishing a convolutional autoencoder, reconstructing the data with the distribution transformer load data as input, taking the dimension-reduced hidden-layer vector $h$ of each distribution transformer as the extracted feature, and combining it with the distribution transformer characteristic indexes to obtain the distribution transformer feature set $U$.
6. The clustering-based cloud load library establishment method according to claim 5, wherein clustering the distribution transformer feature vectors in step S5 specifically includes clustering the photovoltaic-storage-charging and industrial, agricultural, and commercial characteristic indexes together with the low-dimensional electricity-consumption feature vectors using the dual-scale distance metric clustering algorithm based on the distribution transformer feature set, as follows:
s3.1, calculating the density rho of each point in the distribution transformer feature set U, and taking the point with the maximum density value as a first clustering center C 1 Removing C 1 Nearby data;
s3.2, repeating the step S3.1 to obtain other clustering centers by adopting the residual data of the distribution transformer characteristic set U until no residual data exists, and taking the clustering center C as an initial clustering center of a K-means algorithm;
s3.3, introducing a double-scale distance measurement to calculate the distance between samples in a K-means algorithm;
and S3.4, clustering the data set U based on the double-scale distance measurement to obtain a clustering result.
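Step S3.4 can be sketched as a K-means loop that accepts precomputed initial centers and a pluggable distance function. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the plain Euclidean distance is only a stand-in for the double-scale distance of step S3.3, and updating each center as the coordinate-wise mean of its members is an assumption, since the claim does not state the update rule.

```python
import math

def kmeans_with_metric(points, init_centers, dist, max_iter=100):
    """Lloyd-style K-means with caller-supplied initial centers and distance.

    Assumption: centers are recomputed as the coordinate-wise mean of members.
    """
    centers = [list(c) for c in init_centers]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center under `dist`.
        new_labels = [
            min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the loop has converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def euclidean(a, b):
    """Stand-in metric; the method would pass the double-scale distance here."""
    return math.dist(a, b)

# Hypothetical two-dimensional feature set with two obvious groups.
U = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [9.0, 9.0], [9.0, 10.0], [10.0, 9.0]]
labels, centers = kmeans_with_metric(U, init_centers=[U[0], U[3]], dist=euclidean)
```

Passing the distance as a parameter keeps the clustering loop independent of how the two distance scales are weighted.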
7. The cloud load library establishment method based on clustering according to claim 6, wherein step S3.1 specifically includes letting the number of samples in the distribution transformer feature set U be n and the i-th sample be $u_i$, where $u_i$ is an m-dimensional vector, i.e. $u_i=[u_{i,1},\dots,u_{i,m}]$, and calculating the density $\rho_i$ of each point in U
where $d_{wd}(u_i,u_j)$ is the weighted Euclidean distance between sample points $u_i$ and $u_j$; $w_k$ is the weight of the k-th feature; and MeanDis(U) is the average weighted distance of all sample elements in the data set U, expressed as:
taking the point with the maximum density in U as the first cluster center $C_1$, so that the set of cluster centers becomes $C=\{C_1\}$, and removing from U the points whose distance to $C_1$ is less than MeanDis(U).
8. The cloud load library establishment method based on clustering according to claim 1, wherein step S3.2 specifically includes calculating the density $\rho_i$ of the data remaining in U, selecting the sample point with the maximum $\rho_i$ as the second cluster center $C_2$, so that the set of cluster centers becomes $C=\{C_1,C_2\}$, and removing from U the points whose distance to $C_2$ is less than MeanDis(U); repeating these steps until no data remain in the data set U; and taking the resulting set of cluster centers C as the initial cluster centers of the K-means algorithm.
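The center selection of steps S3.1 and S3.2 can be sketched as follows. The text here does not reproduce the formulas for the density, the weighted Euclidean distance, or MeanDis(U), so the forms used below are assumptions: the distance is taken as the square root of the weighted sum of squared differences, MeanDis(U) as the mean distance over all pairs, and the density of a point as the number of other points within MeanDis(U) of it. MeanDis(U) is also assumed to be computed once on the full set, and the set must hold at least two samples. All function names are hypothetical.

```python
import math
from itertools import combinations

def weighted_euclidean(a, b, w):
    """Assumed form: sqrt(sum_k w_k * (a_k - b_k)^2)."""
    return math.sqrt(sum(wk * (x - y) ** 2 for wk, x, y in zip(w, a, b)))

def mean_dis(points, w):
    """Assumed form of MeanDis(U): mean weighted distance over all pairs."""
    pairs = list(combinations(points, 2))
    return sum(weighted_euclidean(a, b, w) for a, b in pairs) / len(pairs)

def density(i, points, w, radius):
    """Assumed density: number of other points within `radius` of points[i]."""
    return sum(
        1 for j, q in enumerate(points)
        if j != i and weighted_euclidean(points[i], q, w) <= radius
    )

def initial_centers(U, w):
    """Pick the densest point, drop its neighbours, repeat until U is empty."""
    radius = mean_dis(U, w)  # assumption: computed once on the full set U
    remaining = [list(u) for u in U]
    centers = []
    while remaining:
        best = max(range(len(remaining)),
                   key=lambda i: density(i, remaining, w, radius))
        center = remaining[best]
        centers.append(center)
        # Remove the center itself and every point closer than MeanDis(U).
        remaining = [u for u in remaining
                     if u is not center
                     and weighted_euclidean(u, center, w) >= radius]
    return centers

# Hypothetical feature set with two well-separated groups and equal weights.
U = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [9.0, 9.0], [9.0, 10.0], [10.0, 9.0]]
centers = initial_centers(U, w=[1.0, 1.0])
```

A side effect of this procedure is that the number of clusters is not chosen in advance: it falls out of how many centers survive the removal rounds.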
9. The cloud load library establishment method based on clustering according to claim 7, wherein step S3.3 specifically includes introducing a double-scale distance metric to calculate the distance between samples in the K-means algorithm, the distance based on the double-scale distance metric being calculated as:
$$d_{tsd}(u_i,u_j)=\alpha\,d_{wd}(u_i,u_j)+\beta\,d_{fd}(u_i,u_j)$$
where $u_i$ and $u_j$ are the samples whose distance is to be calculated; $d_{wd}(u_i,u_j)$ is the weighted Euclidean distance between $u_i$ and $u_j$; $\alpha$ and $\beta$ are the weight coefficients of the two distance measures; and $d_{fd}(u_i,u_j)$ is the Fréchet distance between $u_i$ and $u_j$, calculated as follows:
where $\gamma$ and $\eta$ are reparameterization functions of the unit interval, and the distance is the corresponding value taken as the infimum over these reparameterizations; $d(\cdot)$ is a metric function; and k denotes the k-th feature of the sample.
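The double-scale distance can be sketched in a few lines. Two substitutions are made and should be read as assumptions: the continuous Fréchet distance is replaced by the discrete Fréchet distance computed by dynamic programming, and each feature vector is treated as a curve over its feature index k with the absolute difference as the point metric d(). The weighted Euclidean form, the default values of `alpha` and `beta`, and the function names are likewise illustrative.

```python
import math

def weighted_euclidean(a, b, w):
    """Assumed form: sqrt(sum_k w_k * (a_k - b_k)^2)."""
    return math.sqrt(sum(wk * (x - y) ** 2 for wk, x, y in zip(w, a, b)))

def discrete_frechet(a, b):
    """Discrete Fréchet distance between two sequences of scalars.

    ca[i][j] is the smallest achievable maximum gap when the prefixes
    a[:i+1] and b[:j+1] are traversed monotonically in lockstep.
    """
    n, m = len(a), len(b)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = abs(a[i] - b[j])  # assumed point metric d()
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(
                    min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d
                )
    return ca[-1][-1]

def double_scale_distance(a, b, w, alpha=0.5, beta=0.5):
    """d_tsd = alpha * d_wd + beta * d_fd, with the assumed component forms."""
    return alpha * weighted_euclidean(a, b, w) + beta * discrete_frechet(a, b)

# Hypothetical three-feature samples with equal feature weights.
u_i = [0.0, 0.0, 0.0]
u_j = [3.0, 4.0, 0.0]
d = double_scale_distance(u_i, u_j, w=[1.0, 1.0, 1.0])
```

The weighted Euclidean term reflects point-by-point magnitude differences, while the Fréchet term reflects similarity of curve shape, which is presumably why the two are combined through $\alpha$ and $\beta$.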
10. The cloud load library establishment method based on clustering according to claim 1, wherein step S6 specifically includes performing a component analysis of each cluster center by combining the clustering result with the acquired distribution transformer information, analyzing the distributed photovoltaic, storage, and charging components and the industrial, agricultural, and commercial components of each cluster center, and taking each cluster center as a typical distribution transformer type to form a typical distribution transformer cloud load library.
CN202310599457.8A 2023-05-25 2023-05-25 Cloud load library establishment method based on clustering Pending CN117407737A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310599457.8A CN117407737A (en) 2023-05-25 2023-05-25 Cloud load library establishment method based on clustering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310599457.8A CN117407737A (en) 2023-05-25 2023-05-25 Cloud load library establishment method based on clustering

Publications (1)

Publication Number Publication Date
CN117407737A (en) 2024-01-16

Family

ID=89493233

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310599457.8A Pending CN117407737A (en) 2023-05-25 2023-05-25 Cloud load library establishment method based on clustering

Country Status (1)

Country Link
CN (1) CN117407737A (en)

Similar Documents

Publication Publication Date Title
CN111199016B (en) Daily load curve clustering method for improving K-means based on DTW
CN108416695B (en) Power load probability density prediction method, system and medium based on deep learning
CN109376772B (en) Power load combination prediction method based on neural network model
CN111008726B (en) Class picture conversion method in power load prediction
CN112257928A (en) Short-term power load probability prediction method based on CNN and quantile regression
CN112561139A (en) Short-term photovoltaic power generation power prediction method and system
CN118014041B (en) Training method and device for power equipment energy consumption prediction model
CN115577978A (en) Power distribution network target investment decision element weight coefficient measuring and calculating method
CN115345297A (en) Transformer area sample generation method and system based on generative adversarial network
CN113902183B (en) BERT-based non-invasive transformer area charging pile state monitoring and electricity price adjusting method
CN112766537B (en) Short-term electric load prediction method
CN115983738B (en) Method and device for improving gallium nitride preparation efficiency
CN112633565A (en) Photovoltaic power aggregation interval prediction method
CN112348236A (en) Abnormal daily load demand prediction system and method for intelligent power consumption monitoring terminal
CN117407737A (en) Cloud load library establishment method based on clustering
CN116432822A (en) Carbon emission data prediction method, system, equipment and readable storage medium
CN116167465A (en) Solar irradiance prediction method based on multivariate time series ensemble learning
CN114545066A (en) Non-invasive load monitoring model polymerization method and system
CN114861993A (en) Regional photovoltaic power generation prediction method based on federated learning and deep neural network
CN113705885A (en) Power distribution network voltage prediction method and system integrating VMD, XGBoost and optimized TCN
CN112686443A (en) Photovoltaic power generation prediction method based on artificial intelligence
CN114898152B (en) Embedded elastic self-expanding universal learning system
CN117350549B (en) Distribution network voltage risk identification method, device and equipment considering output correlation
CN114118855B (en) CNN-based method for calculating benchmarking values of line loss rate of transformer area
CN116977652B (en) Workpiece surface morphology generation method and device based on multi-mode image generation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination