CN109886465B - Power distribution network load prediction method based on intelligent electric meter user cluster analysis - Google Patents

Publication number
CN109886465B
Authority
CN
China
Prior art keywords
clustering
crow
load
feature
distribution network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910050608.8A
Other languages
Chinese (zh)
Other versions
CN109886465A (en)
Inventor
黄南天
王文婷
蔡国伟
杨冬锋
黄大为
杨德友
孔令国
王燕涛
杨学航
包佳瑞琦
吴银银
张祎祺
李宏伟
陈庆珠
刘宇航
张良
刘博�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jilin Taisite Technology Development Co ltd
Northeast Electric Power University
Economic and Technological Research Institute of State Grid Jilin Electric Power Co Ltd
Original Assignee
Northeast Dianli University
Application filed by Northeast Dianli University
Priority to CN201910050608.8A
Publication of CN109886465A
Application granted
Publication of CN109886465B

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A power distribution network load prediction method based on smart meter user cluster analysis, characterized by comprising the following steps: analyzing the load fluctuation of smart meter users and dividing the 24 hours of a day into 3 periods according to the degree of fluctuation; determining the input feature set of the predictor and analyzing the feature importance of each user under that feature set; describing user differences by the feature importance sets, clustering the users with SDKM so that users whose loads respond similarly to the input features fall into one class, and determining by statistical experiment the optimal clustering result for each intraday period of total distribution network load fluctuation; and selecting a random forest predictor based on ensemble learning and building a rolling prediction model for the optimal clustering result of each fluctuation period. The method addresses the problem that randomly chosen initial cluster centers easily lead to a local optimum, reduces the prediction error of the rolling prediction model, and improves the accuracy of distribution network load prediction based on smart meter users.

Description

Power distribution network load prediction method based on intelligent electric meter user cluster analysis
Technical Field
The invention relates to the technical field of electricity, in particular to a power distribution network load prediction method based on intelligent electric meter user cluster analysis.
Background
Load prediction plays a crucial role in economically optimized scheduling, safe operation, and the accommodation of distributed renewable clean energy. The accuracy of day-ahead load prediction for the distribution network directly affects the economic and safe operation of the distribution system. Compared with traditional load prediction, day-ahead distribution network load prediction is harder: the load is highly volatile, the mix of users in the area to be predicted is complex, and users' electricity-use behavior differs widely. The electricity consumption data collected by smart meters (SM) provide a data basis for analyzing the load characteristics of massive numbers of users. Building on big-data mining, these users can be clustered and a dedicated prediction model built for each class of users; reducing the differences among users within a class further improves prediction accuracy.
At present, smart meter users are clustered mainly by their load curves or statistical load features. Such clustering is dominated by the magnitude of user consumption, so it is hard to guarantee that users in the same class respond similarly to the predictor's input features, and hard to analyze the influence of non-load features on future load. In addition, the influence of the fluctuation of the total load to be predicted in different periods on the optimal number of user clusters in each period has not been analyzed. Finally, the initial cluster centers are chosen randomly, so the clustering easily falls into a local optimum.
Disclosure of Invention
The invention aims to overcome the limitations and defects of the prior art and provide a power distribution network load prediction method based on intelligent electric meter user cluster analysis, which is scientific, reasonable, high in applicability and good in effect.
The purpose of the invention is realized by the following technical scheme: a power distribution network load prediction method based on intelligent electric meter user cluster analysis is characterized by comprising the following steps:
1) analyzing the load fluctuation of the intelligent electric meter user, and dividing 24 hours a day into 3 periods with different fluctuation degrees according to the fluctuation degree
The standard deviation σ is used to characterize the fluctuation of user electricity consumption:
$$\sigma(t) = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\bigl(L_n(t) - \bar{L}(t)\bigr)^2}, \qquad \bar{L}(t) = \frac{1}{N}\sum_{n=1}^{N} L_n(t)$$
where $\sigma(t)$ is the standard deviation at time t; N is the number of SMs, n = 1, 2, …, N; t indexes the sampling times, t = 1, 2, …, 48; and $L_n(t)$ is the load value of the n-th SM at time t;
2) determining an input feature set of a predictor, and analyzing feature importance of different users under the feature set
The mathematical model for RReliefF feature importance analysis is:
$$N_{dL} = N_{dL} + \mathrm{diff}(L, R_i, I_j)\cdot d(i,j)$$
$$N_{dF}(F) = N_{dF}(F) + \mathrm{diff}(F, R_i, I_j)\cdot d(i,j)$$
$$N_{dL\cdot dF}(F) = N_{dL\cdot dF}(F) + \mathrm{diff}(L, R_i, I_j)\cdot \mathrm{diff}(F, R_i, I_j)\cdot d(i,j)$$
$$W(F) = \frac{N_{dL\cdot dF}(F)}{N_{dL}} - \frac{N_{dF}(F) - N_{dL\cdot dF}(F)}{m - N_{dL}}$$
where $R_i$ (i = 1, …, m) is a randomly drawn sample and m is the user-set number of random draws; $I_j$ (j = 1, …, k) are the k nearest neighbor samples of $R_i$; $N_{dL}$ is the weight for different load values L between samples; $N_{dF}(F)$ is the weight for different values of the predictive feature F; $N_{dL\cdot dF}(F)$ is the weight for different load values L together with different values of feature F; $\mathrm{diff}(L, R_i, I_j)$ and $\mathrm{diff}(F, R_i, I_j)$ are the differences between samples $R_i$ and $I_j$ in load value L and in feature F, respectively; and $d(i,j)$ weights the pair according to the distance between $R_i$ and $I_j$. After m draws of $R_i$, the weight W(F) of each feature, i.e. its importance, is obtained;
3) describing user differences by using the feature importance set, carrying out SDKM clustering on the users, classifying the users with similar response degrees of input features into a class, and determining the optimal clustering result of different distribution network total load fluctuation degree periods in the day by adopting a statistical experiment;
① A clustering algorithm extracts the similarities and differences among data by analyzing and mining the whole data set. The clustering model is:
$$J = \sum_{k=1}^{K}\sum_{x_q \in c_k} \lVert x_q - \bar{x}_k \rVert^2$$
where $X = \{x_q\}$, q = 1, 2, …, Q, denotes the Q objects in the data set to be clustered; $c_k$ is the set of data in the k-th class, with K classes in total; $\bar{x}_k$ is the mean of $c_k$; and J is the sum of squared errors over all classes;
② The K-Means algorithm first initializes K cluster centers, then computes the Euclidean distance $\mathrm{Euc}(x_q, v_k)$ from each sample in the set to the K cluster centers and assigns the sample to the class with the smallest distance. The Euclidean distance is:
$$\mathrm{Euc}(x_q, v_k) = \lVert x_q - v_k \rVert_2 = \sqrt{(x_q - v_k)^{\mathsf{T}}(x_q - v_k)}$$
where $v_k$ is the cluster center of $c_k$ and $\mathrm{Euc}(x_q, v_k)$ is the Euclidean distance from a sample to the cluster center $v_k$;
③ The mathematical model of S_Dbw is:
S_Dbw(k)=Scat(k)+Dens_bw(k)
where, for a partition into k clusters, Scat(k) is the average scattering of the clusters and Dens_bw(k) is the inter-cluster density;
④ The initial cluster centers are optimized with the crow search algorithm. M crows move through the space of decision variables of the problem in search of better food positions; the dimension of the decision variables is the dimension of the initial cluster centers, i.e. the cluster number k. The position matrix LOC and the memory matrix MEM of the crows are:
$$LOC = \begin{bmatrix} l_{1,m}^{1} & l_{1,m}^{2} & \cdots & l_{1,m}^{k} \\ l_{2,m}^{1} & l_{2,m}^{2} & \cdots & l_{2,m}^{k} \\ \vdots & \vdots & & \vdots \\ l_{M,m}^{1} & l_{M,m}^{2} & \cdots & l_{M,m}^{k} \end{bmatrix}$$
$$MEM = \begin{bmatrix} me_{1,m}^{1} & me_{1,m}^{2} & \cdots & me_{1,m}^{k} \\ me_{2,m}^{1} & me_{2,m}^{2} & \cdots & me_{2,m}^{k} \\ \vdots & \vdots & & \vdots \\ me_{M,m}^{1} & me_{M,m}^{2} & \cdots & me_{M,m}^{k} \end{bmatrix}$$
where the position of the i-th crow (i = 1, 2, …, M) in the m-th iteration (m = 1, 2, …, MCN) is denoted $l_{i,m}$,
$$l_{i,m} = \bigl[\, l_{i,m}^{1},\ l_{i,m}^{2},\ \ldots,\ l_{i,m}^{k} \,\bigr]$$
and each crow stores the position of its hidden food in the memory vector $me_{i,m}$;
In the m-th iteration, crow j returns to its food position $me_{j,m}$ while crow i follows crow j to find that position; the probability that crow j notices this and moves its food is P. The position of crow i is updated as:
$$l_{i,m+1} = \begin{cases} l_{i,m} + \lambda_i \cdot fl \cdot (me_{j,m} - l_{i,m}), & \lambda_j \ge P \\ \text{a random position}, & \text{otherwise} \end{cases}$$
$$me_{i,m+1} = \begin{cases} l_{i,m+1}, & \mathrm{Fitness}(l_{i,m+1}) \text{ is better than } \mathrm{Fitness}(me_{i,m}) \\ me_{i,m}, & \text{otherwise} \end{cases}$$
where Fitness(·) is the fitness function, $\lambda_i$ and $\lambda_j$ are random numbers uniformly distributed in [0,1], and fl is the flight distance. If the fitness value of the new position is better than that of the original position, the position is updated; otherwise it is not;
4) after clustering, determining the optimal clustering result of different distribution network total load fluctuation degree periods in the day by adopting a statistical experiment
After the new clustering method is confirmed to be feasible, the mean absolute percentage error (MAPE) of the predictor's final prediction is taken as the evaluation index for the optimal cluster number. MAPE is:
$$\mathrm{MAPE} = \frac{1}{N_t}\sum_{n_t=1}^{N_t}\left|\frac{L_r(n_t) - L_p(n_t)}{L_r(n_t)}\right| \times 100\%$$
where $n_t$ indexes the predicted values ($n_t$ = 1, 2, …, $N_t$); $L_r$ is the real load value; and $L_p$ is the predicted load value;
5) selecting a random forest predictor based on ensemble learning, and respectively constructing rolling prediction models for optimal clustering results in different fluctuation degree periods
The random forest prediction model is as follows:
$$\{h(x, \Theta_d),\ d = 1, 2, \ldots, D\}$$
where $h(x, \Theta_d)$ is the d-th decision tree of the random forest; x is the input vector of the decision tree; and the $\Theta_d$ are independent and identically distributed random vectors that represent the random process of drawing the sample data of the d-th tree and growing that tree;
At prediction time, the final prediction $y_p$ is obtained from the outputs of all decision trees in the model:
$$y_p = \frac{1}{D}\sum_{d=1}^{D} y_{pd}$$
where D is the number of trees in the RF and $y_{pd}$ is the prediction of the d-th tree.
In the power distribution network load prediction method based on smart meter user cluster analysis, the load fluctuation of smart meter users is analyzed and the 24 hours of a day are divided into 3 periods according to the degree of fluctuation; the input feature set of the predictor is determined and the feature importance of each user under that feature set is analyzed; user differences are described by the feature importance sets, the users are clustered with SDKM so that users whose loads respond similarly to the input features fall into one class, and a statistical experiment determines the optimal clustering result for each intraday period of total distribution network load fluctuation; a random forest predictor based on ensemble learning is selected, and a rolling prediction model is built for the optimal clustering result of each fluctuation period. On the one hand, the method overcomes the difficulty that traditional clustering methods have in analyzing the influence of non-load features on clustering and, by optimizing the choice of initial cluster centers, reduces the effect of random initialization on the clustering result. On the other hand, it accounts for differences in users' electricity-use fluctuation across periods: the 24 hours are divided into periods of different volatility, each of which is clustered and modeled separately, which markedly improves overall prediction accuracy. The random forest predictor is not affected by the curse of dimensionality, so the rolling prediction error can be effectively reduced by expanding the feature dimension of the historical load. The method is scientific and reasonable, widely applicable, and effective.
Drawings
FIG. 1 is a flowchart of the power distribution network load prediction method based on smart meter user cluster analysis according to an embodiment;
FIG. 2 shows the fluctuation of the total load of smart meter users, together with the corresponding box plot, in the power distribution network load prediction method based on smart meter user cluster analysis according to the embodiment;
FIG. 3 is a statistical chart of the optimal cluster number for each period in the power distribution network load prediction method based on smart meter user cluster analysis according to the embodiment;
FIG. 4 is a box plot of the load prediction errors on working days and non-working days in the power distribution network load prediction method based on smart meter user cluster analysis according to the embodiment.
Detailed Description
The invention is described more fully hereinafter with reference to the accompanying drawings and examples.
Referring to fig. 1, the method for predicting the load of the power distribution network based on the user cluster analysis of the smart meters, disclosed by the invention, comprises the following steps:
step S101, analyzing the load fluctuation of the intelligent electric meter user, and dividing 24 hours a day into 3 periods with different fluctuation degrees according to the fluctuation degree;
step S102, determining a predictor input feature set, and analyzing feature importance of different users under the feature set;
step S103, describing user differences by using the feature importance set, clustering the users by SDKM, classifying the users with similar response degrees of input features into a class, and determining the optimal clustering result of the total load fluctuation degree periods of different distribution networks in the day by adopting a statistical experiment;
and step S104, selecting a random forest predictor based on ensemble learning, and respectively constructing rolling prediction models aiming at the optimal clustering results in different fluctuation degree periods.
In the power distribution network load prediction method based on smart meter user cluster analysis according to the exemplary embodiment of the invention, the load fluctuation of smart meter users is analyzed and the 24 hours of a day are divided into 3 periods according to the degree of fluctuation; the input feature set of the predictor is determined and the feature importance of each user under that feature set is analyzed; user differences are described by the feature importance sets, the users are clustered with SDKM so that users whose loads respond similarly to the input features fall into one class, and a statistical experiment determines the optimal clustering result for each intraday period of total distribution network load fluctuation; a random forest predictor based on ensemble learning is selected, and a rolling prediction model is built for the optimal clustering result of each fluctuation period. On the one hand, the method overcomes the difficulty that traditional clustering methods have in analyzing the influence of non-load features on clustering and, by optimizing the choice of initial cluster centers, reduces the effect of random initialization on the clustering result. On the other hand, it accounts for differences in users' electricity-use fluctuation across periods: the 24 hours are divided into periods of different volatility, each of which is clustered and modeled separately, which markedly improves overall prediction accuracy. The random forest predictor is not affected by the curse of dimensionality, so the rolling prediction error can be effectively reduced by expanding the feature dimension of the historical load.
In step S101, the load fluctuation of the user of the smart meter is analyzed, and 24 hours a day is divided into 3 periods with different fluctuation degrees according to the fluctuation degree.
Under different time periods, the overall load fluctuation of the power distribution network is different; and under different volatility, the optimal clustering number of the smart meter users also needs to be analyzed in a targeted manner. The section analyzes load fluctuation at different time intervals aiming at the data set of the resident intelligent electric meter. As shown in fig. 2, the total electricity consumption of the users in the area to be predicted in 365 days of the year can be divided into 3 periods according to the volatility of the total electricity consumption.
The standard deviation σ is used to characterize the fluctuation of user electricity consumption:
$$\sigma(t) = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\bigl(L_n(t) - \bar{L}(t)\bigr)^2}, \qquad \bar{L}(t) = \frac{1}{N}\sum_{n=1}^{N} L_n(t)$$
where N is the number of SMs (n = 1, 2, …, N); t indexes the sampling times (t = 1, 2, …, 48); and $L_n(t)$ is the load value of the n-th SM at time t.
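The per-slot fluctuation measure can be illustrated with a minimal sketch. It assumes the population form of the standard deviation (division by N) and a hypothetical `loads[n][t]` layout holding the load of meter n at time slot t; the toy data use 4 slots rather than the 48 half-hour slots of the method, and the rule for cutting the day into 3 periods is not shown because the text does not specify its thresholds.

```python
import math

def slot_std(loads):
    """Return sigma(t) for every time slot t, computed across all meters.

    loads[n][t] is the load of meter n at slot t (hypothetical layout).
    Assumption: population standard deviation, i.e. division by N.
    """
    n_meters = len(loads)
    n_slots = len(loads[0])
    sigmas = []
    for t in range(n_slots):
        mean_t = sum(row[t] for row in loads) / n_meters
        var_t = sum((row[t] - mean_t) ** 2 for row in loads) / n_meters
        sigmas.append(math.sqrt(var_t))
    return sigmas

# Toy data: 3 meters, 4 slots.
toy_loads = [
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 4.0, 3.0, 0.0],
    [1.0, 6.0, 3.0, 2.0],
]
sigma = slot_std(toy_loads)
```

Slots in which all meters agree give σ(t) = 0, while slots with diverging loads give larger values; slots with similar σ(t) could then be grouped into fluctuation periods.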
In step S102, a set of predictor input features is determined, and feature importance of different users under the set of features is analyzed:
the feature importance may reflect the degree of correlation between the feature and the predicted objective. And the user characteristic importance sets of different intelligent electric meters are different, so that different user loads are reflected to different response degrees of the input characteristics of the predictor. Therefore, clustering can be carried out according to the user feature importance set so as to classify users with similar responses to features into one class and carry out targeted modeling. Meanwhile, clustering analysis is carried out by adopting the feature importance set, the method is not limited by the feature dimension, and the relation between the multi-data type features and the prediction object can be analyzed.
The mathematical model for RReliefF feature importance analysis is:
$$N_{dL} = N_{dL} + \mathrm{diff}(L, R_i, I_j)\cdot d(i,j)$$
$$N_{dF}(F) = N_{dF}(F) + \mathrm{diff}(F, R_i, I_j)\cdot d(i,j)$$
$$N_{dL\cdot dF}(F) = N_{dL\cdot dF}(F) + \mathrm{diff}(L, R_i, I_j)\cdot \mathrm{diff}(F, R_i, I_j)\cdot d(i,j)$$
$$W(F) = \frac{N_{dL\cdot dF}(F)}{N_{dL}} - \frac{N_{dF}(F) - N_{dL\cdot dF}(F)}{m - N_{dL}}$$
where $R_i$ (i = 1, …, m) is a randomly drawn sample and m is the user-set number of random draws; $I_j$ (j = 1, …, k) are the k nearest neighbor samples of $R_i$; $N_{dL}$ is the weight for different load values L between samples; $N_{dF}(F)$ is the weight for different values of the predictive feature F; $N_{dL\cdot dF}(F)$ is the weight for different load values L together with different values of feature F; $\mathrm{diff}(L, R_i, I_j)$ and $\mathrm{diff}(F, R_i, I_j)$ are the differences between samples $R_i$ and $I_j$ in load value L and in feature F, respectively; and $d(i,j)$ weights the pair according to the distance between $R_i$ and $I_j$. After m draws of $R_i$, the weight W(F) of each feature, i.e. its importance, can be calculated.
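The accumulation of the three weights can be sketched as follows. This is a simplified illustration, not the patented implementation: it assumes equal neighbor weights d(i,j) = 1/k, range-normalized absolute differences for diff(·), and the standard RReliefF final weight formula; the function name `rrelieff` and the toy data are hypothetical.

```python
import random

def rrelieff(X, y, m, k, seed=0):
    """Simplified RReliefF: one importance weight per feature column of X."""
    rng = random.Random(seed)
    n, n_feat = len(X), len(X[0])
    # Value ranges normalise diff() into [0, 1]; constant columns get range 1.
    f_range = [(max(r[f] for r in X) - min(r[f] for r in X)) or 1.0
               for f in range(n_feat)]
    y_range = (max(y) - min(y)) or 1.0

    def diff_f(f, i, j):
        return abs(X[i][f] - X[j][f]) / f_range[f]

    n_dl = 0.0
    n_df = [0.0] * n_feat
    n_dldf = [0.0] * n_feat
    for _ in range(m):
        i = rng.randrange(n)  # randomly drawn sample R_i
        others = [j for j in range(n) if j != i]
        # k nearest neighbours I_j of R_i in normalised feature space
        neighbours = sorted(
            others, key=lambda j: sum(diff_f(f, i, j) for f in range(n_feat))
        )[:k]
        for j in neighbours:
            d_ij = 1.0 / k  # assumption: equal neighbour weights
            diff_l = abs(y[i] - y[j]) / y_range
            n_dl += diff_l * d_ij
            for f in range(n_feat):
                n_df[f] += diff_f(f, i, j) * d_ij
                n_dldf[f] += diff_l * diff_f(f, i, j) * d_ij
    return [n_dldf[f] / n_dl - (n_df[f] - n_dldf[f]) / (m - n_dl)
            for f in range(n_feat)]

# Toy data: feature 0 determines the load, feature 1 is constant.
X = [[i / 9, 1.0] for i in range(10)]
y = [row[0] for row in X]
weights = rrelieff(X, y, m=20, k=3)
```

In this toy case the informative feature receives a positive weight and the constant feature a weight of zero. Running the same routine per user yields the feature importance vector that the clustering step takes as input.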
In step S103, describing user differences by the feature importance sets, performing SDCKM clustering on the users, classifying users with similar input features and response degrees into a class, and determining the optimal clustering result of different distribution network total load fluctuation degree periods in the day by using a statistical experiment:
the clustering algorithm extracts the similarity and difference between data by analyzing and mining the whole data set. The K-means algorithm partitions the data set X to minimize the sum of the squared errors of all classes J:
$$J = \sum_{k=1}^{K}\sum_{x_q \in c_k} \lVert x_q - \bar{x}_k \rVert^2$$
where K is the number of classes, $c_k$ is the set of data in the k-th class, $\bar{x}_k$ is the mean of $c_k$, and the data set is $X = \{x_q\}$, q = 1, 2, …, Q, whose elements $x_q$ are the Q objects to be clustered.
Initializing K clustering centers by a K-Means algorithm; then, calculating Euclidean distances from each sample in the set to K clustering centers, and dividing the samples into the class with the minimum distance index; then, the average value of each class is recalculated, and this average value is taken as a new cluster center. And repeating the steps until the maximum iteration number is reached or J converges. Euclidean distance formula:
$$\mathrm{Euc}(x_q, v_k) = \lVert x_q - v_k \rVert_2 = \sqrt{(x_q - v_k)^{\mathsf{T}}(x_q - v_k)}$$
where $v_k$ is the cluster center of $c_k$.
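The assignment and update loop described above can be sketched in a few lines. This is a plain Lloyd-style K-Means taking the initial centers as an argument, so that an optimized initialization can be passed in; the names `euclid` and `kmeans` and the toy points are illustrative assumptions.

```python
import math

def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def kmeans(points, centers, max_iter=100):
    """Run K-Means from the given initial centers; return labels, centers, J."""
    centers = [list(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assign every sample to its nearest cluster center.
        labels = [min(range(len(centers)), key=lambda k: euclid(p, centers[k]))
                  for p in points]
        # Recompute each center as the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[k])  # keep the center of an empty cluster
        if new_centers == centers:  # converged: centers no longer move
            break
        centers = new_centers
    sse = sum(euclid(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, centers, sse

points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels, centers, sse = kmeans(points, [[0.0, 0.0], [10.0, 10.0]])
```

Because the result depends on the initial centers, the same function can be called repeatedly with different candidate initializations and the resulting partitions compared.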
In the exemplary embodiment of the invention, the clustering results produced while optimizing the initial cluster centers are evaluated with the S_Dbw index. Clustering results are usually evaluated with the Euclidean distance as the index. A clustering result should, as far as possible, minimize the intra-cluster distance and maximize the inter-cluster distance, but the Euclidean distance only measures intra-cluster similarity and ignores the separation between clusters. To remedy this, the S_Dbw distance is introduced as the clustering evaluation index. S_Dbw is the sum of the average scattering of the clusters and the inter-cluster density:
S_Dbw(k)=Scat(k)+Dens_bw(k)
where, for a partition into k clusters, Scat(k) is the average scattering,
$$\mathrm{Scat}(k) = \frac{1}{k}\sum_{i=1}^{k}\frac{\lVert\sigma(s_i)\rVert}{\lVert\sigma(s)\rVert}$$
s denotes the data set, and $\sigma(s_i)$ and $\sigma(s)$ are the standard deviation of the data in the i-th cluster and of the whole data set s, respectively; Dens_bw(k) is the inter-cluster density:
$$\mathrm{Dens\_bw}(k) = \frac{1}{k(k-1)}\sum_{i=1}^{k}\ \sum_{\substack{j=1 \\ j \ne i}}^{k}\frac{\mathrm{dens}(u_{ij})}{\max\{\mathrm{dens}(v_i),\ \mathrm{dens}(v_j)\}}$$
where dens(·) is the average density function of the inter-cluster region; $v_i$ and $v_j$ are the center points of the i-th and j-th clusters; and $u_{ij}$ is the midpoint of the line connecting the two cluster centers. The smaller the S_Dbw value, the better the clustering. Among internal cluster validity indices, S_Dbw is the only one that performs well with respect to monotonicity, noise, density, and inter-cluster distance.
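A compact sketch of the index is given here. It follows the commonly cited Halkidi and Vazirgiannis definition, which is an assumption with respect to this text: σ(·) is taken as the per-dimension variance vector, dens(·) counts the points of the two clusters concerned that lie within an average-scatter radius of a point, and pairs whose denominator is zero are skipped. The function names are illustrative.

```python
import math

def _var_norm(pts):
    """Euclidean norm of the per-dimension (population) variance vector."""
    n = len(pts)
    cols = list(zip(*pts))
    means = [sum(col) / n for col in cols]
    variances = [sum((x - mu) ** 2 for x in col) / n for col, mu in zip(cols, means)]
    return math.sqrt(sum(v * v for v in variances))

def _dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def s_dbw(points, labels):
    """S_Dbw = Scat + Dens_bw for a partition with at least two clusters."""
    ids = sorted(set(labels))
    k = len(ids)
    clusters = [[p for p, lab in zip(points, labels) if lab == c] for c in ids]
    centers = [[sum(col) / len(cl) for col in zip(*cl)] for cl in clusters]
    norms = [_var_norm(cl) for cl in clusters]
    scat = sum(norms) / k / _var_norm(points)
    radius = math.sqrt(sum(norms)) / k  # average scatter used as neighbourhood radius

    def dens(center, pts):
        return sum(1 for p in pts if _dist(p, center) <= radius)

    dens_bw = 0.0
    for i in range(k):
        for j in range(k):
            if i == j:
                continue
            both = clusters[i] + clusters[j]
            mid = [(a + b) / 2 for a, b in zip(centers[i], centers[j])]
            denom = max(dens(centers[i], both), dens(centers[j], both))
            if denom > 0:  # assumption: skip pairs with empty neighbourhoods
                dens_bw += dens(mid, both) / denom
    dens_bw /= k * (k - 1)
    return scat + dens_bw

pts = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
good = s_dbw(pts, [0, 0, 1, 1])  # compact, well-separated clusters
bad = s_dbw(pts, [0, 1, 0, 1])   # clusters that straddle both groups
```

On the toy points the compact partition scores far lower than the mixed one, matching the rule that a smaller S_Dbw indicates a better clustering.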
In an exemplary embodiment of the invention, the selection of the initial cluster centers is optimized with the crow search algorithm. To avoid the influence of randomly chosen initial cluster centers on the clustering result, the exemplary embodiment uses this recent artificial intelligence algorithm to solve for the optimal initial cluster centers.
The initial cluster centers are optimized with the crow search algorithm. M crows move through the space of decision variables of the problem in search of better food positions; the dimension of the decision variables is the dimension of the initial cluster centers, i.e. the cluster number k. The position matrix LOC and the memory matrix MEM of the crows are:
$$LOC = \begin{bmatrix} l_{1,m}^{1} & l_{1,m}^{2} & \cdots & l_{1,m}^{k} \\ l_{2,m}^{1} & l_{2,m}^{2} & \cdots & l_{2,m}^{k} \\ \vdots & \vdots & & \vdots \\ l_{M,m}^{1} & l_{M,m}^{2} & \cdots & l_{M,m}^{k} \end{bmatrix}$$
$$MEM = \begin{bmatrix} me_{1,m}^{1} & me_{1,m}^{2} & \cdots & me_{1,m}^{k} \\ me_{2,m}^{1} & me_{2,m}^{2} & \cdots & me_{2,m}^{k} \\ \vdots & \vdots & & \vdots \\ me_{M,m}^{1} & me_{M,m}^{2} & \cdots & me_{M,m}^{k} \end{bmatrix}$$
where the position of the i-th crow (i = 1, 2, …, M) in the m-th iteration (m = 1, 2, …, MCN) is denoted $l_{i,m}$,
$$l_{i,m} = \bigl[\, l_{i,m}^{1},\ l_{i,m}^{2},\ \ldots,\ l_{i,m}^{k} \,\bigr]$$
and each crow stores the position of its hidden food in the memory vector $me_{i,m}$.
In the m-th iteration, crow j returns to its food position $me_{j,m}$ while crow i follows crow j to find that position; the probability that crow j notices this and moves its food is P. The position of crow i is updated as follows:
$$l_{i,m+1} = \begin{cases} l_{i,m} + \lambda_i \cdot fl \cdot (me_{j,m} - l_{i,m}), & \lambda_j \ge P \\ \text{a random position}, & \text{otherwise} \end{cases}$$
$$me_{i,m+1} = \begin{cases} l_{i,m+1}, & \mathrm{Fitness}(l_{i,m+1}) \text{ is better than } \mathrm{Fitness}(me_{i,m}) \\ me_{i,m}, & \text{otherwise} \end{cases}$$
where Fitness(·) is the fitness function, $\lambda_i$ and $\lambda_j$ are random numbers uniformly distributed in [0,1], and fl is the flight distance. If the new position is feasible and its fitness value is better than that of the original position, the position is updated; otherwise it is not.
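The follow-or-relocate rule and the memory update can be sketched on a generic minimization problem. This is an illustrative crow search loop under stated assumptions, not the patented SDCKM procedure: the fitness here is a simple sphere function, the bounds, default parameters, and names (`crow_search`, `p_aware`) are hypothetical, and infeasible moves are simply discarded. In the method described above, the fitness of a crow would instead be the S_Dbw value of the K-Means result started from the centers that the crow encodes.

```python
import random

def crow_search(fitness, dim, n_crows=10, max_iter=50, fl=2.0, p_aware=0.1,
                lower=-5.0, upper=5.0, seed=0):
    """Minimise `fitness` with a basic crow search; return (best position, value)."""
    rng = random.Random(seed)

    def rand_pos():
        return [rng.uniform(lower, upper) for _ in range(dim)]

    loc = [rand_pos() for _ in range(n_crows)]  # current positions (LOC)
    mem = [list(p) for p in loc]                # remembered food positions (MEM)
    mem_fit = [fitness(p) for p in mem]
    for _ in range(max_iter):
        for i in range(n_crows):
            j = rng.randrange(n_crows)          # crow that crow i follows
            if rng.random() >= p_aware:
                # Crow j is unaware: crow i moves toward crow j's memory.
                lam = rng.random()
                new = [l + lam * fl * (mj - l) for l, mj in zip(loc[i], mem[j])]
            else:
                # Crow j is aware: crow i ends up at a random position.
                new = rand_pos()
            if all(lower <= v <= upper for v in new):  # feasibility check
                loc[i] = new
                f = fitness(new)
                if f < mem_fit[i]:              # memory only ever improves
                    mem[i], mem_fit[i] = list(new), f
    best = min(range(n_crows), key=lambda i: mem_fit[i])
    return mem[best], mem_fit[best]

def sphere(p):
    return sum(v * v for v in p)

best_pos, best_fit = crow_search(sphere, dim=2)
```

Because a memory entry is replaced only by a strictly better position, the best remembered fitness never gets worse from one iteration to the next.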
In the exemplary embodiment of the invention, the global search capability of the crow search algorithm is combined with the local search capability of the K-Means algorithm. The positions and memories of the crows are changed, and the S_Dbw distance is used as the fitness function so that both the inter-cluster density and the average intra-cluster scattering are taken into account when evaluating the quality of each clustering; the optimal initial cluster centers are obtained in the end. The SDCKM clustering procedure is as follows:
1) Initialize the parameters: the crow population size M; the crow positions LOC and memories MEM; the decision-variable dimension, i.e. k initial cluster centers; the maximum number of iterations MCN; the flight distance fl; and the awareness probability P.
2) Substitute the initial cluster centers represented by each crow's memory into the K-Means algorithm to obtain the clustering result based on those initial centers.
3) Calculate the value of the fitness function for the clustering result of step 2):
$$\mathrm{Fitness} = \mathrm{S\_Dbw}(k)$$
where S_Dbw(k) is the S_Dbw index of the clustering result with k clusters.
4) Update the positions.
5) Calculate the fitness function for each crow's updated position, compare the fitness values, and keep the position vector with the smaller fitness value to update the memory.
6) Repeat steps 2), 3) and 4) until the number of iterations reaches MCN, and select the memory position with the smallest fitness value as the optimal initial cluster centers.
7) After the above steps are finished, use the initial cluster centers obtained after the optimization in step 5) as the initial cluster centers of the K-Means algorithm and generate the final clustering scheme.
In the exemplary embodiment of the invention, after the new clustering method is determined to be feasible, the final prediction result MAPE of the predictor is taken as the evaluation index of the optimal clustering number, and the optimal clustering number is determined. Introducing a random forest prediction model, and performing targeted modeling by taking the total load of each type of clustered users as a prediction target; and then summarizing all kinds of user prediction results to obtain a total load prediction conclusion of the residential area power distribution network. The smaller the MAPE, the better the prediction effect under the clustering number. MAPE is:
$$\mathrm{MAPE} = \frac{1}{N_t}\sum_{n_t=1}^{N_t}\left|\frac{L_r(n_t) - L_p(n_t)}{L_r(n_t)}\right| \times 100\%$$
where $n_t$ indexes the predicted values ($n_t$ = 1, 2, …, $N_t$), $L_r$ is the real load value, and $L_p$ is the predicted load value. MAPE measures the overall prediction accuracy of the predictor. As shown in fig. 3, building on the SDCKM clustering method and the intraday time-domain fluctuation characteristics of the load, a statistical analysis is carried out for each period to determine its optimal cluster number. In the experiment, the optimal cluster number for each period is the one, among the different cluster numbers tested, at which the random forest load predictor achieves the highest prediction accuracy.
In step S104, a random forest predictor based on ensemble learning is selected, and a rolling prediction model is respectively constructed for the optimal clustering results in different fluctuation degree periods:
the random forest prediction model is as follows:
$$\{h(x, \Theta_d),\ d = 1, 2, \ldots, D\}$$
where $h(x, \Theta_d)$ is the d-th decision tree of the random forest; x is the input vector of the decision tree; and the $\Theta_d$ are independent and identically distributed random vectors that represent the random process of drawing the sample data of the d-th tree and growing that tree.
At prediction time, the final prediction $y_p$ is obtained from the outputs of all decision trees in the model:
$$y_p = \frac{1}{D}\sum_{d=1}^{D} y_{pd}$$
where D is the number of trees in the RF and $y_{pd}$ is the prediction of the d-th tree.
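The aggregation step can be shown with a minimal sketch. It assumes each trained tree is available as a callable returning a load forecast; the stub trees and the helper names `forest_predict` and `total_load` are hypothetical stand-ins, and training the trees is outside the scope of this illustration. The second helper reflects the summation of per-class forecasts into the total distribution network load described earlier.

```python
def forest_predict(trees, x):
    """Average the outputs of all trees for input vector x (regression forest)."""
    outputs = [tree(x) for tree in trees]
    return sum(outputs) / len(outputs)

def total_load(cluster_forests, cluster_inputs):
    """Sum the per-cluster forest forecasts into one total load forecast."""
    return sum(forest_predict(trees, x)
               for trees, x in zip(cluster_forests, cluster_inputs))

# Hypothetical stand-ins for trained decision trees of two user classes.
trees_a = [lambda x: x[0] + 1.0, lambda x: x[0] + 2.0, lambda x: x[0] + 3.0]
trees_b = [lambda x: 2.0 * x[0], lambda x: 2.0 * x[0] + 2.0]

y_a = forest_predict(trees_a, [10.0])
total = total_load([trees_a, trees_b], [[10.0], [5.0]])
```

With one forest per user class, the class forecasts are produced independently and only combined at the end, so each forest can use the feature set that suits its class.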
In an exemplary embodiment of the invention, MAPE and the root mean square error (RMSE) are used as error metrics, where RMSE is:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N_t}\sum_{n_t=1}^{N_t}\bigl(L_r(n_t) - L_p(n_t)\bigr)^2}$$
where $n_t$ indexes the predicted values ($n_t$ = 1, 2, …, $N_t$), $L_r$ is the real load value, and $L_p$ is the predicted load value.
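Both error metrics are straightforward to compute. The sketch below assumes two equal-length sequences of real and predicted loads with non-zero real values; the function names and the toy numbers are illustrative only.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error, in the same unit as the load."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
mape_value = mape(actual, predicted)
rmse_value = rmse(actual, predicted)
```

MAPE is scale-free and therefore suits comparisons across cluster numbers and periods, whereas RMSE penalizes large absolute deviations more strongly.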
In the exemplary embodiment of the present invention, in consideration of the influence of the difference between the residential electricity usage patterns on the load prediction on the weekday and the non-weekday, the load prediction results of the two date types are randomly extracted and compared, as shown in fig. 4.
Furthermore, the above-described drawings are merely schematic illustrations of processes included in methods according to exemplary embodiments of the present invention, are not limited to the precise structures that have been described above and shown in the drawings, and various modifications and changes may be made without departing from the scope thereof. The scope of the invention is only limited by the appended claims.

Claims (1)

1. A power distribution network load prediction method based on intelligent electric meter user cluster analysis is characterized by comprising the following steps:
1) analyzing the load fluctuation of the intelligent electric meter user, and dividing 24 hours a day into 3 periods with different fluctuation degrees according to the fluctuation degree
The standard deviation σ is used to characterize the fluctuation of user electricity consumption:
$$\sigma(t) = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\bigl(L_n(t) - \bar{L}(t)\bigr)^2}, \qquad \bar{L}(t) = \frac{1}{N}\sum_{n=1}^{N} L_n(t)$$
where $\sigma(t)$ is the standard deviation at time t; N is the number of SMs, n = 1, 2, …, N; t indexes the sampling times, t = 1, 2, …, 48; and $L_n(t)$ is the load value of the n-th SM at time t;
2) determining an input feature set of a predictor, and analyzing feature importance of different users under the feature set
The mathematical model for RReliefF feature importance analysis is:
$$N_{dL} = N_{dL} + \mathrm{diff}(L, R_i, I_j)\cdot d(i,j)$$
$$N_{dF}(F) = N_{dF}(F) + \mathrm{diff}(F, R_i, I_j)\cdot d(i,j)$$
$$N_{dL\cdot dF}(F) = N_{dL\cdot dF}(F) + \mathrm{diff}(L, R_i, I_j)\cdot \mathrm{diff}(F, R_i, I_j)\cdot d(i,j)$$
$$W(F) = \frac{N_{dL\cdot dF}(F)}{N_{dL}} - \frac{N_{dF}(F) - N_{dL\cdot dF}(F)}{m - N_{dL}}$$
where $R_i$ (i = 1, …, m) is a randomly drawn sample and m is the user-set number of random draws; $I_j$ (j = 1, …, k) are the k nearest neighbor samples of $R_i$; $N_{dL}$ is the weight for different load values L between samples; $N_{dF}(F)$ is the weight for different values of the predictive feature F; $N_{dL\cdot dF}(F)$ is the weight for different load values L together with different values of feature F; $\mathrm{diff}(L, R_i, I_j)$ and $\mathrm{diff}(F, R_i, I_j)$ are the differences between samples $R_i$ and $I_j$ in load value L and in feature F, respectively; and $d(i,j)$ weights the pair according to the distance between $R_i$ and $I_j$. After m draws of $R_i$, the weight W(F) of each feature, i.e. its importance, is obtained;
3) describing user differences by using the feature importance set, carrying out SDKM clustering on the users, classifying the users with similar response degrees of input features into a class, and determining the optimal clustering result of different distribution network total load fluctuation degree periods in the day by adopting a statistical experiment;
① first, a clustering algorithm analyzes and mines the whole data set to extract the similarities and differences among the data; the clustering mathematical model is:
Figure FDA0001950616470000021
where X = {x_q}, q = 1, 2, ..., Q, denotes the Q objects in the data set to be clustered; c_k is the data set of the k-th class, with K classes in total;
Figure FDA0001950616470000022
represents the center of c_k; and J is the sum of squared errors over all classes;
② K cluster centers are initialized by the K-Means algorithm; then the Euclidean distance Euc(x_q, v_k) from each sample in the set to the K cluster centers is calculated, and the sample is assigned to the class with the smallest distance; the Euclidean distance formula is:
Figure FDA0001950616470000023
where v_k is the cluster center of c_k, and Euc(x_q, v_k) is the Euclidean distance from each sample to the K cluster centers v_k;
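The assignment step and the sum of squared errors J can be sketched as below. This is a plain K-Means loop written for illustration; the function names are hypothetical, and the initial centers are passed in explicitly because the claim optimizes them separately.

```python
import numpy as np

def assign_to_centers(X, centers):
    """Index of the nearest center (Euclidean distance) for every sample."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def sse(X, centers, labels):
    """J: sum of squared distances of samples to their own cluster center."""
    return float(((X - centers[labels]) ** 2).sum())

def kmeans(X, init_centers, n_iter=100):
    """Alternate assignment and center update until the centers stop moving."""
    X = np.asarray(X, dtype=float)
    centers = np.array(init_centers, dtype=float)
    for _ in range(n_iter):
        labels = assign_to_centers(X, centers)
        new_centers = centers.copy()
        for k in range(len(centers)):
            members = X[labels == k]
            if len(members):            # an empty cluster keeps its old center
                new_centers[k] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    labels = assign_to_centers(X, centers)
    return centers, labels, sse(X, centers, labels)
```

Because the result depends on `init_centers`, a poor starting point can leave J at a local minimum, which is the motivation for optimizing the initial points.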
③ the mathematical model of S_Dbw is:
S_Dbw(k) = Scat(k) + Dens_bw(k)
where Scat(k) is the mean dispersion value of the k-th cluster, and Dens_bw(k) is the intra-cluster density of the k-th cluster;
④ the initial points of the cluster centers are optimized with the crow search algorithm: M crows move in the space of the decision variables of the problem to be solved, searching for better food positions, so the dimension of the decision variables is the dimension of the initial points of the cluster centers, namely the number of clusters k; the position matrix LOC and the memory matrix MEM of the crows are:
Figure FDA0001950616470000024
Figure FDA0001950616470000025
where the position of the i-th crow (i = 1, 2, ..., M) in the m-th iteration (m = 1, 2, ..., MCN) is represented by l_{i,m},
Figure FDA0001950616470000026
and each crow stores the position of its hidden food in the memory vector me_{i,m};
in the m-th iteration, when crow j returns to its food location me_{j,m}, crow i follows crow j and finds that location; the probability that crow j notices this and changes the food location is P, and the position of crow i is updated as follows:
Figure FDA0001950616470000027
Figure FDA0001950616470000028
where Fitness() is the fitness function; λ_i and λ_j are random numbers uniformly distributed in [0, 1]; and fl is the flight distance; if the fitness value of the new position is better than that of the original position, the position is updated, otherwise it is not updated;
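A minimal sketch of the crow search loop follows. Because the update equations appear only as figures, the sketch follows the commonly published form of the algorithm and should be read as an assumption: crow i moves toward the memory of a random crow j when λ_j ≥ P, jumps to a random position otherwise, and a memory is overwritten only when the new position has a better (here, lower) fitness. The names `crow_search`, `p_aware`, `lower`, and `upper` are hypothetical, and `fitness` stands in for whatever clustering-quality measure is used.

```python
import numpy as np

def crow_search(fitness, lower, upper, n_crows=20, max_iter=100,
                fl=2.0, p_aware=0.1, seed=0):
    """Minimise `fitness` over the box [lower, upper]; return (best, value)."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    loc = rng.uniform(lower, upper, size=(n_crows, dim))   # LOC matrix
    mem = loc.copy()                                       # MEM matrix
    mem_fit = np.array([fitness(x) for x in mem])
    for _ in range(max_iter):
        for i in range(n_crows):
            j = rng.integers(n_crows)          # crow i follows crow j
            lam_i, lam_j = rng.random(), rng.random()
            if lam_j >= p_aware:               # crow j does not notice
                new = loc[i] + lam_i * fl * (mem[j] - loc[i])
            else:                              # crow j notices: random move
                new = rng.uniform(lower, upper)
            # Only feasible positions are accepted.
            if np.all(new >= lower) and np.all(new <= upper):
                loc[i] = new
                f_new = fitness(new)
                if f_new < mem_fit[i]:         # keep the better food position
                    mem[i] = new
                    mem_fit[i] = f_new
    best = int(mem_fit.argmin())
    return mem[best].copy(), float(mem_fit[best])
```

For the clustering use case, each position vector would encode candidate initial cluster centers and `fitness` would score the resulting K-Means partition.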
4) after clustering, determining by a statistical experiment the optimal clustering result for each period of distribution network total load fluctuation within the day;
after the new clustering method is confirmed to be feasible, the mean absolute percentage error (MAPE) of the predictor's final prediction result is taken as the evaluation index for the optimal number of clusters; MAPE is:
Figure FDA0001950616470000031
where n_t is the index of the predicted values (n_t = 1, 2, …, N_t); L_r is the actual load value; and L_p is the predicted load value;
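For reference, the usual definition of MAPE, the mean of |L_r − L_p| / L_r expressed as a percentage, can be computed as in the sketch below. It is assumed that the figure uses this standard form; the function name is hypothetical.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error in percent.

    actual: real load values L_r (must be non-zero);
    predicted: predicted load values L_p, same length.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(100.0 * np.mean(np.abs((actual - predicted) / actual)))
```

The candidate number of clusters with the lowest MAPE in a given fluctuation period would then be selected for that period.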
5) selecting a random forest predictor based on ensemble learning, and constructing a rolling prediction model for the optimal clustering result of each fluctuation period;
the random forest prediction model is:
{h(x, Θ_d), d = 1, 2, ..., D}
where h(x, Θ_d) is the d-th decision tree of the random forest; x is the input vector of the decision tree; and the Θ_d are independently distributed, each representing the random process of sampling the data for the d-th decision tree and growing that tree;
at prediction time, the final prediction result y_p is obtained from the outputs of all decision trees in the model:
Figure FDA0001950616470000032
where D is the number of trees in the random forest (RF), and y_pd is the prediction result of the d-th tree.
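The way the D tree outputs combine can be illustrated with a deliberately small ensemble. This sketch is not the predictor of the claim: each "tree" is a depth-1 regression stump trained on a bootstrap sample and one randomly chosen feature, and the final output is assumed to be the plain average of the D per-tree predictions. All function names are hypothetical.

```python
import numpy as np

def fit_stump(X, y, feature):
    """Best single-threshold split on one feature (a depth-1 tree)."""
    best = (np.inf, None, y.mean(), y.mean())
    for thr in np.unique(X[:, feature])[:-1]:   # keep the right side non-empty
        left = X[:, feature] <= thr
        l_mean, r_mean = y[left].mean(), y[~left].mean()
        err = ((y[left] - l_mean) ** 2).sum() + ((y[~left] - r_mean) ** 2).sum()
        if err < best[0]:
            best = (err, thr, l_mean, r_mean)
    _, thr, l_mean, r_mean = best
    return feature, thr, l_mean, r_mean

def predict_stump(stump, X):
    """Prediction y_pd of one stump for every row of X."""
    feature, thr, l_mean, r_mean = stump
    if thr is None:                             # no split was possible
        return np.full(len(X), l_mean)
    return np.where(X[:, feature] <= thr, l_mean, r_mean)

def fit_forest(X, y, n_trees=25, seed=0):
    """Train D stumps, each on a bootstrap sample and a random feature."""
    rng = np.random.default_rng(seed)
    forest = []
    for _ in range(n_trees):
        rows = rng.integers(len(X), size=len(X))    # bootstrap sample
        feature = rng.integers(X.shape[1])          # random feature choice
        forest.append(fit_stump(X[rows], y[rows], feature))
    return forest

def predict_forest(forest, X):
    """y_p: average of the D per-tree predictions."""
    per_tree = np.stack([predict_stump(s, X) for s in forest])
    return per_tree.mean(axis=0)
```

In practice a full random forest library would replace the stumps; only the averaging step is the point of the example.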
CN201910050608.8A 2019-01-20 2019-01-20 Power distribution network load prediction method based on intelligent electric meter user cluster analysis Active CN109886465B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910050608.8A CN109886465B (en) 2019-01-20 2019-01-20 Power distribution network load prediction method based on intelligent electric meter user cluster analysis

Publications (2)

Publication Number Publication Date
CN109886465A CN109886465A (en) 2019-06-14
CN109886465B true CN109886465B (en) 2022-03-18

Family

ID=66926317

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910050608.8A Active CN109886465B (en) 2019-01-20 2019-01-20 Power distribution network load prediction method based on intelligent electric meter user cluster analysis

Country Status (1)

Country Link
CN (1) CN109886465B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111126445A (en) * 2019-11-29 2020-05-08 国网辽宁省电力有限公司经济技术研究院 Multi-step aggregation load prediction method for mass data of intelligent electric meter
CN111222550B (en) * 2019-12-30 2023-04-21 中国电力科学研究院有限公司 User electricity consumption behavior determining method and device
CN111430024B (en) * 2020-01-06 2023-07-11 中南大学 Data decision method and system for classifying disease degree
CN111305899B (en) * 2020-02-25 2021-09-14 大连海事大学 Method for determining removal length of temporary support for construction of subway station arch cover method
CN112001441A (en) * 2020-08-24 2020-11-27 中国石油大学(华东) Power distribution network line loss anomaly detection method based on Kmeans-AHC hybrid clustering algorithm
CN113283674A (en) * 2021-06-25 2021-08-20 上海腾天节能技术有限公司 Baseline load prediction correction method based on user electricity utilization characteristics
CN113949079B (en) * 2021-11-01 2023-08-25 东南大学 Power distribution station user three-phase unbalance prediction optimization method based on deep learning
CN114282173A (en) * 2021-12-24 2022-04-05 广西电网有限责任公司 Large-scale intelligent electric meter accurate judgment calculation optimization method and system
CN114168795B (en) * 2022-02-15 2022-04-19 中航建筑工程有限公司 Building three-dimensional model mapping and storing method and device, electronic equipment and medium
CN114781685B (en) * 2022-03-17 2024-01-09 广西电网有限责任公司 Large user electricity load prediction method and system based on big data mining technology

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105654196A (en) * 2015-12-29 2016-06-08 中国电力科学研究院 Adaptive load prediction selection method based on electric power big data
CN106485262A (en) * 2016-09-09 2017-03-08 国网山西省电力公司晋城供电公司 A kind of bus load Forecasting Methodology
WO2018045642A1 (en) * 2016-09-09 2018-03-15 国网山西省电力公司晋城供电公司 A bus bar load forecasting method
CN107909213A (en) * 2017-11-27 2018-04-13 国网冀北电力有限公司 A kind of new energy load forecasting method and system based on Demand-side group
CN109063911A (en) * 2018-08-03 2018-12-21 天津相和电气科技有限公司 A kind of Load aggregation body regrouping prediction method based on gating cycle unit networks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
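Short-term load forecasting based on fuzzy clustering and random forest; Huang Qingping et al.; Electrical Measurement & Instrumentation (《电测与仪表》); 20171210 (No. 23); 47-52 *
Research on short-term power grid load forecasting methods based on deep learning; Wu Runze et al.; Modern Electric Power (《现代电力》); 20171222 (No. 02); 47-52 *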

Similar Documents

Publication Publication Date Title
CN109886465B (en) Power distribution network load prediction method based on intelligent electric meter user cluster analysis
Wang et al. Hour-ahead photovoltaic generation forecasting method based on machine learning and multi objective optimization algorithm
CN109919353B (en) Distributed photovoltaic prediction method of ARIMA model based on spatial correlation
CN112382352B (en) Method for quickly evaluating structural characteristics of metal organic framework material based on machine learning
CN107730054B (en) Gas load combined prediction method based on support vector regression
CN113537600B (en) Medium-long-term precipitation prediction modeling method for whole-process coupling machine learning
CN113361761A (en) Short-term wind power integration prediction method and system based on error correction
CN111738477A (en) Deep feature combination-based power grid new energy consumption capability prediction method
CN110163444A (en) A kind of water demand prediction method based on GASA-SVR
CN112364560A (en) Intelligent prediction method for working hours of mine rock drilling equipment
CN111882114B (en) Short-time traffic flow prediction model construction method and prediction method
CN116596044A (en) Power generation load prediction model training method and device based on multi-source data
CN106845696B (en) Intelligent optimization water resource configuration method
CN115759389A (en) Day-ahead photovoltaic power prediction method based on weather type similar day combination strategy
CN113379116A (en) Cluster and convolutional neural network-based line loss prediction method for transformer area
CN113570132A (en) Wind power prediction method for space-time meteorological feature extraction and deep learning
CN113706328A (en) Intelligent manufacturing capability maturity evaluation method based on FASSA-BP algorithm
Zhang et al. Load forecasting method based on improved deep learning in cloud computing environment
CN110264010A (en) Novel rural area electric power saturation load forecasting method
CN113762591B (en) Short-term electric quantity prediction method and system based on GRU and multi-core SVM countermeasure learning
CN113989073A (en) Photovoltaic high-proportion distribution network voltage space-time multidimensional evaluation method based on big data mining
CN114862023A (en) Distributed photovoltaic power prediction method and system based on four-dimensional point-by-point meteorological forecast
CN114444760A (en) Industry user electric quantity prediction method based on mode extraction and error adjustment
Zhu et al. The Construction of Minimum Variables Set for Energy Prediction Models of Office Buildings
CN117610707B (en) Urban mass production space utilization prediction method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231026

Address after: No. 4799, Renmin Street, Changchun City, Jilin Province

Patentee after: ECONOMIC TECHNOLOGY RESEARCH INSTITUTE OF STATE GRID JILIN ELECTRIC POWER CO.,LTD.

Patentee after: Jilin Taisite Technology Development Co.,Ltd.

Patentee after: NORTHEAST DIANLI University

Address before: 132012, Changchun Road, Jilin, Jilin, 169

Patentee before: NORTHEAST DIANLI University