AU2021105453A4 - Method for forecasting line loss rate in low-voltage station area based on extreme gradient lifting decision tree - Google Patents

Publication number
AU2021105453A4
Authority
AU
Australia
Prior art keywords
low
station area
voltage
line loss
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU2021105453A
Inventor
Biyun Chen
Jiateng CHEN
Lianqiong Gan
Huiying LAN
Bin Li
Peijie LI
Junchao Liang
Yiran Zeng
Chi Zhang
Yun Zhu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangxi University
Original Assignee
Guangxi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangxi University
Priority to AU2021105453A
Application granted
Publication of AU2021105453A4
Ceased
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01F MAGNETS; INDUCTANCES; TRANSFORMERS; SELECTION OF MATERIALS FOR THEIR MAGNETIC PROPERTIES
    • H01F27/00 Details of transformers or inductances, in general
    • H01F27/34 Special means for preventing or reducing unwanted electric or magnetic effects, e.g. no-load losses, reactive currents, harmonics, oscillations, leakage fields
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00 Circuit arrangements for ac mains or ac distribution networks
    • H02J3/001 Methods to deal with contingencies, e.g. abnormalities, faults or failures
    • H02J3/00125 Transmission line or load transient problems, e.g. overvoltage, resonance or self-excitation of inductive loads
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E40/00 Technologies for an efficient electrical power generation, transmission or distribution
    • Y02E40/70 Smart grids as climate change mitigation technology in the energy generation sector
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P80/00 Climate change mitigation technologies for sector-wide applications
    • Y02P80/10 Efficient use of energy, e.g. using compressed air or pressurized fluid as energy carrier
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Power Engineering (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Water Supply & Treatment (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Public Health (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The invention discloses a method for predicting the line loss rate of a low-voltage station area based on an extreme gradient lifting decision tree, which comprises the following steps: collecting original data of the low-voltage station area and preprocessing it to obtain target data of the low-voltage station area; screening key features by feature engineering based on the target data and constructing a characteristic index system of the low-voltage station area; and constructing a second GS-XGBoost prediction model, predicting the line loss rate of the classified low-voltage station areas with the second GS-XGBoost prediction model, and analyzing and evaluating the prediction results. The method can accurately and quickly calculate the line loss rate of the low-voltage station area, improve the accurate loss reduction capability, realize lean management of line loss, and provide a basis for power supply enterprises to formulate reasonable loss reduction measures.

Description

[Figure 1 (sheet 1/6): flow diagram of steps S1 to S5 of the method]
Method for forecasting line loss rate in low-voltage station area based on extreme
gradient lifting decision tree
TECHNICAL FIELD
The invention belongs to the technical field of distribution network line loss
calculation, and particularly relates to a method for predicting the line loss rate of a low
voltage station area based on an extreme gradient lifting decision tree.
BACKGROUND
With the steady development of the economy and the continuous improvement of living
standards, the power load on the grid keeps increasing. The 10 kV and 0.4 kV grids
account for the largest share of losses: the medium- and low-voltage distribution
network accounts for 55% of the total power loss, and the 10 kV grid alone for 26.28%.
Line losses in station areas are serious, and the line loss problem is becoming more
and more prominent. There are three main reasons for line loss in a low-voltage
distribution station area:
(1) Fixed loss, including resistance loss and excitation loss caused by the windings
and iron core of the transformer; resistance loss in the cable lines of the transmission
network; energy loss in capacitor and reactance equipment deployed in the transmission
network; energy loss in protection devices in the power network; dielectric loss; and
loss in grid metering devices;
(2) Management reasons, mainly meter reading problems and insufficient control of
electricity theft;
(3) Technical reasons, mainly inconsistent business accounting data and inconsistent
household-transformer relationships.
Traditional line loss calculation methods such as the equivalent resistance method,
voltage loss method, average current method and root mean square current method are
widely used in the actual production of power enterprises. However, in actual grid
operation the low-voltage network, as the "hardest hit area" of the grid, comprises a
large number of station areas with serious line aging, varied power supply modes and
irregular load distribution along the line. Line loss calculation therefore hits
bottlenecks, and the traditional methods cannot extract valuable information from
historical data for line loss calculation. The traditional line loss qualification rate
assessment can no longer meet the requirements of lean line loss management. Power
supply enterprises urgently need an effective method to calculate line loss, dynamically
predict a reasonable line loss for each station area, and provide a basis for energy
saving, loss reduction, and the planning and transformation of the power grid.
Therefore, providing a fast and accurate method for calculating line loss in the
station area is an urgent technical problem.
SUMMARY
In view of this, the purpose of the present invention is to provide a method for
predicting the line loss rate in a low-voltage station area based on an extreme gradient
lifting decision tree. The method applies feature engineering and a machine learning
algorithm to the prediction of the line loss rate in the low-voltage station area. With
an accurate line loss prediction model it improves the ability of accurate loss
reduction, realizes lean management of line loss, solves the problems described in the
background art, simplifies the line loss calculation process, and improves calculation
efficiency and accuracy.
To achieve the above purpose, the present invention provides the following scheme:
a prediction method of line loss rate in low-voltage station area based on extreme
gradient lifting decision tree includes:
Collecting original data of low-voltage station area, and preprocessing the original
data of low-voltage station area to obtain target data of low-voltage station area;
Selecting key features through feature engineering based on the target data of the
low-voltage station area, constructing a low-voltage station area characteristic index
system, and classifying the low-voltage station area based on the low-voltage station
area characteristic index system;
Establishing a second GS-XGBoost prediction model, predicting the line loss rate of
the classified low-voltage station area through the second GS-XGBoost prediction model,
and analyzing and evaluating the prediction results.
Preferably, collecting the original data of the low-voltage station area includes
obtaining the cross-sectional area of the main line, the total number of low-voltage
meters, the power supply, the average load rate, the total length of the line, the
distribution capacity and the power factor which reflect the station area and load
characteristics.
Preferably, the preprocessing comprises:
Processing the missing values of the original data of the low-voltage station area
based on a sparse matrix to obtain first data; performing abnormal data detection on the
first data to obtain second data; and extracting characteristic data from the second
data and standardizing the characteristic data to obtain the target data of the
low-voltage station area.
Preferably, screening the key features by feature engineering includes:
The feature index weights of the original data of the low-voltage station area are
evaluated by the F-test filtering method and the mutual information method, and
candidate feature sets are formed and compared by MSE. Each feature set is input into
the first GS-XGBoost prediction model, its mean square error value is calculated, and
the feature set with the smallest mean square error value is selected as the feature
index system of the low-voltage station area.
Preferably, constructing the low-voltage station area characteristic index system
further comprises determining the number of key indexes of the characteristic index
system.
Preferably, classifying the low-voltage station area comprises:
Determining the number of categories to be clustered and the clustering centers by
inputting a low-voltage station area data set into the characteristic index system;
calculating the distance from each low-voltage station area sample to each cluster
center to find the nearest cluster center; and assigning the sample to that nearest
cluster center to complete the classification of the low-voltage station areas.
Preferably, predicting the line loss rate of the low-voltage station area comprises:
Constructing the second GS-XGBoost prediction model based on the first GS-XGBoost
prediction model and the extreme gradient lifting decision tree, and inputting the
low-voltage station area data set into the second GS-XGBoost prediction model to obtain
a line loss rate prediction result.
Preferably, the prediction result is analyzed and evaluated by the mean square error
(MSE), the mean absolute error (MAE) and the root mean square error (RMSE).
The mean square error (MSE) is the mean of the squared errors, that is, the sum of
squared errors that linear regression fitting minimizes as its cost function, averaged
over the samples.
The invention discloses the following technical effects:
The invention discloses a method for predicting the line loss rate of a low-voltage
station area based on an extreme gradient lifting decision tree. Data preprocessing
ensures the rationality of the data and improves data quality. Feature engineering
eliminates redundant features and reduces the burden of data collection. Classifying
the low-voltage station areas gives each class of station area practical and distinct
characteristics. The line loss rate prediction model predicts the line loss rate in
low-voltage station areas, and grid search is combined with it to tune the model, which
greatly improves the prediction accuracy.
According to the method, seven mainstream characteristic factors in the low-voltage
station area are converted into four main factors, so that all data characteristics can be
included, the analysis difficulty can be simplified, and the extraction of key characteristic
indexes of line loss in the low-voltage station area can be realized; By mining the line
loss data in low-voltage station area, the nonlinear relationship between the electrical
characteristic index and the line loss rate is revealed. By analyzing and evaluating the line
loss result data through an accurate line loss rate prediction model, the line loss rate in
low-voltage station area can be calculated accurately and quickly, which provides
theoretical basis and decision support for rapid evaluation, accurate calculation and loss reduction planning of line loss data in low-voltage station area, improves the ability of accurate loss reduction, and realizes lean management of line loss, thus effectively improving the standardization of line loss in low-voltage station area.
BRIEF DESCRIPTION OF THE FIGURES
In order to explain the embodiments of the present invention or the technical scheme
in the prior art more clearly, the drawings needed in the embodiments are briefly
introduced below. Obviously, the drawings in the following description show only some
embodiments of the present invention, and a person of ordinary skill in the art can
obtain other drawings from them without creative labor.
Figure 1 is a flow diagram of a method for predicting the line loss rate in
low-voltage station areas based on an extreme gradient lifting decision tree provided
by the present invention;
Figure 2 is a graph showing filtering results of F-test and mutual information method
in an embodiment of the present invention;
Figure 3 is a line chart representing mean square error values under different feature
numbers in an embodiment of the present invention;
Figure 4 is a structural schematic diagram of a GS-XGBoost line loss prediction
model in an embodiment of the present invention;
Figure 5 is a comparison graph of line loss prediction results in an embodiment of
the present invention;
Figure 6 is a graph of line loss rate prediction results of an extreme gradient boost
decision tree (XGBoost) without parameter adjustment in an embodiment of the present
invention;
Figure 7 is a graph of line loss rate prediction results of an unadjusted random forest
(RF) model in an embodiment of the present invention.
DESCRIPTION OF THE INVENTION
The technical scheme in the embodiments of the present invention will be described
clearly and completely with reference to the drawings in the embodiments of the present
invention. Obviously, the described embodiments are only some of the embodiments of the
present invention, not all of them. Based on the embodiments of the present invention,
all other embodiments obtained by a person of ordinary skill in the art without creative
labor fall within the scope of protection of the present invention.
In order to make the above objects, features and advantages of the present invention
more obvious and easy to understand, the present invention will be further explained in
detail with reference to the drawings and specific embodiments.
As shown in Figure 1, the invention provides a method for predicting the line loss
rate of a low-voltage station area based on an extreme gradient lifting decision tree,
which comprises the following steps:
S1: Collecting the original data of the low-voltage station area;
S2: Preprocessing the original data of the low-voltage station area;
S3: Screening key features by feature engineering, and constructing a feature index
system of the low-voltage station area;
S4: Classifying the low-voltage station area;
S5: Establishing a GS-XGBoost line loss rate prediction model, predicting the line
loss rates of various low-voltage stations, and analyzing and evaluating the prediction
results.
The collection of the original data of the low-voltage station area specifically
includes the following step:
From the line loss management system and the automatic measurement acquisition
system, obtain the line loss rate data and the seven main electrical characteristics
that best reflect the characteristics of the station area and load: the cross-sectional
area of the main line, the total number of low-voltage meters, the power supply, the
average load rate, the total line length, the distribution capacity and the power
factor.
The original data of the low-voltage station area are preprocessed to ensure the
rationality of the data, improve data quality, standardize the data, overcome the weight
differences caused by characteristic indexes of different magnitudes, and facilitate
modeling. The preprocessing specifically comprises the following steps:
(1) A sparse matrix is used to handle missing values, which XGBoost can process
automatically: at each split, the samples with missing values are assigned in turn to
the left subtree and to the right subtree, the loss is calculated for each case, and the
better direction is taken as the splitting direction for missing values, which improves
the sample data set;
(2) Abnormal data detection uses the isolation forest algorithm on continuous data,
and identifies points that are sparsely distributed, lie in low-density regions and are
far from high-density regions as outliers in the station area data;
(3) Feature data are extracted and standardized.
Specifically, the characteristic data are standardized by Z-Score, and the
transformation function is as follows:
$$x' = \frac{x - \mu}{\sigma}$$
where $\mu$ is the average value of the original data and $\sigma$ is the standard
deviation.
After Z-Score standardization the characteristic data are dimensionless values with
zero mean and unit standard deviation, so that the variable values are on the same order
of magnitude.
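As a minimal sketch of the Z-Score step, assuming each feature is held as a plain Python list: the function name `z_score` and the sample values are illustrative and do not come from the patent, and the population standard deviation is an assumed choice.

```python
import statistics

def z_score(values):
    """Standardize a list of numbers to zero mean and unit standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation (assumed choice)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical total line lengths (km) for five station areas.
line_length_km = [1.2, 3.4, 2.2, 5.0, 0.7]
scaled = z_score(line_length_km)
```

The standardized values are centered on zero and may be negative; a min-max scaling would be needed instead if values strictly inside [0, 1] were required.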
Selecting key features by feature engineering and constructing the feature index
system of the low-voltage station area eliminates redundant features and reduces the
burden of data collection. It includes the following steps:
(1) Initially select seven mainstream electrical characteristics that are usually
available and best reflect the station area and load characteristics;
(2) Use the F-test filtering method and the mutual information method to evaluate the
importance of each characteristic index;
(3) Combine different numbers of feature indexes into multiple feature sets, input
each feature set into the GS-XGBoost model, and calculate its corresponding mean square
error (MSE) value. The GS-XGBoost model used here is not the final model; it serves only
to compare the mean square error values obtained with the different feature sets;
(4) Select the feature set that minimizes the mean square error value as the final key
feature index system, and determine the number of key indicators in the final feature
index system.
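Steps (2) to (4) can be sketched as follows. This is only an illustration under stated assumptions: the feature sets are taken to be nested "top-k by score" sets, which the text does not specify, and `evaluate_mse` is a hypothetical stand-in for training and scoring the first GS-XGBoost model. The feature names, scores and MSE values are toy data.

```python
def select_feature_set(scores, evaluate_mse):
    """Pick the top-k feature set with the smallest MSE.

    scores: dict mapping feature name -> importance score (higher is better).
    evaluate_mse: callable taking a list of feature names and returning an MSE;
                  a hypothetical stand-in for fitting and scoring a model.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    best_set, best_mse = None, float("inf")
    for k in range(1, len(ranked) + 1):
        candidate = ranked[:k]          # assumed: nested top-k feature sets
        mse = evaluate_mse(candidate)
        if mse < best_mse:
            best_set, best_mse = candidate, mse
    return best_set, best_mse

# Toy scores and a toy evaluator whose MSE happens to be lowest at four features.
toy_scores = {"a": 9.0, "b": 7.0, "c": 5.0, "d": 4.0, "e": 2.0, "f": 1.0, "g": 0.5}
toy_mse_by_size = {1: 0.30, 2: 0.22, 3: 0.15, 4: 0.11, 5: 0.12, 6: 0.13, 7: 0.14}
best_set, best_mse = select_feature_set(
    toy_scores, lambda feats: toy_mse_by_size[len(feats)]
)
```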
Specifically, the F-test filtering method, also known as the variance homogeneity
test, is a filtering method used to capture the linear relationship between each feature
and the label. Features with a p value less than 0.01 or 0.05 are selected as
significantly linearly correlated features. The F-test filtering method stipulates that
the feature data $X=\{x_1,x_2,\dots,x_n\}$ and the line loss rate
$Y=\{y_1,y_2,\dots,y_n\}$ are two data sets that obey normal distributions, with the
statistic following the distribution $F(n-1,n-1)$. The calculation formula of the F-test
filtering method is as follows:
$$F=\frac{S_x^2}{S_y^2}$$
In the above formula, $S_x^2$ and $S_y^2$ are the corresponding variances, and the
calculation formula is as follows:
$$S_x^2=\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2,\qquad S_y^2=\frac{1}{n-1}\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^2$$
In the above formula, $\bar{x}$ and $\bar{y}$ are the corresponding mean values, and
the calculation formula is as follows:
$$\bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_i,\qquad \bar{y}=\frac{1}{n}\sum_{i=1}^{n}y_i$$
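The variance-ratio statistic can be computed with a short sketch like the one below, assuming the feature and the line loss rate are given as equal-length lists. The function name `variance_ratio_f` and the sample values are illustrative. Note that this follows the variance-ratio form written in the text; feature-selection libraries typically use a regression-based F statistic instead.

```python
import statistics

def variance_ratio_f(x, y):
    """Return F = S_x^2 / S_y^2 using sample variances (n - 1 denominator)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    s2_x = statistics.variance(x)  # sample variance of the feature
    s2_y = statistics.variance(y)  # sample variance of the target
    return s2_x / s2_y

# Illustrative values only.
x = [2.0, 4.0, 6.0, 8.0]
y = [1.0, 2.0, 3.0, 4.0]
f_value = variance_ratio_f(x, y)
```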
Furthermore, the mutual information method evaluates the correlation between the
independent variables and the dependent variable by capturing an arbitrary relationship
between each feature and the dependent variable. The value range of MI is [0,1], where 0
means that the two variables are independent of each other and 1 means that the two
variables are completely related. Within (0,1), the greater the value, the more
significant the correlation.
The calculation formula of mutual information is as follows:
$$I(X;Y)=\sum_{x}\sum_{y}P(x,y)\log\frac{P(x,y)}{P(x)P(y)}$$
In the formula, $P(x)$ is the probability of feature value $x$ appearing in the whole
training set, $P(y)$ is the probability of $y$ appearing in the whole training set, and
$P(x,y)$ is their joint probability.
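For discrete (or already binned) variables the mutual information can be estimated from observed frequencies. The sketch below is a minimal illustration under that assumption: continuous features such as line length would first need to be binned or handled by a dedicated estimator, and the function name is hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Estimate I(X;Y) in nats from paired discrete observations."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in count_xy.items():
        p_xy = count / n
        p_x = count_x[x] / n
        p_y = count_y[y] / n
        mi += p_xy * math.log(p_xy / (p_x * p_y))
    return mi

# Independent pair: every (x, y) combination is equally likely.
mi_independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
# Fully dependent pair: y is a copy of x.
mi_dependent = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
```

With the natural logarithm the fully dependent binary case gives ln 2; using base-2 logarithms would give exactly 1 for this case.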
The formula for calculating MSE is as follows:
$$\mathrm{MSE}=\frac{1}{n}\sum_{i=1}^{n}\left(y^{(i)}-\hat{y}^{(i)}\right)^2$$
In the formula, $y^{(i)}$ is the true value and $\hat{y}^{(i)}$ is the predicted value.
The smaller the mean square error, the more accurate the prediction result of the model.
In this embodiment, the correlation of the characteristic indexes in the low-voltage
station area is shown in Figure 2. It can be seen from Figure 2 that the F value, MI
value and characteristic score of feature $x_1$, the cross-sectional area of the trunk
line, are 1, 1 and 14.19, which are the largest, indicating that it is a strongly
correlated index. Next, the F value and MI value of feature $x_5$, the total length of
line, are relatively large, and its characteristic score is 7.38, which is relatively
stable. By contrast, the F values and MI values of features $x_2$ and $x_4$ are both 0
and their characteristic scores are the lowest, which indicates that the total number of
low-voltage meters and the average load rate are only weakly correlated with the line
loss rate. The F-test filtering method and the mutual information method therefore give
consistent results, so that the related features can be filtered.
The final feature index system of the low-voltage station area is shown in Figure 3.
It can be seen from Figure 3 that the mean square error value is smallest when the
number of features is 4, which means that the prediction performance is best at this
point, so the optimal number of features is 4.
Classifying the low-voltage station area specifically includes the following steps:
Let the set of sample points in the station area be
$L=\{(X_1,y_1),(X_2,y_2),\dots,(X_n,y_n)\}$, in which each variable is
$X_i=(x_{i,1},x_{i,2},\dots,x_{i,m})$.
Input the data set of the low-voltage station area, select the number of categories
$k$ to be clustered, with $1 < k \le n$, and select $k$ clustering centers
$\{C_1,C_2,\dots,C_k\}$.
Calculate the standardized Euclidean distance between each sample point and each
cluster center, and find the nearest cluster center for each sample point. The
calculation formula is:
$$\mathrm{dis}(X_i,C_j)=\sqrt{\sum_{t=1}^{m}\left(x_{i,t}-c_{j,t}\right)^2}$$
In this formula, $X_i$ represents the i-th sample point, $C_j$ represents the j-th
cluster center ($1\le j\le k$), $x_{i,t}$ represents the t-th feature of the i-th sample
point ($1\le t\le m$), and $c_{j,t}$ represents the t-th feature of the j-th cluster
center.
Compare the distance from each sample point to each cluster center in turn, and
assign each sample point to the cluster of its nearest cluster center, so as to obtain
$k$ clusters $\{S_1,S_2,\dots,S_k\}$.
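The assignment step described above can be sketched as follows, assuming the samples have already been standardized so that plain Euclidean distance is appropriate. The function names and sample points are illustrative. A full K-Means run would alternate this assignment with recomputing each center as the mean of its cluster, which is not shown here.

```python
import math

def nearest_center(sample, centers):
    """Return the index of the cluster center closest to `sample` (Euclidean)."""
    distances = [math.dist(sample, center) for center in centers]
    return distances.index(min(distances))

def assign_clusters(samples, centers):
    """Group sample indices by their nearest cluster center."""
    clusters = {j: [] for j in range(len(centers))}
    for i, sample in enumerate(samples):
        clusters[nearest_center(sample, centers)].append(i)
    return clusters

# Illustrative two-feature samples and two assumed cluster centers.
samples = [(0.0, 0.1), (0.2, -0.1), (2.0, 2.1), (1.9, 1.8)]
centers = [(0.0, 0.0), (2.0, 2.0)]
clusters = assign_clusters(samples, centers)
```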
In this embodiment, K-Means algorithm is used to calculate the clustering center for
the characteristic indexes in the characteristic index system, and the clustering results are
shown in the following Table 1:
Table 1
Characteristic index Distibuiom TYPe et (m)Totalimeiagth (kn) t--f--e (KVA) "
A 61.29-1 37T26.21 19S. 29 0.91 7 71.283 8 153; 192.09 092
To sum up, each class of low-voltage station area has its practical significance,
which shows that the clustering effect is quite good. Line aging, line diameter,
transformer upgrades and similar factors lead to large fluctuations of the line loss
rate, so it is normal for the clustering results to change accordingly.
The GS-XGBoost line loss rate prediction model is constructed to predict the line
loss rates of various low-voltage stations, and grid search is combined to improve the
performance of the model and improve the prediction accuracy, which specifically
includes the following steps:
As shown in Figure 4, the GS-XGBoost prediction model is constructed by combining
grid search, and $X_i$ is taken as the input station-area feature vector to calculate
the final predicted value of the line loss rate of the low-voltage station area. The
calculation formula is as follows:
$$F_m=\beta_0+\beta_1 f_1(X_i)+\beta_2 f_2(X_i)+\cdots+\beta_m f_m(X_i)$$
In which $F_m$ is the final predicted value, $\beta_m$ is the shrinkage coefficient of
the m-th tree, and $f_m(X_i)$ is the predicted value corresponding to the m-th tree.
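The additive form of the prediction can be illustrated with a small sketch in which each tree is represented by an ordinary Python function. The two stumps, their thresholds and the coefficient values below are made up for illustration.

```python
def ensemble_predict(x, trees, shrinkage, base_score=0.0):
    """Compute base_score + sum_k shrinkage[k] * trees[k](x)."""
    if len(trees) != len(shrinkage):
        raise ValueError("need one shrinkage coefficient per tree")
    return base_score + sum(beta * tree(x) for beta, tree in zip(shrinkage, trees))

# Two hypothetical depth-1 trees acting on a two-feature input vector.
def stump_1(x):
    return 1.0 if x[0] > 0.5 else -1.0

def stump_2(x):
    return 0.5 if x[1] > 0.0 else -0.5

prediction = ensemble_predict(
    (0.8, -0.3), [stump_1, stump_2], shrinkage=[0.1, 0.1], base_score=2.0
)
```

Here `base_score` plays the role of the constant term and each entry of `shrinkage` the role of one tree's coefficient.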
In order to prevent over-fitting, a regularization term is added by introducing the
complexity function of the decision tree:
$$\Omega(f_m)=\gamma T+\frac{1}{2}\lambda\sum_{j=1}^{T}w_j^2$$
In the formula, $\gamma$ is the coefficient of the number of leaf nodes, $\lambda$ is
the coefficient of the squared L2 norm, $T$ is the total number of leaf nodes of the
tree, and $w_j$ is the output score value of the j-th leaf node of the tree;
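The formula for constructing the objective function is as follows:
$$\mathrm{Obj}^{(m)}=\sum_{i=1}^{n}L\left(y_i,\hat{y}_i^{(m-1)}+f_m(X_i)\right)+\Omega(f_m)+C$$
In which $\hat{y}_i^{(m-1)}$ is the model prediction retained from the previous round
$m-1$, and $C$ is a constant term.
The formula for optimizing the objective function by second-order Taylor expansion
is as follows:
$$\mathrm{Obj}^{(m)}\approx\sum_{i=1}^{n}\left[L\left(y_i,\hat{y}_i^{(m-1)}\right)+g_i f_m(X_i)+\frac{1}{2}h_i f_m^2(X_i)\right]+\Omega(f_m)+C$$
In the formula, $g_i$ and $h_i$ are the first and second derivatives of the loss
function with respect to $\hat{y}_i^{(m-1)}$ in the m-th round, respectively;
Dropping the terms that do not depend on $f_m$ and grouping the samples by leaf node,
the simplified objective function formula is as follows:
$$\mathrm{Obj}^{(m)}=\sum_{j=1}^{T}\left[G_j w_j+\frac{1}{2}\left(H_j+\lambda\right)w_j^2\right]+\gamma T$$
In which $G_j$ is the sum of the first derivatives of the loss function over the
samples in leaf $j$ in the m-th round, and $H_j$ is the sum of the second derivatives.
With $I_j$ denoting the set of samples assigned to leaf $j$, the formula is as follows:
$$G_j=\sum_{i\in I_j}g_i,\qquad H_j=\sum_{i\in I_j}h_i$$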
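As a short worked step, the simplified objective is a separate quadratic in each leaf score, so it can be minimized leaf by leaf in closed form. The derivation below uses the symbols defined in the text; $w_j^{*}$ is notation introduced here for the optimal leaf score.

```latex
\mathrm{Obj}^{(m)} = \sum_{j=1}^{T}\left[ G_j w_j + \tfrac{1}{2}\,(H_j + \lambda)\, w_j^{2} \right] + \gamma T
\\[4pt]
\frac{\partial\, \mathrm{Obj}^{(m)}}{\partial w_j} = G_j + (H_j + \lambda)\, w_j = 0
\;\Longrightarrow\;
w_j^{*} = -\frac{G_j}{H_j + \lambda}
\\[4pt]
\mathrm{Obj}^{(m)}\big|_{w = w^{*}} = -\frac{1}{2} \sum_{j=1}^{T} \frac{G_j^{2}}{H_j + \lambda} + \gamma T
```

The last expression scores a candidate tree structure: the lower its value, the better the structure.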
When building a decision tree, the following steps are executed cyclically:
(1) Add one tree in each cycle;
(2) At the beginning of each cycle, calculate
$g_i=\partial_{\hat{y}_i^{(m-1)}}L\left(y_i,\hat{y}_i^{(m-1)}\right)$ and
$h_i=\partial^2_{\hat{y}_i^{(m-1)}}L\left(y_i,\hat{y}_i^{(m-1)}\right)$, where
$\hat{y}_i^{(m-1)}$ is the prediction after round $m-1$;
(3) Use the greedy algorithm to grow the tree $f_m$ according to
$\mathrm{Obj}^{(m)}=-\frac{1}{2}\sum_{j=1}^{T}\frac{G_j^2}{H_j+\lambda}+\gamma T$;
(4) Add $f_m(X_i)$ to the model and update the GS-XGBoost line loss prediction
model: $\hat{y}_i^{(m)}=\hat{y}_i^{(m-1)}+\beta_m f_m(X_i)$.
Note that $\beta_m$ is the shrinkage coefficient, that is, the step size. It means
that each step does not perform a complete optimization but leaves room for later
cycles, which lets the model learn better and effectively prevents over-fitting.
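The four-step cycle can be illustrated with a deliberately small, pure-Python sketch. It is not the XGBoost library and not the patent's implementation: it assumes a single input feature, depth-1 trees, the loss L = 0.5 * (y - y_hat)^2 (so g = y_hat - y and h = 1), and one fixed shrinkage coefficient for every round. All names and data are hypothetical.

```python
def fit_stump(xs, g, h, lam=1.0, gamma=0.0):
    """Greedily fit a depth-1 tree on one feature; return it as a function."""
    def score(G, H):
        return G * G / (H + lam)              # leaf contribution G^2 / (H + lambda)

    G_all, H_all = sum(g), sum(h)
    # Fallback: a single leaf with weight w = -G / (H + lambda).
    best_gain, best = 0.0, (None, -G_all / (H_all + lam), None)
    for thr in sorted(set(xs))[:-1]:          # every distinct value except the largest
        G_l = sum(gi for xi, gi in zip(xs, g) if xi <= thr)
        H_l = sum(hi for xi, hi in zip(xs, h) if xi <= thr)
        G_r, H_r = G_all - G_l, H_all - H_l
        gain = 0.5 * (score(G_l, H_l) + score(G_r, H_r) - score(G_all, H_all)) - gamma
        if gain > best_gain:
            best_gain = gain
            best = (thr, -G_l / (H_l + lam), -G_r / (H_r + lam))
    thr, w_left, w_right = best
    if thr is None:
        return lambda x: w_left
    return lambda x: w_left if x <= thr else w_right


def boost(xs, ys, rounds=50, beta=0.3, lam=1.0, gamma=0.0):
    """Run the boosting cycle for the loss L = 0.5 * (y - y_hat)^2."""
    base = sum(ys) / len(ys)                  # constant starting prediction
    preds = [base] * len(ys)
    trees = []
    for _ in range(rounds):                   # (1) one tree per cycle
        g = [p - y for p, y in zip(preds, ys)]    # (2) first derivatives
        h = [1.0] * len(ys)                       #     second derivatives
        tree = fit_stump(xs, g, h, lam, gamma)    # (3) greedy growth
        trees.append(tree)
        preds = [p + beta * tree(x) for p, x in zip(preds, xs)]  # (4) shrunken update

    def model(x):
        return base + sum(beta * t(x) for t in trees)
    return model


# Toy data: the target jumps between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.1, 0.9, 3.0, 3.2, 2.9]
model = boost(xs, ys)
```

The split gain used in `fit_stump` is the reduction of the structure score from step (3) when one leaf is replaced by two, minus the penalty `gamma` for the extra leaf.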
Input the key feature data in the feature index system into the GS-XGBoost line loss
rate prediction model, and output the line loss rate prediction results.
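The grid search part of GS-XGBoost is not spelled out in the text. A minimal sketch of an exhaustive search is given below, assuming the caller supplies a `score_fn` that trains a model with the given hyperparameters and returns a validation error such as MSE. The grid values, the choice of `max_depth` and `learning_rate` as the tuned parameters, and the toy scoring function are illustrative assumptions.

```python
import itertools

def grid_search(param_grid, score_fn):
    """Return (best_params, best_score) minimizing score_fn over the grid."""
    names = sorted(param_grid)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)        # e.g. a cross-validated MSE
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical grid and a toy score with its minimum at depth 3, rate 0.1.
grid = {"max_depth": [2, 3, 4], "learning_rate": [0.05, 0.1, 0.3]}

def toy_score(params):
    return (params["max_depth"] - 3) ** 2 + (params["learning_rate"] - 0.1) ** 2

best_params, best_score = grid_search(grid, toy_score)
```

In practice `score_fn` would be far more expensive than this toy function, since every grid point requires training and validating a model.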
The analysis and evaluation of the line loss rate prediction results uses three
evaluation indexes, namely MSE, MAE and RMSE, to compare the prediction results.
The mean square error (MSE) is the mean of the squared prediction errors, that is, the
sum of squared errors (SSE) that linear regression fitting minimizes as its cost
function, divided by the number of samples. The better the prediction, the closer the
value is to 0. Its calculation formula is as follows:
$$\mathrm{MSE}=\frac{1}{n}\sum_{i=1}^{n}\left(y^{(i)}-\hat{y}^{(i)}\right)^2$$
The mean absolute error (MAE) is calculated as follows:
$$\mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|y^{(i)}-\hat{y}^{(i)}\right|$$
The root mean square error (RMSE) is calculated as follows:
$$\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y^{(i)}-\hat{y}^{(i)}\right)^2}$$
In which $n$ is the number of samples, $y^{(i)}$ is the actual value and
$\hat{y}^{(i)}$ is the predicted value. The smaller the error, the more accurate the
prediction result of the model.
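The three evaluation indexes can be computed directly from paired actual and predicted values, as in the following sketch. The function names and sample numbers are illustrative and are not results from the embodiment.

```python
import math

def mse(y_true, y_pred):
    """Mean square error."""
    return sum((a - p) ** 2 for a, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error, the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))

# Made-up line loss rates (%) for three station areas.
actual = [2.0, 3.0, 5.0]
predicted = [2.5, 3.0, 4.0]
metrics = {
    "MSE": mse(actual, predicted),
    "MAE": mae(actual, predicted),
    "RMSE": rmse(actual, predicted),
}
```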
The loss function of the model takes the mean squared error form, with the expression
as follows:
$$L(y,\hat{y})=\frac{1}{2}\left(y-\hat{y}\right)^2$$
An extreme gradient lifting decision tree for line loss rate prediction is established.
For convenience of explanation, a regression tree is established, and the maximum depth
of the tree is 2.
The data from the characteristic index system are input into the GS-XGBoost line loss
rate prediction model to obtain the line loss rate prediction curve and the comparison
curves of the other models, as shown in Figures 5 to 7. The prediction accuracy is
compared with that of the XGBoost and RF models. The GS-XGBoost model fits the actual
values well. The random forest (RF) model, in turn, predicts noticeably better than the
XGBoost model without parameter adjustment. The prediction accuracy of the GS-XGBoost
model is therefore higher than that of both XGBoost and RF.
The prediction results are analyzed and evaluated as shown in Table 2:
Table 2
Model        MSE(%)   RMSE(%)   MAE(%)
RF           0.1278   3.5748    2.8247
XGBoost      0.1747   4.1793    3.0460
GS-XGBoost   0.1129   3.3597    2.5754
The above table shows that the GS-XGBoost model has the lowest MSE, RMSE and MAE of
the three models.
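The comparison in Table 2 can be checked with a short sketch that picks the best model per index and computes the relative reduction against a baseline. The values are transcribed from Table 2, reading the garbled entries "3,5748", "3,3597" and "0-1129" as 3.5748, 3.3597 and 0.1129, which is an assumption about the original table. The names `table2`, `best_model` and `relative_reduction` are hypothetical.

```python
# Values transcribed from Table 2 (decimal separators assumed where garbled).
table2 = {
    "RF":         {"MSE": 0.1278, "RMSE": 3.5748, "MAE": 2.8247},
    "XGBoost":    {"MSE": 0.1747, "RMSE": 4.1793, "MAE": 3.0460},
    "GS-XGBoost": {"MSE": 0.1129, "RMSE": 3.3597, "MAE": 2.5754},
}

# For each index, the best model is the one with the smallest error.
best_model = {
    metric: min(table2, key=lambda name: table2[name][metric])
    for metric in ("MSE", "RMSE", "MAE")
}


def relative_reduction(metric, baseline, model="GS-XGBoost"):
    """Fraction by which `model` lowers `metric` relative to `baseline`."""
    base = table2[baseline][metric]
    return (base - table2[model][metric]) / base


print(best_model)
for baseline in ("RF", "XGBoost"):
    print(baseline, round(relative_reduction("MSE", baseline), 4))
```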
The above comparison shows that the GS-XGBoost model has higher prediction
performance than both the XGBoost model of the same type and the random forest (RF)
model, which itself performs well in line loss rate prediction. This verifies the feasibility
of the GS-XGBoost model for line loss rate prediction and its excellent prediction performance.
In this embodiment, an ensemble learning algorithm is applied to the prediction of the
line loss rate in the low-voltage station area, and the prediction accuracy is significantly
improved. The process design of feature index construction and feature selection is novel
and reasonable. The method provides a scientific and reasonable basis for formulating loss
reduction plans, thereby improving the line loss management level in low-voltage station
areas, and has strong practicability and generalization ability.
The above embodiments only describe the preferred mode of the invention and do not
limit the scope of the invention. Without departing from the design spirit of the
invention, various modifications and improvements made by those of ordinary skill in the
art to the technical scheme of the invention shall fall within the protection scope
determined by the claims of the invention.

Claims (8)

THE CLAIMS DEFINING THE INVENTION ARE AS FOLLOWS:
1. A method for predicting line loss rate in low-voltage station area based on
extreme gradient lifting decision tree, characterized in that it comprises:
Collecting original data of low-voltage station area, and preprocessing the original
data of low-voltage station area to obtain target data of low-voltage station area;
Selecting key features through feature engineering based on the target data of the
low-voltage station area, constructing a low-voltage station area characteristic index
system, and classifying the low-voltage station area based on the low-voltage station
area characteristic index system;
Establishing a second GS-XGBoost prediction model, predicting the line loss rate of
the classified low-voltage station area through the second GS-XGBoost prediction model,
and analyzing and evaluating the prediction results.
2. The line loss rate prediction method of low-voltage station area based on extreme
gradient lifting decision tree according to claim 1, wherein,
Collecting the original data of the low-voltage station area includes obtaining the
cross-sectional area of the main line, the total number of low-voltage meters, the power
supply quantity, the average load rate, the total length of the line, the distribution
transformer capacity and the power factor which reflect the station area and load
characteristics.
3. The method for predicting line loss rate in low-voltage station area based on
extreme gradient lifting decision tree according to claim 1, wherein,
The preprocessing comprises the following steps:
Processing the missing values of the original data of the low-voltage station area
based on a sparse matrix to obtain first data; performing abnormal data detection on the
first data to obtain second data; extracting characteristic data based on the second data,
and carrying out standardization processing on the characteristic data to obtain target data
of a low-voltage station area.
4. The method for predicting line loss rate in low-voltage station area based on
extreme gradient lifting decision tree according to claim 2, wherein,
Selecting the key features through feature engineering comprises:
The feature index weights of the original data of the low-voltage station area are
evaluated by the F-test filtering method and the mutual information method, and the feature
set is obtained by combining MSE; the feature set is input into the first GS-XGBoost
prediction model, the mean square error value is calculated, and, by comparison, the feature
set with the smallest mean square error value is selected as the feature index system of the
low-voltage station area.
5. The method for predicting line loss rate in low-voltage station area based on
extreme gradient lifting decision tree according to claim 4, wherein,
Constructing the low-voltage station area characteristic index system also includes
determining the number of key indexes of the low-voltage station area characteristic index
system.
6. The method for predicting line loss rate in low-voltage station area based on
extreme gradient lifting decision tree according to claim 1, wherein,
Classifying the low-voltage station area comprises:
Determining the number of categories to be clustered and the clustering centers by
inputting a low-voltage station data set into the low-voltage station characteristic index
system; the cluster center closest to the low-voltage station sample data is obtained by
calculating the distance from the low-voltage station sample data to each cluster center,
and the low-voltage station sample data is assigned to the nearest cluster center to
complete the classification of the low-voltage station area.
7. The method for predicting line loss rate in low-voltage station area based on
extreme gradient lifting decision tree according to claim 1, wherein,
Predicting the line loss rate of the low voltage station area comprises:
Constructing the second GS-XGBoost prediction model based on the first GS-XGBoost
prediction model and the extreme gradient lifting decision tree, and inputting
the low-voltage station data set into the second GS-XGBoost prediction model to obtain a
line loss rate prediction result.
8. The method for predicting line loss rate in low-voltage station area based on
extreme gradient lifting decision tree according to claim 1, wherein,
The forecast result is analyzed and evaluated by means of mean square error MSE,
mean absolute error MAE and root mean square error RMSE;
The mean square error MSE is the mean of the sum of squared errors, the cost function
minimized in linear regression model fitting.
AU2021105453A 2021-08-13 2021-08-13 Method for forecasting line loss rate in low-voltage station area based on extreme gradient lifting decision tree Ceased AU2021105453A4 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2021105453A AU2021105453A4 (en) 2021-08-13 2021-08-13 Method for forecasting line loss rate in low-voltage station area based on extreme gradient lifting decision tree

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2021105453A AU2021105453A4 (en) 2021-08-13 2021-08-13 Method for forecasting line loss rate in low-voltage station area based on extreme gradient lifting decision tree

Publications (1)

Publication Number Publication Date
AU2021105453A4 true AU2021105453A4 (en) 2021-10-14

Family

ID=78007408

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2021105453A Ceased AU2021105453A4 (en) 2021-08-13 2021-08-13 Method for forecasting line loss rate in low-voltage station area based on extreme gradient lifting decision tree

Country Status (1)

Country Link
AU (1) AU2021105453A4 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113723717A (en) * 2021-11-03 2021-11-30 北京清大科越股份有限公司 Method, device, equipment and readable storage medium for predicting short-term load before system day

Legal Events

Date Code Title Description
FGI Letters patent sealed or granted (innovation patent)
MK22 Patent ceased section 143a(d), or expired - non payment of renewal fee or expiry