Disclosure of Invention
To solve the problems in the prior art, the invention provides an XGBoost-based method for predicting the summer peak load of power distribution network lines. The method identifies lines that are unlikely to become heavily loaded or overloaded and keeps only the lines that may become heavily loaded or overloaded in the future as the lines to be predicted, which reduces the number of lines to be predicted and avoids unnecessary calculation. The load prediction model is built on meteorological factors, time factors and spring base load data to improve prediction accuracy.
The technical scheme of the invention is as follows:
An XGBoost-based method for predicting the summer peak load of power distribution network lines comprises the following steps:
(1) Determining a predicted line list and a period to be predicted, and acquiring, for every line in the predicted line list, the maximum allowable current-carrying capacity, the historical current load data of the most recent three to five years, and meteorological data;
(2) Extracting the summer peak load of each line in the predicted line list for the most recent three to five years together with its maximum allowable current-carrying capacity, calculating the summer peak load rate, forming a year-by-year summer peak load rate growth trend curve for each line, and performing cluster analysis on these growth trend curves;
(3) According to the clustering result of the line peak load rate growth trend curves obtained in step (2), removing the lines that have had a low load rate in recent years, a gentle growth trend and a low probability of future heavy load, so that the screened predicted line list retains only the lines with a high probability of future heavy load;
(4) Selecting a plurality of typical lines from the screened predicted line list, performing XGBoost model training, and determining unified algorithm parameters through parameter adjustment and optimization;
(5) Performing load model training and load prediction for all lines in the screened predicted line list using the unified algorithm parameters to obtain the line load value at each moment in the period to be predicted, calculating the maximum load rate of each line during the peak load week, and outputting a prediction result list.
Further, the predicted line list in step (1) includes all power distribution network lines requiring peak prediction; the period to be predicted is the summer peak load week of a given year.
Further, the historical current load data in step (1) includes, for the most recent three to five years, the base load data of one week in spring of each year and the load data of the summer peak load week; the meteorological data includes the daily maximum air temperature, the daily minimum air temperature and the daily maximum wind force.
Further, in step (4), the XGBoost model uses the py-XGBoost framework, and the XGBoost parameters are set as follows: the maximum number of trees n_estimators is set to 100, subsample is set to 1.0, the learning rate is set to 0.1, and all other parameters keep their default values.
Further, in the XGBoost model training of step (4), each load value of the summer peak load week is taken as a sample value, and the corresponding sample feature data includes meteorological feature data, time feature data and spring load feature data.
Further, the meteorological feature data includes the maximum temperature, minimum temperature, temperature difference and wind force at the moment corresponding to each summer load sample value; the time feature data includes the year, month, day, hour, minute and working-day flag of that moment; the spring load feature data includes the spring load peak of the year to which that moment belongs.
Further, in step (5), after the load model of a line to be predicted has been trained with the unified algorithm parameters, the feature values of each prediction moment in the period to be predicted are assembled into prediction input data and the trained model is called to calculate the line load value at each moment in the period to be predicted; the maximum load rate of the line in the prediction period is then obtained and a prediction result list is output.
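As a minimal illustration of the final calculation, the maximum load rate of a line can be obtained by dividing the largest predicted current in the period by the line's maximum allowable current-carrying capacity. The function name and the sample numbers below are hypothetical and are not taken from the invention.

```python
def max_load_rate(predicted_currents, max_allowable_current):
    """Return the peak load rate: peak predicted current / current-carrying capacity."""
    if max_allowable_current <= 0:
        raise ValueError("max_allowable_current must be positive")
    return max(predicted_currents) / max_allowable_current


# Hypothetical predicted currents (A) for one line with a 400 A capacity.
predicted = [212.0, 305.5, 336.0, 298.4]
rate = max_load_rate(predicted, 400.0)
print(f"maximum load rate: {rate:.1%}")
```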
Compared with the prior art, the invention has the following beneficial effects:
1. The XGBoost-based method for predicting the summer peak load of power distribution network lines addresses the large volume of historical line data and the heavy computation required for load prediction. By collecting and summarizing the data and clustering the growth trends of the peak load rate, it identifies and eliminates the lines that are unlikely to become heavily loaded or overloaded and keeps only the lines that may become heavily loaded or overloaded in the future as the lines to be predicted, which reduces the number of lines to be predicted, avoids unnecessary calculation and lowers the prediction workload.
2. In the XGBoost-based method for predicting the summer peak load of power distribution network lines, typical lines are selected from the screened predicted lines for XGBoost modeling and parameter tuning, and the load prediction model is built on meteorological factors, time factors and spring base load data, which improves prediction accuracy.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of the invention.
Referring to Fig. 1, an XGBoost-based method for predicting the summer peak load of power distribution network lines includes the following steps:
(1) Determining a predicted line list and a period to be predicted, and acquiring, for every line in the predicted line list, the maximum allowable current-carrying capacity, the historical current load data of the most recent three to five years, and meteorological data; the predicted line list includes all power distribution network lines requiring peak prediction; the period to be predicted is the summer peak load week of a given year;
in this embodiment, the predicted line list includes 1000 10 kV lines of the power distribution network of a certain city, and the period to be predicted is July 29 to August 2, 2019;
in this embodiment, the spring current load data of 2016 to 2019 (four years in total) and the summer current load data of 2016 to 2018 (three years in total) are extracted; the sampling interval of the current load data is 5 minutes, that is, 288 load data points per day.
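The sampling arithmetic above can be checked with a short sketch that enumerates every 5-minute instant in the period to be predicted. The helper name `time_grid` is hypothetical; only the dates and the 5-minute interval come from the embodiment.

```python
from datetime import datetime, timedelta


def time_grid(start_day, end_day, step_minutes=5):
    """List every sampling instant from 00:00 on start_day through the last step of end_day."""
    step = timedelta(minutes=step_minutes)
    stop = end_day + timedelta(days=1)  # exclusive upper bound
    stamps = []
    t = start_day
    while t < stop:
        stamps.append(t)
        t += step
    return stamps


points_per_day = 24 * 60 // 5
grid = time_grid(datetime(2019, 7, 29), datetime(2019, 8, 2))
print(points_per_day, len(grid))  # 288 points per day, 1440 instants over the 5 days
```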
(2) Extracting the summer peak load of each line in the predicted line list for the most recent three to five years together with its maximum allowable current-carrying capacity, calculating the summer peak load rate, forming a year-by-year summer peak load rate growth trend curve for each line, and performing cluster analysis on these growth trend curves; in this embodiment, the summer peak current load values of each line for 2016 to 2018 are converted into load rates and clustered with K-means, with the number of cluster centers set to 10;
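The following sketch illustrates the idea of step (2) on toy data: each line's yearly peak currents are converted into load rates, and the resulting three-point trend curves are grouped with a plain Lloyd-style K-means. The embodiment does not state which K-means implementation was used, so this pure-Python version, the four sample lines and the choice of k = 2 (instead of 10) are all assumptions made for illustration.

```python
import random


def load_rates(peak_currents, capacity):
    """Convert yearly peak currents into load rates (peak / current-carrying capacity)."""
    return [p / capacity for p in peak_currents]


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: mean of the members; an empty cluster keeps its old center.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical lines: (capacity in A, peak currents for 2016, 2017, 2018).
lines = [
    (400.0, [100.0, 104.0, 108.0]),  # low and flat
    (500.0, [120.0, 125.0, 130.0]),  # low and flat
    (400.0, [240.0, 280.0, 320.0]),  # high and rising
    (300.0, [186.0, 213.0, 243.0]),  # high and rising
]
curves = [load_rates(peaks, cap) for cap, peaks in lines]
labels, centers = kmeans(curves, k=2)
print(labels)
```

On this toy input the two low, flat lines end up in one cluster and the two rising lines in the other; which cluster receives index 0 depends on the random initialization.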
(3) According to the clustering result of the line peak load rate growth trend curves obtained in step (2), removing the lines that have had a low load rate in recent years, a gentle growth trend and a low probability of future heavy load, so that the screened predicted line list retains only the lines with a high probability of future heavy load;
in this embodiment, some lines have incomplete records and may lack data for a given year, so the load rate growth trend curves are clustered only for the 889 lines that have summer historical current load data for all of 2016 to 2018. The clustering result is shown in Fig. 2; the lines are divided into 10 classes. Among them, the load rates of class 1 and class 9 lines increase year by year, so the probability of future heavy load or overload is relatively high; the load rates of class 2, class 3 and class 4 lines are at a low level with a gentle growth trend, and class 6 shows a downward trend, so the probability of future heavy load is low; the annual maximum load rate of class 5 lines is at a high level, so the risk of future heavy load or overload is the greatest; class 7, class 8 and class 10 fluctuate, and the direction of their future trend is difficult to judge. Class 2, class 3, class 4 and class 6 are therefore considered not to need prediction;
the number of lines in each growth trend class is shown in Table 1. Class 2, class 3 and class 4 contain the most lines, 458 in total (about 51.5% of the 889 clustered lines), which indicates that more than half of the lines operate at low load with a gentle growth trend; class 6 contains 66 lines, which indicates that 7.42% of the lines show a declining load rate.
TABLE 1
| Category | Class 1 | Class 2 | Class 3 | Class 4 | Class 5 |
| Number of lines | 43 | 153 | 134 | 171 | 70 |
| Category | Class 6 | Class 7 | Class 8 | Class 9 | Class 10 |
| Number of lines | 66 | 59 | 31 | 101 | 61 |
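The screening arithmetic can be reproduced from the class counts in Table 1. The sketch below only restates figures from the embodiment; the variable names are hypothetical.

```python
# Number of lines per cluster, as listed in Table 1.
class_counts = {1: 43, 2: 153, 3: 134, 4: 171, 5: 70,
                6: 66, 7: 59, 8: 31, 9: 101, 10: 61}
excluded_classes = {2, 3, 4, 6}  # low/flat classes plus the declining class

total = sum(class_counts.values())
low_flat = sum(class_counts[c] for c in (2, 3, 4))
declining = class_counts[6]
removed = sum(class_counts[c] for c in excluded_classes)
retained = total - removed  # clustered lines that remain after screening

print(f"clustered lines: {total}")
print(f"low and flat (classes 2-4): {low_flat} ({low_flat / total:.1%})")
print(f"declining (class 6): {declining} ({declining / total:.2%})")
print(f"removed: {removed}, retained among clustered lines: {retained}")
```

Removing the 524 lines of classes 2, 3, 4 and 6 from the original 1000-line list leaves 476 lines, which appears consistent with the number of lines predicted in step (5) if the lines lacking a complete 2016-2018 history are kept in the list; the embodiment does not state this explicitly.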
(4) Selecting several typical lines from the screened predicted line list, performing XGBoost model training, and determining unified algorithm parameters through parameter tuning; in the XGBoost model training, each load value of the summer peak load week is taken as a sample value, and the corresponding sample feature data includes meteorological feature data, time feature data and spring load feature data;
in this embodiment, the meteorological feature data includes the maximum temperature, minimum temperature, temperature difference and wind force at the moment corresponding to each summer load sample value; the time feature data includes the year, month, day, hour, minute and working-day flag of that moment; the spring load feature data includes the spring load peak of the year to which that moment belongs. The sample feature data specifically comprises the following seven types of data:
(1) maximum temperature T_h, i.e. the highest air temperature of the day;
(2) minimum temperature T_l, i.e. the lowest air temperature of the day;
(3) temperature difference T_f, i.e. the difference between the highest and the lowest air temperature of the day;
(4) wind force W_d, i.e. the maximum wind force of the day;
(5) working-day flag W_k, marked 0 if the day is a working day and 1 otherwise;
(6) current date and time, comprising the current year T_y, current month T_m, current day T_d, current hour T_h and current minute T_min;
(7) spring load peak L_p, i.e. the load peak in spring (March) of the current year;
based on these seven types of data, the sample input at each moment is formed as x_1 = [T_h, T_l, T_f, W_d, W_k, T_y, T_m, T_d, T_h, T_min, L_p], and the load value at the corresponding moment is taken as the output y_1 = L_1. The input data of the training set is X = [x_1, x_2, x_3, ..., x_n], and the output data of the training set is Y = [y_1, y_2, y_3, ..., y_n].
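A minimal sketch of how one 11-element sample input could be assembled from the seven types of data listed above. The function name, argument names and sample values are hypothetical; the element order and the working-day convention (0 for a working day, 1 otherwise) follow the text.

```python
from datetime import datetime


def build_features(ts, t_max, t_min, wind, is_workday, spring_peak):
    """Return the sample input for timestamp ts, in the order used in the text."""
    w_k = 0 if is_workday else 1  # convention from the text: 0 = working day, 1 = otherwise
    return [
        t_max,          # maximum temperature of the day
        t_min,          # minimum temperature of the day
        t_max - t_min,  # temperature difference
        wind,           # maximum wind force of the day
        w_k,            # working-day flag
        ts.year, ts.month, ts.day, ts.hour, ts.minute,
        spring_peak,    # spring load peak of the same year
    ]


# Hypothetical weather and load values for one 5-minute sample.
x1 = build_features(datetime(2018, 7, 30, 14, 35), 36.0, 27.0, 3, True, 310.0)
print(x1)
```

Stacking one such vector per sampling instant gives the training input X, with the measured load values at the same instants forming Y.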
In this embodiment, the XGBoost model uses the py-XGBoost framework, and the XGBoost parameters are set as follows: the maximum number of trees n_estimators is set to 100, subsample is set to 1.0, the learning rate is set to 0.1, and all other parameters keep their default values;
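The settings above appear to correspond to the `n_estimators`, `subsample` and `learning_rate` arguments of the scikit-learn style `XGBRegressor` in the Python `xgboost` package; that mapping, and the use of a regressor wrapper, are assumptions, since the text only names the py-XGBoost framework. The helper `train_line_model` is hypothetical.

```python
# Unified parameters as stated in the embodiment; everything else stays at its default.
XGB_PARAMS = {"n_estimators": 100, "subsample": 1.0, "learning_rate": 0.1}


def train_line_model(X, y, params=XGB_PARAMS):
    """Fit one regressor for one line; requires the xgboost package."""
    # Imported here so the settings can be inspected without xgboost installed.
    from xgboost import XGBRegressor

    model = XGBRegressor(**params)
    model.fit(X, y)  # X: list of feature vectors, y: load values at the same instants
    return model
```

With a trained model, `model.predict(X_future)` would return the load value for each feature vector of the period to be predicted.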
to compare model performance, XGBoost was compared with a BP neural network and an LSTM network, both implemented with the Keras framework. Fig. 3 shows the load curves predicted by the three algorithms for a typical line over the period to be predicted. The comparison shows that XGBoost requires relatively simple parameter settings, computes quickly, fits well and yields a small prediction error.
(5) Performing load model training and load prediction for all lines in the screened predicted line list using the unified algorithm parameters to obtain the line load value at each moment in the period to be predicted, calculating the maximum load rate of each line during the peak load week, and outputting a prediction result list;
in this embodiment, the load curves of the 2019 summer peak load week are predicted for 476 lines to be predicted, and the predicted peak values are compared with the actual load peak values of the lines; for 73.7% of the lines, the prediction error of the peak value is less than 15%;
for a line that is actually heavily loaded (a load rate greater than or equal to 80% counts as heavy load), the heavy-load prediction is considered correct if the prediction error is less than 15% or the predicted result is heavy load. The prediction results for heavily loaded lines are shown in Table 2. In the prediction period, 44 lines were actually heavily loaded, and the heavy-load prediction was correct for 32 of them, i.e. 72.73% of all heavily loaded lines. Class 5 has the highest proportion of heavily loaded lines, 25.71% (18 lines), and its prediction accuracy is 100%. Class 5 lines thus have both the highest probability of heavy load and the best prediction result.
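The correctness criterion for heavily loaded lines can be written as a small predicate. The text does not define how the prediction error is measured, so the relative error of the peak load rate used below is an assumption, and the function name and sample rates are hypothetical; the 80% threshold, the 15% bound and the counts come from the embodiment.

```python
HEAVY_LOAD_THRESHOLD = 0.80  # load rate >= 80% counts as heavy load
MAX_ERROR = 0.15             # prediction error bound from the text


def heavy_load_prediction_correct(actual_rate, predicted_rate):
    """Apply the criterion to a line that was actually heavily loaded."""
    if actual_rate < HEAVY_LOAD_THRESHOLD:
        raise ValueError("criterion applies only to lines that were actually heavily loaded")
    # Assumption: error is the relative deviation of the predicted peak from the actual peak.
    relative_error = abs(predicted_rate - actual_rate) / actual_rate
    return relative_error < MAX_ERROR or predicted_rate >= HEAVY_LOAD_THRESHOLD


# Hypothetical (actual, predicted) peak load rates.
print(heavy_load_prediction_correct(0.85, 0.75))  # error below 15%
print(heavy_load_prediction_correct(1.00, 0.82))  # error above 15% but predicted as heavy load
print(heavy_load_prediction_correct(0.82, 0.60))  # neither condition holds

# Reported figures: 32 correct out of 44 heavily loaded lines; 18 of 70 class 5 lines.
overall_accuracy = round(32 / 44 * 100, 2)
class5_heavy_share = round(18 / 70 * 100, 2)
print(overall_accuracy, class5_heavy_share)
```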
TABLE 2
The foregoing embodiments have been provided for the purpose of illustrating the general principles of the present invention, and are not to be construed as limiting the scope of the invention. It should be noted that any modifications, equivalent substitutions, improvements, etc. made by those skilled in the art without departing from the spirit and principles of the present invention are intended to be included in the scope of the present invention.