CN113869600A - Peak-valley difference medium-and-long-term prediction method based on random forest and secondary correction

Info

Publication number: CN113869600A
Application number: CN202111210827.1A
Authority: CN
Inventors: 黄奇峰; 方凯杰; 左强; 杨世海; 赵梓舒; 黄艺璇; 刘恬畅; 程含渺; 陈铭明; 李波; 陆婋泉; 曹晓冬; 徐雨森; 臧海祥; 孙国强
Original assignee: State Grid Jiangsu Electric Power Co ltd Marketing Service Center; Hohai University HHU
Current assignee: State Grid Jiangsu Electric Power Co ltd Marketing Service Center; Hohai University HHU
Priority date: 2021-10-18
Filing date: 2021-10-18
Publication date: 2021-12-31

Abstract

The invention discloses a medium-and-long-term peak-valley difference prediction method based on random forest and secondary correction, used for evaluating the implementation effect of medium-and-long-term demand response. Historical load data of a plurality of residential users are collected, historical peak-valley differences are calculated, and the multi-source influence factors of the demand-side load peak-valley difference are analyzed. Features are extracted from the multi-source influence factors, and the optimal feature combination is selected by binary feature engineering as the input of a random forest model. A random-forest-based peak-valley difference model is constructed and outputs primary predictions of the monthly and seasonal peak-valley differences. Based on historically collected demand-side load peak-valley difference data, the screened correction factors are taken one by one as input to a Bayesian regression model that fits the user load peak-valley difference, and the primary medium-and-long-term prediction of the seasonal peak-valley difference is corrected according to the fitting result. The invention is of significance for promoting demand-side response and relieving the contradiction between power supply and demand.

Description

Peak-valley difference medium-and-long-term prediction method based on random forest and secondary correction

Technical Field

The invention belongs to the technical field of power systems, and relates to a medium-and-long-term peak-valley difference prediction method based on random forest and secondary correction.

Background

Against the background of 'carbon peaking and carbon neutrality', demand response oriented to flexible, interactive and intelligent power utilization has become a development trend. Residential load is an important component of demand response; it can effectively realize peak clipping and valley filling and promote the reliable and stable operation of the power system. However, the load characteristics of demand-side users are affected by factors such as weather conditions, population growth and economic development, which makes effective medium-and-long-term demand-side evaluation difficult and undermines the reliability of evaluating medium-and-long-term demand response implementation. Research on how to accurately predict the medium-and-long-term load peak-valley difference is therefore of significance for promoting demand-side response and relieving the contradiction between power supply and demand.

Little existing work focuses on peak-valley difference prediction. Existing prediction methods can be classified into statistical models, machine learning models and deep learning models. Traditional statistical methods are easy to implement and need no extra input, but because they consider only historical data their accuracy is often limited. Deep learning methods predict well and have attracted wide attention in recent years, but they are better suited to continuous time series, whereas the monthly and seasonal peak-valley differences are periodic and discontinuous. Traditional machine learning methods, such as the support vector machine and the random forest, compute quickly and generalize well.

The support vector machine maximizes the generalization capability of the learner and computes quickly. When the binary feature combination is optimized with a genetic algorithm, a support vector machine can be used for peak-valley difference prediction, with the fitness function set to the loss between the trained support vector machine's predictions and the actual values. The random forest model is essentially a bagging method: it is robust to overfitting and improves on the performance of its weak learners (decision trees) by combining their predictions through voting or averaging. It resists overfitting and noise, computes quickly and predicts accurately.

Disclosure of Invention

To overcome the defects of the prior art, the invention aims to provide a medium-and-long-term peak-valley difference prediction method based on random forest and secondary correction. The secondary correction improves the prediction precision, the load-side peak-valley difference characteristics are used to improve the accuracy of load peak-valley difference prediction, and more reliable guidance is provided for the operation and scheduling of the power system.

The invention adopts the following technical scheme. The invention provides a medium-and-long-term peak-valley difference prediction method based on random forest and secondary correction, which comprises the following steps:

step 1, collecting historical electricity load data of a plurality of residential user areas within a set number of years, calculating historical peak-valley differences, collecting influence factor data influencing the load peak-valley differences, and taking related influence factors as alternative characteristics;

step 2, extracting the characteristics of the multi-source influence factors, extracting an optimal characteristic combination by adopting binary characteristic engineering, and taking the characteristic combination as the input of the random forest model in the step 3;

step 3, training a random forest algorithm on the training data of the optimal feature combination selected in step 2 to obtain a load peak-valley difference prediction model for demand-side users, and outputting primary medium-and-long-term prediction results of the monthly peak-valley difference and the seasonal peak-valley difference;

step 4, based on the historically collected demand-side user load peak-valley difference data, taking the screened correction factors one by one as input, constructing a Bayesian regression model to fit the user load peak-valley difference, and correcting the primary medium-and-long-term prediction result of the seasonal peak-valley difference according to the fitting result.

Preferably, the historical load data of electricity consumption in step 1 includes daily maximum load data and daily minimum load data; the historical peak-valley difference comprises a daily peak-valley difference, a monthly peak-valley difference and a seasonal peak-valley difference; the influencing factor characteristics include: daily maximum air temperature, daily minimum air temperature, daily average air temperature, air pressure, humidity, rainfall, wind speed, daily average load.

Preferably, step 2 specifically comprises:

step 2.1, calculating the correlation degree between the candidate features and the peak-valley difference in the step 1, and screening n candidate features from high to low according to the correlation degree;

step 2.2, selecting the optimal feature combination as the prediction input by a binary feature combination method.

Preferably, step 2.2 specifically comprises:

step 2.2.1, using binary coding to mark each candidate feature as used or discarded, and screening out the binary feature data set;

step 2.2.2, taking the screened binary feature data set as the input of a genetic algorithm, and searching for the optimal feature combination based on the genetic algorithm.

Preferably, the binary feature data set can be expressed as:

$$[w_1 x_1, w_2 x_2, w_3 x_3, \ldots, w_n x_n]$$

wherein n is the number of candidate features, and the binary code corresponding to the i-th feature $x_i$ is $w_i$, which has two states, 0 and 1: when $w_i = 0$, the feature is not used; when $w_i = 1$, the feature is used.

Preferably, step 3 specifically includes:

step 3.1, taking the random forest as a basis for one-time medium and long-term measurement and calculation;

step 3.2, aiming at the historical load data of the user at the demand side, calculating the load natural growth rate monthly and quarterly according to the time scale measured and calculated by the peak-valley difference based on a trend extrapolation method;

step 3.3, forming data-driven training samples from the historical average peak-valley differences collected month by month and season by season, the natural load growth rate and the screened peak-valley difference influence factors; constructing and training a peak-valley difference prediction model based on random forest; the trained model outputs the medium-and-long-term prediction result of the demand-side user load peak-valley difference.

Preferably, step 3.1 specifically comprises:

step 3.1.1, setting a data set with optimal combination characteristics in the last N years as an original sample, sampling the original sample by using a bootstrap method, generating K data sets as a training set of a decision tree, wherein N is a positive integer and is less than the set number of years in the step 1;

step 3.1.2, if there are M input variables in total, each node randomly selects m of them and determines the optimal split point from these m variables, wherein m is less than M;

step 3.1.3, each decision tree is grown to the maximum possible extent without pruning;

step 3.1.4, taking the average of the outputs of all decision trees as the predicted value.

Preferably, step 4 specifically includes:

step 4.1, constructing a Bayesian ridge regression model based on the historical load peak-valley difference of the user at the demand side and the screened related correction factors;

step 4.2, based on the Bayesian ridge regression model, respectively establishing fitting relations between the seasonal peak-valley difference and the population correction factor and between the seasonal peak-valley difference and the resident consumption level correction factor, and obtaining a population correction fitting curve and a resident consumption level fitting curve;

step 4.3, calculating a correction coefficient from each of the two fitting curves, and applying them in sequence to the demand-side user load peak-valley difference prediction obtained in step 3 to obtain the corrected demand-side user load peak-valley difference.

Preferably, the specific calculation process in the Bayesian ridge regression model is as follows:

wherein

p(w | a, b) is the probability distribution of the parameter w conditioned on a and b;

the loss function is that of ridge regression, which solves for the family of parameters {β_j} that minimizes the fitting error between the i-th output y_i and the inputs x_ij;

the penalty term is L2 regularization.
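For orientation, the textbook ridge-regression objective that matches these definitions is given below. This is the standard form, not necessarily the exact expression in the original filing, whose formula is not reproduced in this text; the regularization weight $\lambda$ is a symbol introduced here for illustration.

```latex
\min_{\{\beta_j\}} \; \sum_{i} \Bigl( y_i - \sum_{j} x_{ij}\,\beta_j \Bigr)^{2} \;+\; \lambda \sum_{j} \beta_j^{2}
```

Under a Gaussian likelihood and a zero-mean Gaussian prior on the coefficients, maximizing the posterior is equivalent to minimizing this objective, which is why the L2 penalty appears in the Bayesian formulation.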

Preferably, in step 4.3 the corrected value $D^*$ is calculated as follows:

wherein

D is the peak-valley difference of a given quarter of year n obtained from the primary prediction,

$D^*$ is the corrected peak-valley difference of that quarter of year n,

$d_{n-1}$ is the seasonal peak-valley difference on the fitted curve for the same quarter of year n-1,

$d_n$ is the seasonal peak-valley difference on the fitted curve for that quarter of year n.

Compared with the prior art, the invention has the following advantages. The load of demand-side users is becoming increasingly flexible, the implementation effect of demand-side response is difficult to evaluate accurately, and the reliability of medium-and-long-term demand response evaluation depends on accurate prediction of the load peak-valley difference. To address these problems, the invention provides a medium-and-long-term peak-valley difference prediction method based on random forest and secondary correction, in which the secondary correction improves the prediction precision.

Drawings

FIG. 1 is a schematic flow chart of a peak-to-valley difference prediction method based on random forest and secondary correction according to the present invention;

FIG. 2 is a schematic diagram of the binary feature combination optimization process of the present invention;

FIG. 3 is a flow chart of one-time medium-long term prediction based on random forests according to the present invention;

FIG. 4 is a flow chart of a quadratic correction based on Bayesian ridge regression according to the present invention;

FIG. 5 is a graph of the medium-and-long-term monthly peak-valley difference prediction results obtained with the proposed method in an embodiment of the present invention;

FIG. 6 is a diagram of the prediction results of the first-order medium-and long-term peak-valley difference and the prediction effect of the second-order corrected peak-valley difference according to the embodiment of the present invention;

FIG. 7 is a population correction fit curve in an embodiment of the present invention;

FIG. 8 is a graph of a fitted residential consumption level curve according to the present invention.

Detailed Description

The present application is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present application is not limited thereby.

In the invention, medium-and-long-term prediction refers to prediction with a horizon between one month and one year. By prediction time scale, prediction methods can be divided into ultra-short-term (0-6 h), short-term (6 h-1 d) and medium-and-long-term (1 month-1 year) methods.

As shown in FIG. 1, the invention provides a peak-to-valley difference medium-and-long term estimation method based on random forests and secondary correction, which comprises the following steps:

step 1, collecting historical load data of a plurality of residential users within a set number of years, calculating historical peak-valley difference, and collecting influence factor data influencing the load peak-valley difference as alternative characteristics;

In this embodiment, the monthly and seasonal peak-valley differences of three residential areas (high-, medium- and low-grade) in a city in Jiangsu Province, China, are taken as the research objects. Historical electricity load data for the six years from 2013 to 2018 are collected at a time resolution of 15 min, i.e. 96 data points per day. The data include the daily maximum load and the daily minimum load, from which the daily peak-valley difference is calculated; the monthly and seasonal peak-valley differences are then derived from it, and the daily average load is calculated from the 96 daily load samples. Influence factors such as daily maximum air temperature, daily minimum air temperature, daily average air temperature, air pressure, humidity, rainfall, wind speed and daily average load are taken as candidate features. Annual influence factors, namely population and GDP, are used as secondary correction factors.
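The aggregation from 15-minute samples to daily, monthly and seasonal peak-valley differences can be sketched as follows. This is a minimal illustration, assuming that the monthly and seasonal values are averages of the daily peak-valley differences (the text refers to a "historical average peak-valley difference" but gives no formula) and that seasons follow the meteorological convention with March to May as spring. The helper names `daily_peak_valley` and `aggregate` are hypothetical, and the load values are made up.

```python
from collections import defaultdict
from statistics import mean

# Assumed season mapping (meteorological convention); the source does not define it.
SEASON_OF_MONTH = {12: "winter", 1: "winter", 2: "winter",
                   3: "spring", 4: "spring", 5: "spring",
                   6: "summer", 7: "summer", 8: "summer",
                   9: "autumn", 10: "autumn", 11: "autumn"}

def daily_peak_valley(samples):
    """Peak-valley difference of one day from its 96 load samples (15-min resolution)."""
    return max(samples) - min(samples)

def aggregate(daily, key):
    """Average the daily peak-valley differences within each group.

    daily: dict mapping (year, month, day) -> daily peak-valley difference
    key:   function mapping (year, month, day) -> group label
    """
    groups = defaultdict(list)
    for date, value in daily.items():
        groups[key(date)].append(value)
    return {label: mean(values) for label, values in groups.items()}

# Toy data: two days in March and one in April 2018, 96 samples each.
loads = {
    (2018, 3, 1): [50.0] * 48 + [90.0] * 48,   # difference 40
    (2018, 3, 2): [60.0] * 48 + [80.0] * 48,   # difference 20
    (2018, 4, 1): [55.0] * 48 + [65.0] * 48,   # difference 10
}
daily = {date: daily_peak_valley(samples) for date, samples in loads.items()}
monthly = aggregate(daily, key=lambda d: (d[0], d[1]))
seasonal = aggregate(daily, key=lambda d: (d[0], SEASON_OF_MONTH[d[1]]))
```

Grouping through a `key` function keeps the monthly and seasonal aggregation in one code path; a different definition of the monthly value (for example the monthly maximum) would only change the reducer.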

The data of the first five years are used as training data, and the data of the last year are used as test data to verify the validity of the prediction. The prediction error evaluation indexes are the mean absolute percentage error (MAPE), the mean absolute error (MAE) and the root mean square error (RMSE).

Step 2, extracting the characteristics of the multi-source influence factors, and extracting an optimal characteristic combination by adopting binary characteristic engineering to serve as the input of a random forest model;

the step 2 specifically comprises the following steps:

step 2.1, calculating the correlation degree between the candidate features and the peak-valley difference in the step 1, screening n candidate features from high to low according to the correlation degree,

To improve prediction accuracy, multiple influence factors are considered and those significantly correlated with the residential-side load peak-valley difference are screened out. In a preferred but non-limiting embodiment, the Pearson correlation coefficient is used to quantify the correlation between each candidate feature and the peak-valley difference:

$$r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2}\,\sqrt{\sum_{i}(y_i - \bar{y})^2}}$$

wherein x is the peak-valley difference, y is any influence factor, and $\bar{x}$ and $\bar{y}$ are their sample means.

Table 1 lists the correlation between the peak-valley difference and each candidate feature; these features are the candidates for the binary feature combination.

table 1: degree of correlation between peak-to-valley difference and each influence factor
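The correlation screening of step 2.1 can be sketched as follows. This is a minimal illustration, assuming that features are ranked by the absolute value of the Pearson coefficient (the text only says "from high to low"); the function names and the numbers are made up and are not data from the embodiment.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def top_n_features(features, target, n):
    """Rank candidate features by |r| with the target and keep the n strongest."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    ranked = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)
    return ranked[:n], scores

# Illustrative values only.
peak_valley = [10.0, 12.0, 15.0, 19.0, 24.0]
candidates = {
    "daily_max_temp": [20.0, 22.0, 25.0, 29.0, 34.0],  # peak_valley + 10, so r = 1
    "humidity":       [70.0, 65.0, 72.0, 60.0, 68.0],
    "wind_speed":     [5.0, 4.0, 3.0, 2.0, 1.0],
}
selected, scores = top_n_features(candidates, peak_valley, n=2)
```

Ranking by absolute value keeps strongly negative correlations (here `wind_speed`) in the candidate set, since they are as informative for a regression model as positive ones.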

Because the influence factors act on the peak-valley difference jointly, the effect of a single factor cannot be analyzed in isolation. Therefore, in step 2.2, a binary feature combination method is used to select, from the candidate features screened in step 2.1, the optimal feature combination as the input of the random forest. Step 2.2 specifically comprises the following steps:

step 2.2.1, using binary coding to mark each candidate feature as used or discarded. Suppose there are n candidate features to be screened; the feature data set and its corresponding binary codes are expressed respectively as:

$$X = [x_1, x_2, x_3, \ldots, x_n]$$

$$W = [w_1, w_2, w_3, \ldots, w_n]$$

wherein n is the number of candidate features, and the binary code corresponding to the i-th feature $x_i$ is $w_i$, which has two states, 0 and 1: when $w_i = 0$, the feature is not used; when $w_i = 1$, the feature is used. The screened binary feature data set can thus be expressed as:

$$[w_1 x_1, w_2 x_2, w_3 x_3, \ldots, w_n x_n]$$

step 2.2.2, taking the screened binary feature data set as the input of the genetic algorithm, and searching for the optimal feature combination based on the genetic algorithm.

The optimization flow of the genetic algorithm is shown in FIG. 2. When the genetic algorithm searches over the screened binary feature data set, that data set is taken as its input. The genetic algorithm is initialized randomly, generating multiple individuals of the form $[w_1, w_2, w_3, \ldots, w_n]$, which constitute the initial population P(0). Let the binary string sequence l be the binary-coded form of $[w_1, w_2, w_3, \ldots, w_n]$; these binary strings are called chromosomes in the genetic algorithm, and each $[w_1, w_2, w_3, \ldots, w_n]$ represents an individual. Each encoded chromosome is decoded to obtain the weight parameters to be optimized that the individual carries, and these parameters are input into the fitness function, which is taken to be the loss function value after SVR training. The algorithm then checks whether the iteration limit has been reached. If not, two individuals are selected from the population according to their fitness values and copied, and the crossover probability determines whether the two selected individuals undergo the crossover operation. After crossover, mutation is performed: the set mutation probability determines whether the two current temporary individuals undergo the mutation operation. In general, in genetic algorithms, the mutation probability is calculated by the following formula:

$$p_m = 0.8 / d$$

wherein d is the number of binary bits in an individual's binary encoding.
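The search loop described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented procedure: the fitness is a toy Hamming-distance loss standing in for the SVR training loss, selection is a binary tournament, crossover is single-point with an assumed probability of 0.8, and the mutation probability $p_m = 0.8/d$ is applied per bit. All function and variable names are hypothetical.

```python
import random

def genetic_feature_search(n_features, loss, pop_size=20, generations=30,
                           p_cross=0.8, seed=0):
    """Search binary masks [w_1..w_n] that minimise `loss(mask)`.

    Returns the best mask found and the best-so-far loss after each generation.
    """
    rng = random.Random(seed)
    p_mut = 0.8 / n_features            # p_m = 0.8 / d, d = number of binary bits
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = min(population, key=loss)
    history = [loss(best)]

    def select():
        # Binary tournament: copy the individual with the lower loss.
        a, b = rng.sample(population, 2)
        return list(a if loss(a) <= loss(b) else b)

    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            c1, c2 = select(), select()
            if rng.random() < p_cross:              # single-point crossover
                cut = rng.randint(1, n_features - 1)
                c1, c2 = c1[:cut] + c2[cut:], c2[:cut] + c1[cut:]
            for child in (c1, c2):                  # bit-flip mutation
                for i in range(n_features):
                    if rng.random() < p_mut:
                        child[i] = 1 - child[i]
                children.append(child)
        population = children[:pop_size]
        generation_best = min(population, key=loss)
        if loss(generation_best) < loss(best):      # keep the best mask ever seen
            best = generation_best
        history.append(loss(best))
    return best, history

# Stand-in fitness: Hamming distance to a made-up "ideal" mask.
IDEAL = [1, 0, 1, 1, 0, 0, 1, 0]

def toy_loss(mask):
    return sum(w != t for w, t in zip(mask, IDEAL))

best_mask, history = genetic_feature_search(len(IDEAL), toy_loss)
```

In a real setting `toy_loss` would be replaced by a function that trains a regressor on the features with $w_i = 1$ and returns its validation loss; tracking the best mask separately from the population guarantees that the reported loss never gets worse between generations.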

The feature combination for peak-valley difference prediction is optimized with the binary feature combination method for each grade of residential area. Table 2 lists the top three feature combinations for the three residential-area data sets: case1, case2 and case3 denote the first-, second- and third-ranked combinations of influence factors for each grade of residential area. The top-ranked combination (case1) is used as the optimal feature combination input in the experiment. A check mark indicates that a feature is selected, and × indicates that it is dropped.

table 2: characteristic combination scheme of peak-valley difference and various influence factors

Step 3, training a random forest algorithm on the training data of the optimal feature combination selected in step 2 to obtain a load peak-valley difference prediction model for demand-side users, and outputting primary medium-and-long-term prediction results of the monthly peak-valley difference and the seasonal peak-valley difference, as shown in FIG. 3.

The step 3 specifically comprises the following steps:

step 3.1, constructing a random forest model, taking a random forest as a basis for one-time medium and long term measurement and calculation,

the steps of constructing the measuring model by using the random forest algorithm are as follows:

step 3.1.1, taking the data set with the optimal feature combination from the previous N years as the original sample, and sampling it with the bootstrap method to generate K data sets as the training sets of the decision trees, wherein N is a positive integer less than the set number of years in step 1; in this embodiment the data of the first five years are used as the original sample;

step 3.1.2, if there are M input variables in total, each node randomly selects m of them and determines the optimal split point from these m variables, wherein m is less than M;

step 3.1.3, each decision tree is grown to the maximum possible extent without pruning;

step 3.1.4, taking the average of the outputs of all decision trees as the predicted value.
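The construction in steps 3.1.1 to 3.1.4 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: split quality is measured by the sum of squared errors (the text does not name the criterion), and the feature rows, targets and parameter values are made up. In practice a library implementation would normally be used instead.

```python
import random
from statistics import mean

def sse(values):
    """Sum of squared deviations from the mean."""
    mu = mean(values)
    return sum((v - mu) ** 2 for v in values)

def build_tree(rows, targets, m, rng):
    """Grow an unpruned regression tree; each node tries m randomly chosen features."""
    if len(set(targets)) == 1:
        return targets[0]                               # pure node becomes a leaf
    best = None
    for j in rng.sample(range(len(rows[0])), m):        # random subset, m < M
        for threshold in sorted({r[j] for r in rows})[:-1]:
            left = [t for r, t in zip(rows, targets) if r[j] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[j] > threshold]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, threshold)
    if best is None:                                    # sampled features are constant here
        return mean(targets)
    _, j, threshold = best
    left = [(r, t) for r, t in zip(rows, targets) if r[j] <= threshold]
    right = [(r, t) for r, t in zip(rows, targets) if r[j] > threshold]
    return (j, threshold,
            build_tree([r for r, _ in left], [t for _, t in left], m, rng),
            build_tree([r for r, _ in right], [t for _, t in right], m, rng))

def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are numbers."""
    while isinstance(node, tuple):
        j, threshold, left, right = node
        node = left if row[j] <= threshold else right
    return node

def fit_forest(rows, targets, n_trees=25, m=2, seed=0):
    """Train K = n_trees trees, each on a bootstrap sample of the original data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        forest.append(build_tree([rows[i] for i in idx],
                                 [targets[i] for i in idx], m, rng))
    return forest

def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return mean([predict_tree(tree, row) for tree in forest])

# Made-up samples: [daily max temperature, natural growth rate, humidity] -> peak-valley difference
rows = [[30.0, 0.02, 60.0], [32.0, 0.03, 55.0], [25.0, 0.01, 70.0],
        [35.0, 0.04, 50.0], [28.0, 0.02, 65.0], [33.0, 0.03, 58.0]]
targets = [40.0, 45.0, 30.0, 52.0, 36.0, 47.0]
forest = fit_forest(rows, targets)
prediction = predict_forest(forest, [31.0, 0.025, 57.0])
```

Because every leaf is an average of training targets and the forest averages its trees, the prediction always lies within the range of the training targets, which is one reason a separate correction stage is useful when later years drift outside the historical range.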

Step 3.2, aiming at the historical load data of the user at the demand side, calculating the load natural growth rate monthly and quarterly according to the time scale measured and calculated by the peak-valley difference based on a trend extrapolation method; the load natural growth rate is an influence factor representing the load change characteristic and is used as one of the inputs of the load peak-valley difference measurement model of the user at the demand side.
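One possible reading of the trend-extrapolation step is sketched below. It assumes, as an illustration only, that a least-squares straight line is fitted to the values of the same month (or season) across consecutive years, extrapolated one step ahead, and that the natural growth rate is the relative change from the last observed value; the source does not specify the trend model. Function names and data are hypothetical.

```python
def linear_trend(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    numerator = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    denominator = sum((t - t_mean) ** 2 for t in range(n))
    slope = numerator / denominator
    return slope, v_mean - slope * t_mean

def natural_growth_rate(values):
    """Extrapolate one step ahead and express the change relative to the last value."""
    slope, intercept = linear_trend(values)
    forecast = slope * len(values) + intercept
    return (forecast - values[-1]) / values[-1]

# Made-up average loads for the same month in five consecutive years.
january_load = [100.0, 110.0, 120.0, 130.0, 140.0]
rate = natural_growth_rate(january_load)   # extrapolated value 150, so rate = 10 / 140
```

The resulting rate would be appended to the feature vector of the corresponding month or season before training the random forest.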

Step 3.3, forming data-driven training samples from the historical average peak-valley differences collected month by month and season by season, the natural load growth rate and the screened peak-valley difference influence factors; constructing and training a peak-valley difference prediction model based on random forest; the trained model outputs the medium-and-long-term prediction result of the demand-side user load peak-valley difference. Finally, the prediction results of the monthly peak-valley difference and the seasonal peak-valley difference are output.

Tables 3 and 4 show the prediction results of the random-forest-based method. To verify that the proposed method improves prediction accuracy, prediction methods based on the support vector machine (SVM), the multilayer perceptron (MLP) and Gaussian process regression (GPR) are selected as benchmarks. As Tables 3 and 4 show, the random forest gives the best results on all error indexes for both the monthly and the seasonal peak-valley difference.

Table 3: Monthly peak-valley difference prediction results based on different models

Table 4: Seasonal peak-valley difference prediction results based on different models

Because the seasonal peak-valley difference spans a long period, the influence of year-to-year differences in the influence factors is taken into account: a Bayesian ridge regression model is built to fit the relationship between the correction factors and the load peak-valley difference, and the primary medium-and-long-term prediction of the seasonal peak-valley difference is corrected accordingly. The secondary correction process is shown in FIG. 4 and comprises the following steps:

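step 4.1, constructing a Bayesian ridge regression model based on the historical demand-side user load peak-valley difference and the screened correction factors, wherein the specific calculation process in the Bayesian ridge regression model is as follows:

wherein

p(w | a, b) is the probability distribution of the parameter w conditioned on a and b;

the loss function is that of ridge regression, which solves for the family of parameters {β_j} that minimizes the fitting error between the i-th output y_i and the inputs x_ij;

the penalty term is L2 regularization.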

Step 4.2, based on the Bayesian ridge regression model, respectively establishing fitting relations between the seasonal peak-valley difference and the population correction factor and between the seasonal peak-valley difference and the resident consumption level correction factor, and obtaining a population correction fitting curve and a resident consumption level fitting curve, as shown in FIGS. 7 and 8.

Step 4.3, sequentially calculating correction coefficients based on the two fitting curves, and correcting the peak-valley difference of the load of the demand side user in the measurement result of the peak-valley difference of the load of the demand side user obtained in the step 3 to obtain a correction result of the peak-valley difference of the load of the demand side user;

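The corrected value $D^*$ is calculated as follows:

wherein

D is the peak-valley difference of a given quarter of year n obtained from the primary prediction,

$D^*$ is the corrected peak-valley difference of that quarter of year n,

$d_{n-1}$ is the seasonal peak-valley difference on the fitted curve for the same quarter of year n-1,

$d_n$ is the seasonal peak-valley difference on the fitted curve for that quarter of year n.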

Taking the population correction of the spring 2018 peak-valley difference as an example: given the population data for 2017 and 2018, the corresponding peak-valley difference values $d_{2017}$ and $d_{2018}$ are read from the fitted curve, and the correction coefficient is calculated and applied to the spring 2018 peak-valley difference:

wherein D is the peak-valley difference for spring 2018 obtained from the primary prediction,

$D^*$ is the corrected peak-valley difference for spring 2018,

$d_{2017}$ is the fitted-curve peak-valley difference value for spring 2017,

$d_{2018}$ is the fitted-curve peak-valley difference value for spring 2018.
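The worked example can be sketched numerically as follows. Two points are assumptions rather than statements of the source: the correction is taken to scale the primary prediction by the ratio of fitted values, $D^* = D \cdot d_{2018} / d_{2017}$ (the original formula is not reproduced in this text), and the fitted curve comes from a one-variable ridge fit with a fixed penalty `lam`, whereas a full Bayesian ridge regression (for example scikit-learn's `BayesianRidge`) would estimate the regularization from the data. All numbers are made up.

```python
def ridge_fit_1d(x, y, lam=1.0):
    """Closed-form ridge fit y ~ a*x + b on centred data (intercept not penalised)."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    a = sxy / (sxx + lam)              # the L2 penalty shrinks the slope towards zero
    return a, y_mean - a * x_mean

def corrected_peak_valley(D, d_prev, d_curr):
    """Assumed correction: scale D by the ratio of fitted values d_n / d_(n-1)."""
    return D * d_curr / d_prev

# Made-up data: population (10^4 persons) and spring peak-valley difference, 2013-2017.
population = [800.0, 810.0, 820.0, 830.0, 840.0]
spring_pv = [100.0, 104.0, 108.0, 112.0, 116.0]
a, b = ridge_fit_1d(population, spring_pv, lam=1.0)

d_2017 = a * 840.0 + b                 # fitted value at the 2017 population
d_2018 = a * 850.0 + b                 # fitted value at an assumed 2018 population
D_star = corrected_peak_valley(D=118.0, d_prev=d_2017, d_curr=d_2018)
```

The same two functions would be applied a second time with the resident consumption level as the explanatory variable, taking the population-corrected value as the new `D`, to follow the sequential correction described in step 4.3.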

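The predicted and true load values at the moments to be predicted are compared, and the error indexes MAPE, MAE and RMSE are calculated according to the following formulas:

$$\mathrm{MAPE} = \frac{1}{n_{test}} \sum_{e=1}^{n_{test}} \left| \frac{l_e - \hat{l}_e}{l_e} \right| \times 100\%$$

$$\mathrm{MAE} = \frac{1}{n_{test}} \sum_{e=1}^{n_{test}} \left| l_e - \hat{l}_e \right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n_{test}} \sum_{e=1}^{n_{test}} \left( l_e - \hat{l}_e \right)^2}$$

wherein

$l_e$ is the true value of the load at a given moment,

$\hat{l}_e$ is the predicted value of the load at that moment,

$n_{test}$ is the number of test samples.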
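The three error indexes can be computed with a few lines of Python, using the standard definitions of MAPE, MAE and RMSE. The function names and the sample values below are illustrative and are not results from the embodiment.

```python
from math import sqrt

def mape(true, pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(t - p) / abs(t) for t, p in zip(true, pred)) / len(true)

def mae(true, pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(true, pred)) / len(true)

def rmse(true, pred):
    """Root mean square error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(true, pred)) / len(true))

# Illustrative values only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
errors = {"MAPE": mape(actual, predicted),
          "MAE": mae(actual, predicted),
          "RMSE": rmse(actual, predicted)}
```

MAPE is undefined when a true value is zero, which is not a concern for peak-valley differences of aggregated residential load but should be checked for other series.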

In addition, Table 5 shows the seasonal peak-valley difference prediction results after secondary correction with the proposed method based on random forest and secondary correction.

The results show that the primary predicted value deviates from the true value to some extent, while the prediction after secondary correction improves markedly on the first stage. Comparing the evaluation indexes, the RMSE, MAPE and MAE of the prediction after secondary correction are all lower than those of the primary prediction, so the prediction precision is effectively improved. This shows that the two-stage peak-valley difference prediction model makes the model more sensitive to differences in the influence factors between years, which further improves the accuracy of the final prediction result.

FIG. 5 shows the primary medium-and-long-term monthly peak-valley difference prediction results, and FIG. 6 shows the primary medium-and-long-term seasonal peak-valley difference prediction results together with the secondary-corrected seasonal predictions. The two-stage peak-valley difference prediction model considers not only the month-by-month and season-by-season influence factors of the peak-valley difference, but also the differences in influence factors between years. Predicting the peak-valley difference in two stages, primary medium-and-long-term prediction followed by secondary correction, improves the prediction precision. The example analysis supports the superiority of the proposed method.

Table 5: Comparison of the primary and secondary-corrected peak-valley difference predictions

In conclusion, the prediction method can be used for predicting the load peak-valley difference on the demand side, and plays an important guiding role in power system scheduling, energy management and demand response implementation. Compared with other prediction methods, the method provided by the invention utilizes the annual influence factor difference to carry out secondary correction, so that the prediction precision is obviously improved, the load peak-valley difference of the resident user can be more accurately predicted, and the method has important significance for promoting the response development of the demand side and relieving the power supply and demand contradiction.

The present applicant has described and illustrated embodiments of the present invention in detail with reference to the accompanying drawings, but it should be understood by those skilled in the art that the above embodiments are merely preferred embodiments of the present invention, and the detailed description is only for the purpose of helping the reader to better understand the spirit of the present invention, and not for limiting the scope of the present invention, and on the contrary, any improvement or modification made based on the spirit of the present invention should fall within the scope of the present invention.

Claims

1. The peak-valley difference medium-long term measurement and calculation method based on random forest and secondary correction is characterized by comprising the following steps of:

2. The method for calculating the peak-to-valley difference based on random forest and quadratic correction as claimed in claim 1,

the historical load data of the electricity consumption in the step 1 comprise daily maximum load data and daily minimum load data; the historical peak-valley difference comprises a daily peak-valley difference, a monthly peak-valley difference and a seasonal peak-valley difference; the influencing factor characteristics include: daily maximum air temperature, daily minimum air temperature, daily average air temperature, air pressure, humidity, rainfall, wind speed, daily average load.

3. The method for calculating the peak-to-valley difference based on random forest and quadratic correction as claimed in claim 1,

the step 2 specifically comprises:

4. The method for calculating the peak-to-valley difference based on random forest and quadratic correction as claimed in claim 2,

the step 2.2 specifically comprises:

Step 2.2.2, binary characteristic data set screened out

5. The method for calculating the peak-to-valley difference based on random forest and quadratic correction as claimed in claim 4,

the binary feature data set

Can be expressed as:

wherein n is the number of the alternative features,

6. The method for calculating the peak-to-valley difference based on the random forest and the secondary correction as claimed in claim 1, wherein:

the step 3 specifically includes:

step 3.3, forming data-driven training samples from the historical average peak-valley differences collected month by month and season by season, the natural load growth rate and the screened peak-valley difference influence factors; constructing and training a peak-valley difference prediction model based on random forest; the trained model outputs the medium-and-long-term prediction result of the demand-side user load peak-valley difference.

7. The method for calculating the peak-to-valley difference based on random forest and quadratic correction as claimed in claim 4,

step 3.1 specifically comprises:

step 3.1.4, taking the average of the outputs of all decision trees as the predicted value.

8. The method for calculating the peak-to-valley difference based on random forest and quadratic correction as claimed in claim 1,

the step 4 specifically comprises the following steps:

step 4.2, based on the Bayesian ridge regression model, respectively establishing fitting relations between the seasonal peak-valley difference and the population correction factor and between the seasonal peak-valley difference and the resident consumption level correction factor, and obtaining a population correction fitting curve and a resident consumption level fitting curve;

9. The method for mid-to-long term measurement of peak-to-valley difference based on random forest and quadratic correction as claimed in claim 8,

the specific calculation process in the Bayesian ridge regression model is as follows:

wherein

the penalty term is L2 regularization.

10. The method for mid-to-long term measurement of peak-to-valley difference based on random forest and quadratic correction as claimed in claim 9,

in step 4.3, the corrected value $D^*$ is calculated as follows:

wherein

D is the peak-valley difference of a given quarter of year n obtained from the primary prediction,