CN110969285B - Prediction model training method, prediction device, prediction equipment and medium - Google Patents


Info

Publication number
CN110969285B
Authority
CN
China
Prior art keywords
power load
prediction
influence factor
training
prediction model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911038724.4A
Other languages
Chinese (zh)
Other versions
CN110969285A (en)
Inventor
郝吉芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BOE Technology Group Co Ltd
Original Assignee
BOE Technology Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BOE Technology Group Co Ltd filed Critical BOE Technology Group Co Ltd
Priority to CN201911038724.4A priority Critical patent/CN110969285B/en
Publication of CN110969285A publication Critical patent/CN110969285A/en
Application granted granted Critical
Publication of CN110969285B publication Critical patent/CN110969285B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/067 Enterprise or organisation modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Electricity, gas or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Abstract

The application discloses a prediction model training method, a prediction method, an apparatus, a device, and a medium. The training method comprises the following steps: acquiring a sequence value of a historical power load and a plurality of influence factors corresponding to the sequence value; extracting power load features and influence factor features from the sequence value and the influence factors; and training a prediction model based on a Catboost algorithm by using the power load features and the influence factor features, wherein the prediction model is used for predicting the power load of a prediction target in the next time period. According to the embodiments of the application, power load features and influence factor features are extracted from the acquired historical sequence value and the corresponding influence factors, and the prediction model is trained on the extracted features. Feature processing during training is therefore simple, the trained prediction model predicts the short-term power load of enterprise users accurately, and a reliable basis is provided for power spot market transactions.

Description

Prediction model training method, prediction device, prediction equipment and medium
Technical Field
The present application relates generally to the field of computer technology, and more particularly, to a predictive model training method, a predictive method, an apparatus, a device, and a medium.
Background
In industrial enterprises, the electricity needed for production is usually purchased from an electricity retailer on the basis of medium- and long-term load forecasts, and the retailer in turn declares its future medium- and long-term consumption to the generation side in advance. In practice, the short-term power demand of an industrial user does not necessarily match this plan. The retailer then has to sell the surplus electricity, or buy the shortfall, on the spot market to balance short-term supply and demand. In the electric power spot market, the number of traded products grows, trading becomes more frequent, and prices fluctuate more often; under the new rules of 'real-time market deviation settlement' and 'deviation price difference profit transfer settlement', short-term and even ultra-short-term load forecasting capability needs to be improved so that the deviation allowed by the market can be tracked accurately and trading profit maximized.
At present, many algorithms are used to predict the medium-short-term or ultra-short-term power load of industrial users, including traditional methods such as trend extrapolation and regression analysis, and artificial intelligence methods such as random forests, recurrent neural networks (RNN), and long short-term memory networks (LSTM).
When these algorithms are used to predict the short-term power load of an industrial enterprise, the feature processing is complex and the prediction deviates considerably from the actual power load, so the result cannot serve as a reference for power spot market transactions.
Disclosure of Invention
In view of the above-mentioned drawbacks and deficiencies of the prior art, it is desirable to provide a power load prediction model training method, a prediction method, an apparatus, a device and a medium for improving the accuracy of enterprise power load prediction.
In a first aspect, an embodiment of the present application provides a predictive model training method, where the method includes:
acquiring a sequence value of a historical power load and a plurality of influence factors corresponding to the sequence value;
extracting power load characteristics and influence factor characteristics from the sequence value and the influence factor;
and training a prediction model based on a Catboost algorithm by using the power load characteristic and the influence factor characteristic, wherein the prediction model is used for predicting the power load of a prediction target in the next time period.
In a second aspect, an embodiment of the present application provides a prediction method, where the method includes:
acquiring a sequence value of a historical power load of a prediction target and an influence factor corresponding to the sequence value;
inputting the sequence value and the influence factor into the prediction model trained according to the first aspect, and obtaining the power load of the prediction target in the next time period.
In a third aspect, an embodiment of the present application provides a prediction model training apparatus, including:
the acquisition module is used for acquiring a sequence value of historical power loads and a plurality of influence factors corresponding to the sequence value;
the extraction module is used for extracting power load characteristics and influence factor characteristics from the sequence value and the influence factor;
and the training module is used for training a prediction model based on a Catboost algorithm by utilizing the power load characteristic and the influence factor characteristic, and the prediction model is used for predicting the power load of a prediction target in the next time period.
In a fourth aspect, an embodiment of the present application provides a prediction apparatus, including:
the acquisition module is used for acquiring a sequence value of the historical power load of the prediction target and an influence factor corresponding to the sequence value;
and the predicting module is used for inputting the sequence value and the influence factor into the predicting model trained according to the first aspect to obtain the power load of the predicted target in the next time period.
In a fifth aspect, an embodiment of the present application provides a terminal device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor executes the computer program to implement the method according to the first aspect or the second aspect.
In a sixth aspect, embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, the computer program being configured to implement the method according to the first aspect or the second aspect.
According to the prediction model training method, prediction method, apparatus, device, and medium provided by the embodiments of the present application, the sequence value of the short-term historical power load of an enterprise user and the corresponding influence factors are acquired, power load features and influence factor features are extracted from them, and a prediction model is trained on these features with the Catboost algorithm. The feature processing is simple, the prediction model can accurately predict the short-term power load of the enterprise user, the prediction accuracy reaches a higher level, and a reliable basis is provided for power spot market transactions.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is a schematic flow chart illustrating a predictive model training method according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart illustrating a predictive model training method according to yet another embodiment of the present application;
FIG. 3 is a schematic flow chart diagram illustrating a predictive model training method according to yet another embodiment of the present application;
FIG. 4 is a schematic flow chart illustrating a power load prediction method according to an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of a prediction model training apparatus according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a power load prediction apparatus according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a computer system of a terminal device according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant disclosure and are not limiting of the disclosure. It should be further noted that, for the convenience of description, only the portions relevant to the disclosure are shown in the drawings.
It should be noted that, in the present application, the embodiments and the technical features in the embodiments may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
The power load prediction method provided by the embodiments of the present application is intended for enterprise users in industrial production. To predict an enterprise user's future short-term power load accurately, historical power load data of the enterprise user and historical data of other influence factors that affect electricity consumption are collected. The collected historical data are preprocessed, and features are extracted to generate a training set and a test set. Finally, a machine learning model for predicting the enterprise user's future short-term power load is trained with the training set and test set, so that the trained model can accurately predict the future short-term power load of the industrial enterprise, for example the electricity consumption of the next hour, day, or week.
It can be understood that, in the embodiments of the present application, the collected historical data are used to train the prediction model, and the training can be implemented with the Catboost algorithm, that is, with a machine learning framework based on gradient boosting decision trees. Specifically, the best parameters of the prediction model can be determined from the training set and test set by a hyper-parameter optimization algorithm, yielding the best prediction model. To this end, the value range of each model parameter can be set in advance at the start of training, and the optimal model parameters are then determined within these preset ranges through repeated iterative training on the training set.
For convenience of understanding and explanation, the method, device, apparatus and medium for training the prediction model of the power load according to the embodiment of the present application are described in detail below with reference to fig. 1 to 7.
Fig. 1 is a schematic flow chart of a power load prediction model training method provided in an embodiment of the present application, where the method includes:
S110, acquiring a sequence value of the historical power load and an influence factor corresponding to the sequence value.
Specifically, the prediction model training method provided by the embodiment of the application is used for predicting the future short-term power load of the enterprise user. The sequence value of the historical power load of the enterprise user and the influence factor related to the sequence value can be obtained firstly. For example, if a certain enterprise needs to be trained to obtain a prediction model for predicting the power load of one day, several hours or several weeks in the future, the sequence value of the power consumption of each hour in the past day or week of the enterprise user and the influence factors such as the weather factor and/or the time factor corresponding to the sequence value of the hour may be obtained, and the influence factors may include data information and/or text information. Meteorological factors may include information such as temperature, humidity, and weather. The time factor may include information such as the corresponding time of day, date, week, and/or whether it is a holiday.
S120, preprocessing the sequence value and the influence factor corresponding to the sequence value.
Specifically, after the sequence value of the original power load and the related impact factor are obtained, the sequence value or the impact factor needs to be preprocessed.
As shown in fig. 2, the method may specifically include the following steps:
S121, deleting abnormal values in the sequence values of the historical power loads.
S122, converting the text information in the influence factors into numerical information.
S123, normalizing the numerical information in the influence factors.
Specifically, for the sequence values of the acquired historical power loads, due to the large number, there may be abnormal values, such as a surge or a dip caused by an impact of the production process of the enterprise user on the power loads. Therefore, the deviation of each power load value in the sequence of values from the average value can be calculated based on the average value of the power loads for the period of time, and if the deviation is too large, such as greater than 2 times the standard deviation, the power load value is an abnormal value, and the deletion process is performed.
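The outlier rule described above can be sketched as follows. This is a minimal illustration that assumes the population standard deviation and the 2-standard-deviation threshold mentioned in the text; the function name `drop_outliers`, the parameter `k`, and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def drop_outliers(loads, k=2.0):
    """Keep load values whose deviation from the mean is at most k standard deviations."""
    mu = mean(loads)
    sigma = pstdev(loads)  # assumption: population standard deviation of the period
    if sigma == 0:
        return list(loads)  # all values identical, nothing to remove
    return [v for v in loads if abs(v - mu) <= k * sigma]


# Hypothetical hourly loads with one surge caused by a production event.
hourly_loads = [100, 101, 102, 99, 100, 101, 98, 100, 102, 300]
cleaned = drop_outliers(hourly_loads)
```

A single large surge also inflates the mean and the standard deviation, so in practice the threshold `k` may need tuning for the data at hand.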
The text information in the acquired influence factors can be converted into numerical values. For example, the weather conditions in the meteorological factors, such as sunny, cloudy, and rainy, may be marked as 1, 0.5, and 0, respectively. For the time factor, a holiday may be marked as 1 and a non-holiday as 0, and the days of the week from Monday through Sunday may be marked as 1, 2, 3, 4, 5, 6, and 7 in order.
It can be understood that the above numeralization of the text information can be flexibly set according to the situation, and the present application does not limit the present application in detail.
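One possible encoding along these lines is sketched below. The mapping values follow the example in the text (sunny 1, cloudy 0.5, rainy 0; holiday 1 or 0; Monday through Sunday 1 to 7); the names `WEATHER_CODES` and `encode_factors` and the record layout are assumptions made for illustration.

```python
from datetime import datetime

# Mapping taken from the example in the text; other mappings are equally valid.
WEATHER_CODES = {"sunny": 1.0, "cloudy": 0.5, "rainy": 0.0}


def encode_factors(timestamp, weather, is_holiday):
    """Convert text-valued influence factors of one sample into numbers."""
    moment = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return {
        "weather": WEATHER_CODES[weather],
        "holiday": 1 if is_holiday else 0,
        "weekday": moment.isoweekday(),  # Monday = 1 ... Sunday = 7
    }


encoded = encode_factors("2019-09-23 17:00:00", "cloudy", False)
```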
For the numerical data in the meteorological factors, such as temperature, humidity, or wind speed, the values span a wide range. They can therefore be normalized to real numbers between 0 and 1 to eliminate the influence of differing units and scales. For example, a value can be normalized by the following formula:
$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x$ and $x^{*}$ are the values before and after normalization, respectively, and $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the feature in the training samples, respectively.
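A minimal sketch of this normalization is given below, assuming the usual min-max form in which the minimum and maximum are taken from the training samples only. The helper names and the sample temperatures are hypothetical.

```python
def fit_min_max(train_values):
    """Return the (minimum, maximum) of a feature over the training samples."""
    return min(train_values), max(train_values)


def min_max_scale(value, lo, hi):
    """Map a value into [0, 1] using the training minimum and maximum."""
    if hi == lo:
        return 0.0  # constant feature: no spread to normalize against
    return (value - lo) / (hi - lo)


train_temperatures = [28, 29, 30]
lo, hi = fit_min_max(train_temperatures)
scaled = [min_max_scale(t, lo, hi) for t in train_temperatures]
```

Values seen later that fall outside the training range map outside [0, 1]; whether to clip them is a design choice the text does not address.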
It can be understood that, in the embodiment of the present application, the execution sequence of the three steps of the preprocessing is not limited, and the preprocessing may be executed synchronously, or sequentially, or one or two of the preprocessing may be selectively executed, or all the preprocessing is not executed, that is, after S110, S130 may be directly executed, which is determined specifically according to the actual situation.
It can be understood that the Catboost algorithm adopted in the embodiment of the present application preprocesses the collected sequence values and the influence factors, so that the feature processing and the feature numeralization become simple, the dependence on the feature engineering is weakened, and the prediction accuracy reaches a higher level.
S130, extracting the power load characteristic and the plurality of influence factor characteristics from the sequence value and the influence factor.
Specifically, after obtaining the sequence value and the influence factor, or further performing preprocessing on the sequence value and the influence factor corresponding to the sequence value, an original data set serving as a sample set can be obtained. The original data set comprises a plurality of pieces of sample data, and each piece of sample data comprises a power load value and a plurality of influence factors corresponding to the power load value, such as the power load value at a certain moment, and a meteorological numerical value and a time numerical value corresponding to the power load value.
After the original data set is obtained, feature extraction may be performed on the sample data in it. That is, the power load features are extracted from the sequence value, and the influence factor features corresponding to the power load features are extracted from the data values of the plurality of influence factors, thereby obtaining an initial feature set.
For example, for the power load data, if the prediction model to be trained is used to predict the power load over the next several hours, the extracted power load features may include the power load one, two, and three hours before the current moment, the power load at the same time on the previous day, the maximum, minimum, and average power load of the current day, and the maximum, minimum, and average power load of the previous day. For the meteorological factors, the maximum, minimum, and average temperature and humidity of the corresponding day can be extracted, and weather conditions such as sunny, cloudy, or rainy can also be included in the feature set. For the time factor, features such as the corresponding date, whether the date is a holiday, and the day of the week can be extracted.
If the prediction model to be trained is used to predict the power load over the next several weeks, the maximum, minimum, and average power load of each day in the previous several days or weeks may be extracted. For the meteorological factors, the maximum, minimum, and average temperature and humidity of the corresponding day can be extracted, and weather conditions such as sunny, cloudy, or rainy days can also be included in the feature set. For the time factor, features such as the corresponding date, whether the date is a holiday, and the day of the week can be extracted.
Optionally, if the distribution of the power load sequence values is periodic, features within the period can be extracted directly according to that periodicity. For example, whether the data are periodic can be determined by visualizing the sequence values, such as plotting one or several weeks of data as a line graph or scatter graph and then judging from the graph whether a period exists. If the period is one week, the power load at the same time on each of the previous 1, 2, ..., 6 days, or the power load in the previous one, two, or three hours, can be extracted as power load features according to the periodicity.
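The hourly power load features described above can be sketched as follows. This is an illustration only: it assumes an hourly series that starts at midnight, covers a subset of the listed features (the loads one, two, and three hours earlier, the load at the same hour of the previous day, and the previous day's maximum, minimum, and mean), and uses hypothetical names throughout.

```python
from statistics import mean

HOURS_PER_DAY = 24


def hourly_load_features(loads, t):
    """Build lag and previous-day features for the hour at index t."""
    if t < HOURS_PER_DAY:
        raise ValueError("need at least one full previous day of history")
    day_start = (t // HOURS_PER_DAY) * HOURS_PER_DAY
    prev_day = loads[day_start - HOURS_PER_DAY:day_start]
    return {
        "lag_1h": loads[t - 1],
        "lag_2h": loads[t - 2],
        "lag_3h": loads[t - 3],
        "same_hour_prev_day": loads[t - HOURS_PER_DAY],
        "prev_day_max": max(prev_day),
        "prev_day_min": min(prev_day),
        "prev_day_mean": mean(prev_day),
    }


# Hypothetical two-day series in which each load equals its hour index.
loads = [float(h) for h in range(48)]
features = hourly_load_features(loads, 30)
```

For a weekly period, the same pattern extends to offsets of 24, 48, and further multiples of 24 hours.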
It can be understood that the initial feature set formed by the extracted features also includes a plurality of sample data, each sample data includes a power load feature and an influence factor feature, and the influence factor feature is a category feature.
For example, as shown in Table 1, the extracted feature set includes three pieces of sample data, each of which includes a power load feature and a plurality of influence factor features:
Table 1
No.  Time                 Power load  Weather        Temperature  Humidity  Day of week
1    2019-09-23 16:00:00  100         0.1 (sunny)    30           59        1
2    2019-09-23 17:00:00  101         0.05 (cloudy)  29           59        1
3    2019-09-23 18:00:00  102         0.01 (rain)    28           58        1
S140, training a prediction model based on a Catboost algorithm by using the power load characteristics and the influence factor characteristics, wherein the prediction model is used for predicting the power load of the next time period of the prediction target.
Specifically, after a sample set is obtained, the sample set can be divided into a training set and a test set according to a preset proportion, and then a prediction model can be obtained by training through a Catboost algorithm by using the obtained training set and test set, namely, model parameters of the prediction model are determined.
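The split and training step can be sketched as below. The 80/20 ratio, the parameter values, and the function names are assumptions for illustration, not values taken from the application. `CatBoostRegressor`, `fit`, and `eval_set` come from the `catboost` Python package, which is imported inside the training function so that the split helper can be used without it.

```python
def split_samples(features, targets, train_ratio=0.8):
    """Split samples into (train, test) pairs by a preset proportion."""
    cut = int(len(features) * train_ratio)
    train = (features[:cut], targets[:cut])
    test = (features[cut:], targets[cut:])
    return train, test


def train_catboost(train, test):
    """Fit a gradient-boosted regressor; parameter values are illustrative."""
    # Imported here so that split_samples does not require the package.
    from catboost import CatBoostRegressor

    model = CatBoostRegressor(
        iterations=200,
        learning_rate=0.1,
        depth=6,
        loss_function="RMSE",
        verbose=False,
    )
    model.fit(train[0], train[1], eval_set=(test[0], test[1]))
    return model


# Hypothetical samples: [weather, temperature, humidity] -> power load.
X = [[0.1, 30 - i, 59] for i in range(10)]
y = [100.0 + i for i in range(10)]
train, test = split_samples(X, y)
```

Calling `train_catboost(train, test)` then returns a fitted model whose `predict` method yields the load for new feature rows.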
As shown in fig. 3, the specific training process is as follows:
S141, based on preset combination rules, combining the impact factor characteristics corresponding to the power load characteristics to generate a plurality of combination characteristics corresponding to each combination rule, where one or more of the combination characteristics correspond to one power load characteristic.
S142, training the power load characteristics and the plurality of combined characteristics by using a hyper-parameter optimization algorithm to obtain the prediction model.
S143, analyzing the combination characteristics corresponding to the prediction model.
Specifically, in the training process, so that all the acquired features can be used for model training and the noise caused by low-frequency values of the categorical features is reduced, the pieces of sample data in the feature set can first be randomly ordered. For example, randomly permuting the feature set in Table 1 gives:
Table 2
No.  Time                 Power load  Weather        Temperature  Humidity  Day of week
3    2019-09-23 18:00:00  102         0.01 (rain)    28           58        1
1    2019-09-23 16:00:00  100         0.1 (sunny)    30           59        1
2    2019-09-23 17:00:00  101         0.05 (cloudy)  29           59        1
Further, in order to improve the fitting ability of the sample set so that the features therein represent the nonlinear relationship to the greatest extent, for the class features of the impact factors, two or more impact factor features corresponding to the power load features may be combined according to a preset combination rule, and one or more combination features corresponding to the power load features under each combination rule are generated.
The combination rule represents a combination rule among the class features, i.e., a combination object of the influence factor features, and a combination manner. That is, the combination rule indicates which two or more features between the category features are combined and the manner of combination between the combination objects.
Optionally, in the feature combination, a combination object may be determined first based on each combination rule, and the combination object may include any two or more influence factor features, such as a weather feature and a temperature feature, or a weather feature and a humidity feature, or a weather feature, a temperature feature, and a humidity feature, in the influence factor features shown in table 2. And then, according to the determined combination object, calculating the product of numerical values of any two or more influence factor characteristics, and generating a plurality of combination characteristics corresponding to the power load characteristics under each combination rule. Such as calculating the product of the weather and the temperature value, or the product of the weather and the humidity value, or the product of the weather, the temperature and the humidity value corresponding to each power load characteristic.
For example, based on the feature sets shown in table 1, the influence factor features of each power load are combined according to different combination rules to obtain a plurality of combination features, for example, weather and temperature, and weather and humidity are combined by multiplying the values corresponding to the combination objects, and the obtained plurality of combination features are as follows.
Table 3
No.  Time                 Power load  Weather        Temperature  Humidity  Day of week  Weather-temperature  Weather-humidity  Temperature-humidity
1    2019-09-23 16:00:00  100         0.1 (sunny)    30           59        1            0.1*30 = 3           0.1*59 = 5.9      30*59 = 1770
2    2019-09-23 17:00:00  101         0.05 (cloudy)  29           59        1            0.05*29 = 1.45       0.05*59 = 2.95    29*59 = 1711
3    2019-09-23 18:00:00  102         0.01 (rain)    28           58        1            0.01*28 = 0.28       0.01*58 = 0.58    28*58 = 1624
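The multiplicative combination described above can be sketched as follows, using the influence factor values of sample 1. The function name `pairwise_products` and the dictionary layout are hypothetical; only pairwise products are shown, although the text also allows combinations of three or more features.

```python
from itertools import combinations
from math import prod


def pairwise_products(sample, factor_names):
    """Multiply every pair of influence factor values to form combination features."""
    return {
        "-".join(pair): prod(sample[name] for name in pair)
        for pair in combinations(factor_names, 2)
    }


sample_1 = {"weather": 0.1, "temperature": 30, "humidity": 59}
combined = pairwise_products(sample_1, ["weather", "temperature", "humidity"])
```

Passing `3` instead of `2` to `combinations` would produce the three-way weather-temperature-humidity product in the same way.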
It can be understood that the more features are combined, the better the prediction effect, so all of the features are combined to obtain a plurality of feature combinations. Each combination rule then corresponds to one group of combination features, and each group includes the combination features corresponding to each power load. The combination rules shown in Table 3 are weather-temperature, weather-humidity, and temperature-humidity, each combined by multiplication, which yields three combination features for the power load numbered 1, three for the power load numbered 2, and three for the power load numbered 3. All of the power loads and combination features described above constitute the input feature set for training, which includes the training set and the test set.
Further, after feature combination yields the combination features corresponding to each combination rule, that is, a plurality of input feature sets, automatic training may be performed with a hyper-parameter optimization algorithm (e.g., BayesSearchCV from scikit-optimize): the input feature set corresponding to each combination rule is trained iteratively multiple times to obtain the model parameters corresponding to each input feature set.
It will be appreciated that in the Catboost algorithm, the model parameters may include:
learning_rate: the learning rate, i.e. the step size of gradient descent;
subsample: the sampling rate of the training samples in each iteration;
colsample_bytree: the column (feature) subsampling rate for each tree;
max_depth: the maximum depth of a tree;
min_child_weight: the minimum sample weight of a leaf node;
reg_lambda: the L2 regularization coefficient;
reg_alpha: the L1 regularization coefficient;
n_estimators: the maximum number of trees;
scoring: the evaluation criterion;
n_splits: the number of cross-validation folds;
n_jobs: the number of parallel jobs;
n_iter: the number of iterations;
verbose: the verbosity of the log output;
random_state: the random seed.
It can be understood that, during actual training, an operator can set the value range of each model parameter according to experience, so that the optimal model parameter is determined from the value range in the training process.
It can also be understood that, while each input feature set is being trained, the number of iterations and the root mean square errors on the corresponding training set and test set can be calculated at any time. If the root mean square error on the training set is low while the root mean square error on the test set is high, the model trained on the current input feature set is over-fitted, and retraining can be performed.
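A minimal sketch of this check is given below. It treats a test-set error well above the training-set error as the sign of over-fitting; the ratio threshold `tolerance` and both function names are assumptions, since the text does not specify a numerical criterion.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean square error between actual and predicted loads."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


def looks_overfitted(train_rmse, test_rmse, tolerance=1.5):
    """Flag a model whose test error exceeds its training error by a chosen ratio."""
    return test_rmse > tolerance * train_rmse


# Hypothetical predictions: a close fit on training data, a poor fit on test data.
train_error = rmse([100, 101, 102, 103], [101, 100, 103, 102])
test_error = rmse([104, 105], [108, 101])
```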
Finally, after the combined features corresponding to all the combination rules have been trained to obtain the prediction model corresponding to each combination rule, i.e., to each input feature set, each prediction model can be evaluated by using the test set to determine the optimal prediction model, i.e., to determine the optimal combined features.
For example, the test set in each input feature set may be input into the corresponding prediction model to determine the root mean square error of each test set, which may then be considered together with the root mean square error of the training set and the iteration number recorded during training. Then, balancing operation time against accuracy, the combination with a lower root mean square error on the training set, a lower root mean square error on the test set, and a lower iteration number is selected as the target prediction model.
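One possible way to balance accuracy against operation time is sketched below, assuming that a lower root mean square error is better. The selection rule (shortlist the models whose test error is within a small slack of the best, then take the one with the fewest iterations), the slack value, and the candidate figures are all hypothetical.

```python
# Hypothetical evaluation results, one entry per combination rule.
candidates = [
    {"rule": "weather*temperature", "train_rmse": 4.1, "test_rmse": 5.0, "iterations": 800},
    {"rule": "weather*humidity", "train_rmse": 4.3, "test_rmse": 5.2, "iterations": 300},
    {"rule": "temperature*humidity", "train_rmse": 3.9, "test_rmse": 6.5, "iterations": 250},
]


def select_target_model(results, rmse_slack=0.05):
    """Pick the cheapest model whose test RMSE is within rmse_slack of the best."""
    best_rmse = min(r["test_rmse"] for r in results)
    shortlist = [r for r in results if r["test_rmse"] <= best_rmse * (1 + rmse_slack)]
    # Fewest iterations wins; ties are broken by the lower test RMSE.
    return min(shortlist, key=lambda r: (r["iterations"], r["test_rmse"]))


target = select_target_model(candidates)
```

With these figures the second candidate is chosen: its test error is close to the best one, and it needs far fewer iterations.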
It can be understood that the target prediction model is the prediction model to be trained in the embodiment of the present application, and the corresponding combined features of the prediction model are the optimal features.
Further, after the training is completed, the combined features corresponding to the target prediction model may be analyzed, for example, by using a visual method, so that the related personnel of the enterprise user can know the influence factors and the combination rules influencing the prediction result.
According to the prediction model training method provided by the embodiment of the application, historical power load characteristics of industrial enterprise users and influence factor characteristics corresponding to the power load characteristics are obtained, the obtained influence factor characteristics are combined according to preset combination rules to obtain combination characteristics corresponding to the power load characteristics, and the combination characteristics corresponding to the power load characteristic sets are trained by using a hyper-parameter optimization algorithm to construct an optimal prediction model, so that the characteristic processing and the characteristic numeralization are simple and easy in the training process, the dependence on characteristic engineering is weakened, the prediction precision reaches a higher level, and a reliable basis is provided for power spot market transaction.
Further, in the application, after the obtained historical data of the enterprise user is trained through the method to obtain the prediction model, the trained prediction model can be used for predicting the future short-term power utilization of the enterprise user.
It can be understood that, after the prediction model is trained, in the prediction stage the sequence value and the influence factors of the power load that were acquired as input data for model training can be input directly into the trained prediction model to output a prediction result, that is, the future short-term power load of the enterprise user. Alternatively, a historical power load sequence value may be additionally acquired as an input value for predicting future power utilization. In this case, when the output prediction result is not ideal, the prediction model can be retrained by using the additionally acquired power load sequence value and influence factors.
Fig. 4 is a schematic flowchart of a power load prediction method according to an embodiment of the present application, and as shown in fig. 4, the method includes:
S410, acquiring a sequence value of the historical power load of the prediction target and an influence factor corresponding to the sequence value.
S420, preprocessing the sequence value and the influence factor.
S430, inputting the preprocessed sequence value and the influence factor into a pre-trained prediction model, and outputting a prediction result of the prediction target.
Specifically, in the embodiment of the present application, when the enterprise user predicts the future medium-short term power load, the sequence value of the historical power load may be collected, such as the power load at predetermined times over the past day, two days, three days, or week, together with the influence factors corresponding to those power load values, such as the meteorological information and time information at the predetermined times. The collected sequence values and influence factors may then be preprocessed, for example by removing outliers, converting text information into numerical values, and/or normalizing the values in the influence factors. Finally, the preprocessed sequence value and influence factors of the power load can be input into a pre-trained prediction model, so that the prediction model outputs a predicted value, i.e., the power load of a future medium-short term, such as the power load of the next day or the next week.
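The three preprocessing operations mentioned above can be sketched as follows. The record layout, the weather codes, and the outlier rule (dropping values more than `k` standard deviations from the mean) are assumptions chosen for the example; the application does not specify them.

```python
from statistics import mean, pstdev

# Hypothetical mapping from weather text to numeric codes.
WEATHER_CODES = {"sunny": 0, "cloudy": 1, "rainy": 2}


def drop_outliers(records, key, k=2.0):
    """Remove records whose value lies more than k standard deviations from the mean."""
    values = [r[key] for r in records]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(records)
    return [r for r in records if abs(r[key] - mu) <= k * sigma]


def encode_text(records, key, mapping):
    """Replace a text field with its numeric code."""
    return [{**r, key: mapping[r[key]]} for r in records]


def min_max_normalize(records, key):
    """Scale a numeric field to the range [0, 1]."""
    values = [r[key] for r in records]
    low, span = min(values), max(values) - min(values)
    return [{**r, key: 0.0 if span == 0 else (r[key] - low) / span} for r in records]


# Hypothetical raw records; the last load value is an obvious outlier.
raw = [
    {"load": 100.0, "weather": "sunny", "temperature": 20.0},
    {"load": 102.0, "weather": "cloudy", "temperature": 25.0},
    {"load": 98.0, "weather": "rainy", "temperature": 15.0},
    {"load": 101.0, "weather": "sunny", "temperature": 30.0},
    {"load": 99.0, "weather": "cloudy", "temperature": 22.0},
    {"load": 1000.0, "weather": "sunny", "temperature": 21.0},
]

clean = drop_outliers(raw, "load", k=2.0)
clean = encode_text(clean, "weather", WEATHER_CODES)
clean = min_max_normalize(clean, "temperature")
```

Outliers are removed first so that the extreme record does not distort the minimum and maximum used for normalization.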
According to the power load prediction method provided by the embodiment of the application, the sequence value of the historical power load of the enterprise user and the influence factor corresponding to the sequence value are obtained, and the obtained sequence value and the obtained influence factor are input into the pre-trained prediction model, so that the prediction model outputs the power load of the enterprise user in the future time period, the accurate prediction of the future medium-short term power utilization is realized, and a reliable basis is provided for the power spot market transaction.
In another aspect, fig. 5 is a schematic structural diagram of a prediction model training apparatus provided in an embodiment of the present application. As shown in fig. 5, the apparatus 500 includes:
an obtaining module 510, configured to obtain a sequence value of a historical power load and a plurality of influence factors corresponding to the sequence value;
an extracting module 520, configured to extract a power load characteristic and an impact factor characteristic from the sequence value and the impact factor;
a training module 530, configured to train a prediction model based on a Catboost algorithm by using the power load characteristic and the impact factor characteristic, where the prediction model is used to predict a power load of a prediction target in a next time period.
Optionally, in the prediction model training apparatus provided in the embodiment of the present application, the training module specifically includes:
a combining unit 531, configured to combine the impact factor features corresponding to the power load features based on a preset combining rule, and generate a plurality of combined features corresponding to the power load features;
a determining unit 532, configured to train the power load characteristics and the corresponding multiple combined characteristics by using a hyper-parameter optimization algorithm, so as to obtain the prediction model.
Optionally, in the prediction model training apparatus provided in the embodiment of the present application, the combination unit is specifically configured to perform:
determining any two or more influence factor characteristics to be combined corresponding to the power load characteristics based on a preset combination rule;
and calculating the product of the numerical values of the two or more influence factor characteristics to be combined to generate a plurality of combined characteristics corresponding to the power load characteristics.
Optionally, in the prediction model training apparatus provided in the embodiment of the present application, the influence factors include meteorological factors and/or time factors.
In another aspect, fig. 6 is a schematic structural diagram of the power load prediction apparatus provided in an embodiment of the present application. As shown in fig. 6, the apparatus 600 includes:
an obtaining module 610, configured to obtain a sequence value of a historical power load of a prediction target and an influence factor corresponding to the sequence value;
and a prediction module 620, configured to input the sequence value and the influence factor into the prediction model trained as described above, so as to obtain the power load of the prediction target in the next time period.
Optionally, the prediction apparatus provided in the embodiment of the present application further includes:
a preprocessing module 630, configured to preprocess the sequence value and the impact factor.
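The module structure of the prediction apparatus can be pictured as a small composition of three callables. This is only a structural sketch: the class name `PredictionDevice`, the row layout, and the stand-in model (a simple mean of past loads rather than a trained Catboost model) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List

Row = List[float]  # assumed layout: [power load, influence factor]


@dataclass
class PredictionDevice:
    acquire: Callable[[], List[Row]]              # role of obtaining module 610
    preprocess: Callable[[List[Row]], List[Row]]  # role of preprocessing module 630
    model: Callable[[List[Row]], float]           # model used by prediction module 620

    def predict(self) -> float:
        """Acquire data, preprocess it, and feed it to the prediction model."""
        return self.model(self.preprocess(self.acquire()))


def acquire() -> List[Row]:
    # Hypothetical historical data; the last row has a missing load value.
    return [[100.0, 0.5], [110.0, 0.7], [float("nan"), 0.6]]


def preprocess(rows: List[Row]) -> List[Row]:
    # Drop rows containing NaN (NaN is the only value not equal to itself).
    return [row for row in rows if all(v == v for v in row)]


def mean_load_model(rows: List[Row]) -> float:
    # Stand-in for the trained prediction model.
    return sum(row[0] for row in rows) / len(rows)


device = PredictionDevice(acquire, preprocess, mean_load_model)
forecast = device.predict()
```

Keeping the three roles as separate callables mirrors the optional nature of the preprocessing module: an identity function can be passed when no preprocessing is needed.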
In another aspect, an embodiment of the present application provides a terminal device, which may be a terminal device of a viewer or a terminal device of a broadcaster. The terminal device includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the processor implements the prediction model training method or the power load prediction method as described above.
Referring to fig. 7, fig. 7 is a schematic structural diagram of a computer system 300 of a terminal device according to an embodiment of the present application.
As shown in fig. 7, the computer system 300 includes a Central Processing Unit (CPU) 301 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 302 or a program loaded from a storage section 308 into a Random Access Memory (RAM) 303. In the RAM 303, various programs and data necessary for the operation of the system 300 are also stored. The CPU 301, ROM 302, and RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to bus 304.
The following components are connected to the I/O interface 305: an input portion 306 including a keyboard, a mouse, and the like; an output portion 307 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 308 including a hard disk and the like; and a communication section 309 including a network interface card such as a LAN card, a modem, or the like. The communication section 309 performs communication processing via a network such as the internet. A drive 310 is also connected to the I/O interface 305 as needed. A removable medium 311 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 310 as necessary, so that a computer program read out therefrom is mounted into the storage section 308 as necessary.
In particular, according to embodiments of the present application, the processes described above with reference to flow diagrams 2-5 may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a machine-readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 309, and/or installed from the removable medium 311. The above-described functions defined in the system of the present application are executed when the computer program is executed by the Central Processing Unit (CPU) 301.
It should be noted that the computer readable medium shown in the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units or modules described in the embodiments of the present application may be implemented by software or hardware. The described units or modules may also be provided in a processor, and may be described as: a processor includes an acquisition module, an extraction module, and a training module. In some cases the names of these units or modules do not limit the units or modules themselves; for example, the training module may also be described as "a module for training a prediction model based on the Catboost algorithm by using the power load characteristics and the influence factor characteristics, the prediction model being used to predict the power load of the prediction target in the next time period".
As another aspect, the present application also provides a computer-readable storage medium, which may be included in the electronic device described in the above embodiments, or may exist separately without being incorporated into the electronic device. The computer-readable storage medium stores one or more programs which, when executed by one or more processors, perform the prediction model training method or the power load prediction method described in the present application.
In summary, according to the prediction model training method, the device, the equipment and the medium provided by the embodiment of the application, the sequence value and the corresponding influence factor of the historical power load of the enterprise user in a short period are obtained, the power load characteristic and the influence factor characteristic are extracted from the obtained sequence value and the corresponding influence factor, and then the obtained power load characteristic and the influence factor characteristic are trained to obtain the prediction model, so that the trained prediction model is used for accurately predicting the short-period power load of the enterprise user, and a reliable basis is provided for the power spot market transaction.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure herein is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (8)

1. A method of predictive model training, the method comprising:
acquiring a sequence value of historical power load and a plurality of influence factors corresponding to the sequence value;
deleting abnormal values in the sequence values of the historical power loads, converting text information in the influence factors into numerical information, and normalizing the numerical information in the influence factors;
extracting power load characteristics and influence factor characteristics from the sequence values and the influence factors;
training a prediction model based on a Catboost algorithm by using the power load characteristics and the influence factor characteristics, wherein the prediction model is used for predicting the power load of a prediction target in the next time period;
the training of the prediction model based on the Catboost algorithm by using the power load characteristics and the influence factor characteristics comprises the following steps:
based on preset combination rules, combining the influence factor characteristics corresponding to the power load characteristics to generate a plurality of combination characteristics corresponding to each combination rule, wherein one or more combination characteristics correspond to one power load characteristic;
and training the power load characteristics and the plurality of combined characteristics by using a hyper-parameter optimization algorithm to obtain the prediction model.
2. The predictive model training method according to claim 1, wherein the combining the influence factor features corresponding to the power load features based on a preset combination rule, and the generating a plurality of combination features corresponding to each combination rule includes:
determining any two or more influence factor characteristics to be combined corresponding to the power load characteristics based on a preset combination rule;
and calculating the product of the numerical values of the two or more influence factor characteristics to be combined, and generating a plurality of combination characteristics corresponding to each combination rule.
3. The predictive model training method of any one of claims 1-2, wherein the impact factors include meteorological factors and/or time factors.
4. A method of prediction, the method comprising:
acquiring a sequence value of a historical power load of a prediction target and an influence factor corresponding to the sequence value;
inputting the sequence value and the influence factor into a prediction model trained in any one of claims 1-3, resulting in a power load for the next time period of the prediction target.
5. A predictive model training apparatus, the apparatus comprising:
the acquisition module is used for acquiring a sequence value of a historical power load and a plurality of influence factors corresponding to the sequence value;
the preprocessing module is used for deleting abnormal values in the sequence values of the historical power loads, converting text information in the influence factors into numerical value information, and normalizing the numerical value information in the influence factors;
the extraction module is used for extracting power load characteristics and influence factor characteristics from the sequence value and the influence factor;
the training module is used for training a prediction model based on a Catboost algorithm by utilizing the power load characteristics and the influence factor characteristics, and the prediction model is used for predicting the power load of a prediction target in the next time period;
the training module comprises:
the combination unit is used for combining the influence factor characteristics corresponding to the power load characteristics based on preset combination rules to generate a plurality of combination characteristics corresponding to each combination rule, wherein one or more combination characteristics correspond to one power load characteristic;
and the determining unit is used for training the power load characteristics and the plurality of combined characteristics by utilizing a hyper-parameter optimization algorithm to obtain the prediction model.
6. A prediction apparatus, characterized in that the apparatus comprises:
an acquisition module, configured to acquire a sequence value of the historical power load of a prediction target and an influence factor corresponding to the sequence value;
a prediction module, configured to input the sequence value and the influence factor into a prediction model trained in any one of claims 1 to 3, to obtain the power load of the prediction target in the next time period.
7. A terminal device, characterized in that the terminal device comprises a memory, a processor and a computer program stored on the memory and executable on the processor, the processor being adapted to implement the method according to any of claims 1-3 or 4 when executing the program.
8. A computer-readable storage medium, having stored thereon a computer program for implementing the method of any one of claims 1-3 or claim 4.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911038724.4A CN110969285B (en) 2019-10-29 2019-10-29 Prediction model training method, prediction device, prediction equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911038724.4A CN110969285B (en) 2019-10-29 2019-10-29 Prediction model training method, prediction device, prediction equipment and medium

Publications (2)

Publication Number Publication Date
CN110969285A CN110969285A (en) 2020-04-07
CN110969285B true CN110969285B (en) 2023-04-07

Family

ID=70030029

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911038724.4A Active CN110969285B (en) 2019-10-29 2019-10-29 Prediction model training method, prediction device, prediction equipment and medium

Country Status (1)

Country Link
CN (1) CN110969285B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111932269B (en) * 2020-08-11 2023-08-18 中国工商银行股份有限公司 Equipment information processing method and device
CN112199862A (en) * 2020-10-29 2021-01-08 华中科技大学 Prediction method of nano particle migration, and influence factor analysis method and system thereof
CN112614006A (en) * 2020-11-30 2021-04-06 国网北京市电力公司 Load prediction method, device, computer readable storage medium and processor
CN113762578A (en) * 2020-12-28 2021-12-07 京东城市(北京)数字科技有限公司 Training method and device of flow prediction model and electronic equipment
CN112785056B (en) * 2021-01-22 2023-04-28 杭州市电力设计院有限公司 Short-term load prediction method based on fusion of Catboost and LSTM models
CN113393120A (en) * 2021-06-11 2021-09-14 国网北京市电力公司 Method and device for determining energy consumption data
CN113537576A (en) * 2021-06-25 2021-10-22 合肥工业大学 Method and system for predicting financial predicament of listed enterprises
CN113379153A (en) * 2021-06-28 2021-09-10 北京百度网讯科技有限公司 Method for predicting power load, prediction model training method and device
CN113947201A (en) * 2021-08-02 2022-01-18 国家电投集团电站运营技术(北京)有限公司 Training method and device for power decomposition curve prediction model and storage medium
CN115689063A (en) * 2022-12-30 2023-02-03 工业富联(杭州)数据科技有限公司 Gold film thickness prediction method and device, electronic device and storage medium
CN116126827B (en) * 2023-01-04 2023-08-04 三峡高科信息技术有限责任公司 Method for simultaneously realizing modeling and visual analysis of power production data
CN116227899A (en) * 2023-05-09 2023-06-06 深圳市明源云科技有限公司 Cell house energy consumption prediction method and device, electronic equipment and readable storage medium
CN117474713A (en) * 2023-09-30 2024-01-30 国网江苏省电力有限公司信息通信分公司 Power energy consumption prediction model optimization method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107590562A (en) * 2017-09-05 2018-01-16 西安交通大学 A kind of Short-Term Load Forecasting of Electric Power System based on changeable weight combination predicted method
CN109409614A (en) * 2018-11-16 2019-03-01 国网浙江瑞安市供电有限责任公司 A kind of Methods of electric load forecasting based on BR neural network
CN110009140A (en) * 2019-03-20 2019-07-12 华中科技大学 A kind of day Methods of electric load forecasting and prediction meanss
CN110266002A (en) * 2019-06-20 2019-09-20 北京百度网讯科技有限公司 Method and apparatus for predicting electric load

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190102693A1 (en) * 2017-09-29 2019-04-04 Facebook, Inc. Optimizing parameters for machine learning models

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
《基于机器学习的CatBoost模型在预测重症手足口病中的应用》;王斌 等;《中国感染控制杂志》;20190131;第18卷(第1期);正文第1-3节,图1,表1 *

Also Published As

Publication number Publication date
CN110969285A (en) 2020-04-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant