CN111667117A - Method for supplementing missing value by applying Bayesian estimation in power load prediction - Google Patents

Publication number
CN111667117A
Authority
CN
China
Prior art keywords
data
power load
theta
data set
bayesian estimation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010521260.9A
Other languages
Chinese (zh)
Inventor
周浩
胡炳谦
顾一峰
韩俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Ieslab Energy Technology Co ltd
Original Assignee
Shanghai Ieslab Energy Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-06-10
Filing date
2020-06-10
Publication date
2020-09-15
Application filed by Shanghai Ieslab Energy Technology Co ltd
Priority to CN202010521260.9A
Publication of CN111667117A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/067 Enterprise or organisation modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Health & Medical Sciences (AREA)
  • Educational Administration (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Supply And Distribution Of Alternating Current (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

In power load prediction, the historical load data of a given unit is usually the basic data for prediction calculation and simulation. In practice, however, a power load data set contains missing values for various reasons (for example, data loss caused by an emergency), and these missing values are usually left blank or marked with placeholders. When a power load prediction model is trained on a data set containing many missing values, those missing values can greatly degrade the performance of the machine learning model. The invention provides a method for filling missing values in historical load data for power load prediction by Bayesian estimation: a Bayesian estimation method is used to compute the most probable value, which is then used to fill each missing value in the historical power load data set.

Description

Method for supplementing missing value by applying Bayesian estimation in power load prediction
Technical Field
The invention relates to the technical field of power load prediction, and in particular to a method for filling missing values in historical load data by applying Bayesian estimation in power load prediction.
Background
Accurate prediction of the power load is an important basis for the safe and economic operation of a power system and for the scientific management and scheduling of a power grid, and it is also a core component of a power energy management system. In power load prediction, the historical load data of a given unit is usually the basic data for prediction calculation and simulation. In practice, however, a power load data set may contain missing values for various reasons (e.g., data loss caused by an emergency), and these missing values are usually left blank or marked with placeholders. When a power load prediction model is trained on a data set containing many missing values, those missing values can greatly degrade the performance of the machine learning model. Some algorithms used in power load prediction assume that every value is numerical and meaningful. One way to deal with the problem is to delete the records that contain missing values, but this risks losing valuable information. A preferable strategy is to impute the missing values, i.e., to infer each missing value from the observed data. The invention discloses a method that applies Bayesian estimation to the missing values of historical load data in power load prediction in order to fill in the missing data, so that the power load prediction model can run completely and effectively.
Disclosure of Invention
The invention provides a method for filling missing values in historical load data for power load prediction by Bayesian estimation: a Bayesian estimation method is used to compute the most probable value, which is then used to fill each missing value in the historical power load data set.
Bayesian estimation is a statistical method for determining the parameters of a model. It treats each parameter of the data set as obeying a probability distribution, and treats the existing data as having been generated under that parameter distribution. Intuitively, a parameter theta is assumed and then solved for from the data: the prior probability P(theta) of theta must be set by hand, and a specific theta is then obtained with the MAP (maximum a posteriori) method. When the data are few or sparse, Bayesian estimation improves accuracy by taking the prior into account, and the estimated parameters better reflect the actual situation.
In the invention, Bayesian estimation is applied to fit the missing data in power load prediction to the distribution of the whole data set, find the most probable value, and fill the null values with it. This ensures the integrity of the data and, in turn, the performance of the power load prediction model.
The original data set (before filling) and the filled data set are subjected to a one-way analysis of variance (one-way ANOVA) to test for a significant difference between the two sets of data; there must be no significant difference between them. If a significant difference is found, either the choice of the constraining parameters in the Bayesian estimation model must be adjusted, or the missing values must instead be removed, so that the filled data do not differ significantly from the original data and the whole data set remains valid. After Bayesian estimation, the actually collected historical power load data, or a data set that has already undergone outlier removal and denoising, contains no missing values, which improves the validity of the whole data set.
The filled data set is then used by the power load prediction model, which greatly improves the reliability and accuracy of power load prediction. The implementation flow of the method is shown in Fig. 1.
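The role of the prior can be made concrete with a short worked case. The block below is only an illustrative sketch: it states the MAP estimate in general form and then specializes it to a Gaussian model. The symbols x_i, sigma^2, mu_0, tau_0^2, mu_n and tau_n^2 are assumptions introduced here for illustration (a Gaussian likelihood with known variance and a Gaussian prior on theta); the text does not spell the model out in this form.

```latex
% MAP estimate: the theta that maximizes the posterior
\hat{\theta}_{\mathrm{MAP}}
  = \arg\max_{\theta} P(\theta \mid D)
  = \arg\max_{\theta} P(D \mid \theta)\, P(\theta)

% Illustrative Gaussian case (assumed model):
%   x_i \sim \mathcal{N}(\theta, \sigma^2), \qquad \theta \sim \mathcal{N}(\mu_0, \tau_0^2)
% The posterior is again Gaussian, \mathcal{N}(\mu_n, \tau_n^2), with
\frac{1}{\tau_n^2} = \frac{1}{\tau_0^2} + \frac{n}{\sigma^2}, \qquad
\mu_n = \tau_n^2 \left( \frac{\mu_0}{\tau_0^2} + \frac{\sum_{i=1}^{n} x_i}{\sigma^2} \right), \qquad
\hat{\theta}_{\mathrm{MAP}} = \mu_n
```

With few observations (small n) the prior term dominates and the estimate stays close to mu_0; as n grows, mu_n approaches the sample mean. This is the sense in which the prior helps when data are scarce.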
Drawings
Fig. 1 is a schematic diagram of the processing flow for supplementing missing values using Bayesian estimation according to an embodiment of the present invention.
Detailed Description
To make the content, purpose, features and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The embodiments described below are only some of the embodiments of the present invention, not all of them. All other embodiments that a person skilled in the art can derive from the embodiments given here without creative effort fall within the scope of protection of the present invention.
Step one, data preprocessing: arrange the collected original data in time order, determine the start and end time of the data set, check the time series for gaps, mark the missing values, and record the start and end time of each gap.
Step two, filling missing values by Bayesian estimation: the historical power load data preprocessed in step one is timestamped, and Bayesian estimation is then performed to fill the time periods that have no corresponding data, restoring the continuity of the power load data along the time series. The calculation method adopted is as follows:
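The preprocessing in step one can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `find_gaps`, the `(timestamp, value)` record layout, the use of `None` for a blank reading, and the fixed sampling interval `step` are all assumptions made here. Timestamps are assumed to lie on a regular grid.

```python
from datetime import datetime, timedelta


def find_gaps(records, step):
    """Sort (timestamp, load) records and list the gaps in the time series.

    records: iterable of (datetime, float or None); None marks a blank reading.
    step:    expected sampling interval (assumed regular in this sketch).
    Returns (series, gaps). series covers every grid point from the first to
    the last timestamp as (timestamp, value or None); gaps is a list of
    (gap_start, gap_end) timestamps, both inclusive.
    """
    observed = dict(sorted(records, key=lambda r: r[0]))
    start, stop = min(observed), max(observed)

    series, gaps, gap_start = [], [], None
    t = start
    while t <= stop:
        value = observed.get(t)  # None when the reading is absent or blank
        series.append((t, value))
        if value is None:
            if gap_start is None:
                gap_start = t  # a new gap opens here
        elif gap_start is not None:
            gaps.append((gap_start, t - step))  # gap closed at previous point
            gap_start = None
        t += step
    if gap_start is not None:  # the series ends inside a gap
        gaps.append((gap_start, stop))
    return series, gaps


# Hypothetical hourly readings, given out of order, with one blank and one absent hour.
t0 = datetime(2020, 6, 1, 0, 0)
hour = timedelta(hours=1)
raw = [(t0 + 3 * hour, 410.0), (t0, 400.0), (t0 + hour, None), (t0 + 4 * hour, 415.0)]
series, gaps = find_gaps(raw, hour)
```

Here the blank reading at 01:00 and the absent reading at 02:00 are reported as a single gap, whose start and end times can then be recorded alongside the data set.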
1. Determine the prior distribution function P(theta) of the unknown parameter theta from the distribution form of the data set;
2. From the whole data set D = {x_1, x_2, …, x_n}, obtain the joint distribution function P(D | theta) of the samples, which is a function of theta;
3. Obtain the posterior distribution of theta using Bayes' formula:
$$P(\theta \mid D) = \frac{P(D \mid \theta)\,P(\theta)}{\int P(D \mid \theta)\,P(\theta)\,d\theta}$$
4. Obtain the Bayesian estimate:
$$\hat{\theta} = \int \theta\,P(\theta \mid D)\,d\theta$$
where $\hat{\theta}$ is the most probable value sought by the calculation and is used to fill the missing value. The prior distribution function P(theta) and the joint distribution function P(D | theta) of the samples in this calculation are obtained by fitting a Gaussian distribution to the data set; the precondition is that the data set as a whole follows a normal (Gaussian) distribution.
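The four calculation steps can be illustrated for the Gaussian case with a small sketch. It rests on assumptions that go beyond the text: theta is taken to be the mean load, the observation variance is estimated from the observed values and then treated as known, and the prior parameters `prior_mean` and `prior_var` are hypothetical inputs supplied by the caller (for example, from a Gaussian fitted to a longer history). Under these assumptions the posterior is Gaussian, so its mean and its mode (the MAP value) coincide.

```python
from statistics import pvariance


def gaussian_posterior(observed, prior_mean, prior_var):
    """Posterior mean and variance of theta.

    Assumed model: x_i ~ N(theta, sigma^2), theta ~ N(prior_mean, prior_var).
    sigma^2 is estimated from the observed values and then treated as known.
    """
    n = len(observed)
    sigma2 = pvariance(observed)
    if sigma2 == 0.0:  # all observations identical: no spread to weigh
        return observed[0], 0.0
    precision = 1.0 / prior_var + n / sigma2
    mean = (prior_mean / prior_var + sum(observed) / sigma2) / precision
    return mean, 1.0 / precision


def fill_missing(values, prior_mean, prior_var):
    """Replace every None in values with the posterior mean of theta."""
    observed = [v for v in values if v is not None]
    theta_hat, _ = gaussian_posterior(observed, prior_mean, prior_var)
    return [theta_hat if v is None else v for v in values]


# Hypothetical load readings with two gaps, and a hypothetical prior.
loads = [400.0, None, 410.0, 420.0, None]
filled = fill_missing(loads, prior_mean=405.0, prior_var=100.0)
```

For these inputs the fill value lies between the prior mean (405) and the sample mean (410), pulled toward the sample mean because three observations outweigh the fairly broad prior. Every gap receives the same value in this sketch, since the estimate is fitted to the overall distribution rather than to the position of the gap in time.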
Step three, data validity verification: the original power load data set and the data set after filling must be tested for statistical difference to ensure the validity of the data. A one-way analysis of variance (one-way ANOVA) is performed on the two sets of data to test for a significant difference between them; there must be no significant difference between the two sets. If a significant difference is found, the choice of parameters in step two must be adjusted, and the number of removed outliers and the degree of denoising must be reduced, so that the processed data show no significant difference from the original data and remain valid.
The invention provides a method that applies Bayesian estimation to fill missing values, whatever their cause, in the historical data used for power load prediction. Bayesian estimation is introduced into the preprocessing of the power load prediction data, and the fitted value with the highest probability is selected to fill each missing value, so that the historical data used for power load prediction is more accurate and the prediction is markedly improved. With this method the data set of the power load prediction model is more complete, so the prediction model trains better and prediction accuracy is greatly improved.
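The check in step three can be sketched with a hand-written one-way ANOVA. This is an illustration under stated assumptions, not the patent's procedure: the function name `one_way_anova_f` is hypothetical, the two groups are the observed values and the filled series, and the decision compares the F statistic with a critical value that the caller looks up for the chosen significance level.

```python
from statistics import fmean


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over the groups.

    Assumes at least two groups and some within-group variation.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean([v for g in groups for v in g])
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the values around their own group mean.
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical data: observed values versus the series after filling two gaps.
original = [400.0, 410.0, 420.0]
filled = [400.0, 409.1, 410.0, 420.0, 409.1]
f_stat, df_b, df_w = one_way_anova_f(original, filled)

# 5.99 is the tabulated 5% critical value of F(1, 6); it must be looked up
# again for other group sizes or significance levels.
no_significant_difference = f_stat < 5.99
```

When SciPy is available, `scipy.stats.f_oneway` returns the same F statistic together with a p-value, which avoids the table lookup.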
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (1)

1. A method for supplementing missing values by applying Bayesian estimation in power load prediction, characterized by comprising the following steps: step one, data preprocessing: arranging the collected original data in time order, determining the start and end time of the data set, checking the time series for gaps, marking the missing values, and recording the start and end time of each gap;
step two, filling missing values by Bayesian estimation: timestamping the historical power load data preprocessed in step one, and then performing Bayesian estimation to fill the time periods that have no corresponding data, restoring the continuity of the power load data along the time series;
the calculation method adopted is as follows:
1) determining the prior distribution function P(theta) of the unknown parameter theta from the distribution form of the data set;
2) from the whole data set D = {x_1, x_2, …, x_n}, obtaining the joint distribution function P(D | theta) of the samples, which is a function of theta;
3) obtaining the posterior distribution of theta using Bayes' formula:
$$P(\theta \mid D) = \frac{P(D \mid \theta)\,P(\theta)}{\int P(D \mid \theta)\,P(\theta)\,d\theta}$$
4) obtaining the Bayesian estimate:
$$\hat{\theta} = \int \theta\,P(\theta \mid D)\,d\theta$$
where $\hat{\theta}$ is the most probable value sought by the calculation and is used to fill the missing value,
the prior distribution function P(theta) and the joint distribution function P(D | theta) of the samples in the calculation method being obtained by fitting a Gaussian distribution to the data set, the precondition being that the data set as a whole follows a normal (Gaussian) distribution;
step three, data validity verification: the original power load data set and the data set after filling are tested for statistical difference to ensure the validity of the data; a one-way analysis of variance (one-way ANOVA) is performed on the two sets of data to test for a significant difference between them, and there must be no significant difference between the two sets,
if a significant difference is found, the choice of parameters in step two is adjusted, and the number of removed outliers and the degree of denoising are reduced, so that the processed data show no significant difference from the original data and remain valid.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010521260.9A CN111667117A (en) 2020-06-10 2020-06-10 Method for supplementing missing value by applying Bayesian estimation in power load prediction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010521260.9A CN111667117A (en) 2020-06-10 2020-06-10 Method for supplementing missing value by applying Bayesian estimation in power load prediction

Publications (1)

Publication Number Publication Date
CN111667117A (en) 2020-09-15

Family

ID=72386187

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010521260.9A Pending CN111667117A (en) 2020-06-10 2020-06-10 Method for supplementing missing value by applying Bayesian estimation in power load prediction

Country Status (1)

Country Link
CN (1) CN111667117A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117785855A (en) * 2023-12-25 2024-03-29 杭州字节方舟科技有限公司 Block chain-based wind control early warning, device, equipment and storage medium
CN117932474A (en) * 2024-03-22 2024-04-26 山东核电有限公司 Training method, device, equipment and storage medium of communication missing data determination model

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101964998A (en) * 2009-07-24 2011-02-02 北京亿阳信通软件研究院有限公司 Forecasting method and device of telephone traffic in ordinary holiday of telecommunication network
CN104008433A (en) * 2014-06-03 2014-08-27 国家电网公司 Method for predicting medium-and-long-term power loads on basis of Bayes dynamic model
CN107577649A (en) * 2017-09-26 2018-01-12 广州供电局有限公司 The interpolation processing method and device of missing data
CN108320063A (en) * 2018-03-26 2018-07-24 上海积成能源科技有限公司 To the method for rejecting abnormal data and denoising in a kind of load forecast
US20200082283A1 (en) * 2018-09-12 2020-03-12 Samsung Sds Co., Ltd. Method and apparatus for correcting missing value in data

Similar Documents

Publication Publication Date Title
CN113965359B (en) Federal learning data poisoning attack-oriented defense method and device
CN110942154A (en) Data processing method, device, equipment and storage medium based on federal learning
CN111667117A (en) Method for supplementing missing value by applying Bayesian estimation in power load prediction
CN108062573A (en) Model training method and device
CN111860980A (en) Method for interpolating and supplementing missing value by applying classification regression tree in power load prediction
CN112558875B (en) Data verification method and device, electronic equipment and storage medium
US20230155378A1 (en) Reliability calculation method of power distribution system considering hierarchical decentralized control of demand-side resources
CN111784173B (en) AB experiment data processing method, device, server and medium
CN114911788B (en) Data interpolation method and device and storage medium
CN117032954B (en) Memory optimization method, system, equipment and medium for terminal training model
CN111667123A (en) Method for supplementing missing value by applying multiple interpolation in power load prediction
CN109784484A (en) Neural network accelerated method, device, neural network accelerate chip and storage medium
CN107492303A (en) Drawing method and system for equivalent salt deposit density distribution map of power transmission line in coastal region
CN106294115A (en) The method of testing of a kind of application system animal migration and device
CN107945034A (en) Financial analysis method, application server and computer-readable recording medium based on microblogging finance and economics event
CN112488843A (en) Enterprise risk early warning method, device, equipment and medium based on social network
CN108804640B (en) Data grouping method, device, storage medium and equipment based on maximized IV
CN111966676A (en) Method for supplementing missing value by applying Bayesian estimation in residential electricity consumption data mining
CN115713395A (en) Flink-based user wind control management method, device and equipment
CN111768045A (en) Method for supplementing resident electricity consumption missing data by applying multiple interpolation in resident electricity consumption management
CN111428886A (en) Fault diagnosis deep learning model self-adaptive updating method and device
CN114679466B (en) Consensus processing method, device, computer equipment and medium for block chain network
CN103106103A (en) Requesting information classification method and device
CN108805778A (en) Electronic device, the method and storage medium for acquiring collage-credit data
CN111641704B (en) Resource-related data transmission method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200915