CN111667123A - Method for supplementing missing value by applying multiple interpolation in power load prediction - Google Patents

Info

Publication number
CN111667123A
Authority
CN
China
Prior art keywords
data
power load
missing
data set
historical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010555226.3A
Other languages
Chinese (zh)
Inventor
周浩
顾一峰
胡炳谦
韩俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Ieslab Energy Technology Co ltd
Original Assignee
Shanghai Ieslab Energy Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-06-17
Filing date
2020-06-17
Publication date
2020-09-15
Application filed by Shanghai Ieslab Energy Technology Co ltd
Priority to CN202010555226.3A
Publication of CN111667123A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Abstract

Accurate and valid historical load data are essential to a power load prediction model and play an important role in prediction data analysis and model calculation. When abnormal values are fed into the prediction model or into a mathematical analysis, prediction accuracy and simulation quality drop sharply, so abnormal values must be identified and eliminated by statistical methods. In addition, various uncontrollable events during power system operation can produce many missing values, leaving the historical power load data set incomplete. Supplementing the missing values by a reasonable and effective method yields a completed data set, which is an important prerequisite for accurate prediction. The invention provides a method that applies Multiple Interpolation (MICE) to the missing values in historical power load data and fills them in to ensure the integrity of the historical power load data.

Description

Method for supplementing missing value by applying multiple interpolation in power load prediction
Technical Field
The invention relates to the technical field of power load prediction, in particular to a method for supplementing missing data by applying a Multiple Interpolation (MICE) method to missing values in historical load data in power load prediction.
Background
Accurate prediction of the power load is an important basis for the safe and economical operation of a power system and for scientific management and scheduling of the power grid, and it is a core component of a power energy management system. Some power load prediction algorithms assume that all values are numerical and complete. In practice, however, data go missing during grid operation for various reasons, such as the removal of abnormal values or gaps in the time series caused by operating accidents. One way to handle the problem is to delete every record that contains a missing value, but this risks losing valuable information. A better strategy is to fill in each missing value by interpolation, that is, to estimate it from the observed data. This preserves as much of the data set as possible and gives the subsequent prediction model more accurate input, and therefore a more accurate predicted output. The invention discloses a method that applies Multiple Interpolation (MICE) to the missing values of historical load data in power load prediction, so that the historical load data set is complete and valid, which in turn supports the validity and accuracy of the power load prediction model.
Disclosure of Invention
The invention provides a method for filling in missing values of power load data, including values removed as abnormal. The method applies Multiple Interpolation (MICE) and comprises three functional modules: missing value identification, MICE interpolation, and verification of the filled values.
Multiple Interpolation (MICE, multivariate imputation by chained equations) handles missing values through repeated simulation. Faced with a complex missing value problem, it generates a group of complete data sets (typically 3 to 10) from one data set that contains missing values. In each simulated data set the missing data are filled in by a Monte Carlo method.
The implementation of multiple interpolation is shown in Fig. 1. The function mice() starts from a data frame containing missing data and returns an object containing several (by default, 5) complete data sets. Each complete data set is generated by interpolating the missing data in the original data frame, and the data sets differ slightly because the interpolation has a random component. The with() function then applies a statistical model (e.g., a linear fitting model LR() or a generalized linear model (GLM)) to each complete data set in turn. Finally, the pool() function integrates these individual analysis results into one set of results. The standard error and the p-value of the final model reflect the uncertainty due to both the missing values and the multiple interpolations.
The with function generally contains several regression models, one per interpolated data set; a t-test on each data set determines whether the data set obtained from that linear model is acceptable. The pool function summarizes the regression models and performs an F-test on the whole data set to determine whether the method as a whole is acceptable. Data that pass can be output as the filled data set. The thresholds for the t-test and the F-test are set by the data quality control requirements.
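How pooling lets the final standard error reflect the uncertainty of the filled values can be written out explicitly. The formulas below are Rubin's rules, which is what pool() applies if the functions named above are those of the R mice package. The source does not state these formulas; the symbols (m complete data sets, an estimate \hat{Q}_i with variance \hat{U}_i from the i-th data set) are notation assumed here for illustration.

```latex
\bar{Q} = \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i, \qquad
\bar{U} = \frac{1}{m}\sum_{i=1}^{m}\hat{U}_i, \qquad
B = \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^2, \qquad
T = \bar{U} + \Bigl(1+\frac{1}{m}\Bigr)B
```

Here \bar{Q} is the pooled estimate, \bar{U} the average within-data-set variance, B the variance between data sets, and \sqrt{T} the pooled standard error. The term in B is what grows when the filled values disagree across the m data sets.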
A one-way analysis of variance (one-way ANOVA) is performed on the original data set, before filling, and on the data set after filling. The significance of the difference between the two groups of data is calculated to confirm that the two groups do not differ significantly. If they do, either the number of data sets used by the with function is adjusted or the missing values are removed after all, so that the filled data do not differ significantly from the original data and the data set as a whole remains valid.
Processing the actually collected historical power load data with these modules completes the data set and improves the validity of the original data. Using the filled historical data in a power load prediction model greatly improves the reliability and accuracy of the prediction.
Drawings
Fig. 1 is a schematic diagram of a multiple interpolation model according to an embodiment of the present invention.
Fig. 2 is a schematic processing flow diagram of a method for supplementing missing values of historical load data according to an embodiment of the present invention.
Detailed Description
To make the content, purpose, features and advantages of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The embodiments described are only some, not all, of the embodiments of the present invention. All other embodiments that a person skilled in the art can derive from them without creative effort fall within the scope of protection of the present invention.
As shown in Fig. 2, the method proposed by the present invention for supplementing missing power load values by applying multiple interpolation (MICE) comprises the following steps.
Step one, data preprocessing: arrange the collected original historical power load data in time order, determine the start and stop time of the data set, check the time series for missing data, mark each missing value, and record the start and stop time of each gap.
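The preprocessing step can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes readings arrive as (timestamp, load) pairs on a regular sampling grid, and the function name mark_missing, the hourly step and the sample values are all hypothetical.

```python
from datetime import datetime, timedelta


def mark_missing(records, step):
    """Order (timestamp, load) records in time and flag the gaps.

    Returns the series on a regular grid (None marks a missing load)
    and a list of (gap_start, gap_stop) timestamp pairs.
    """
    ordered = sorted(records, key=lambda r: r[0])
    start, stop = ordered[0][0], ordered[-1][0]   # start/stop time of the data set
    observed = {t: v for t, v in ordered if v is not None}

    series, gaps, gap_start = [], [], None
    t = start
    while t <= stop:
        value = observed.get(t)                   # None if absent or recorded as empty
        series.append((t, value))
        if value is None and gap_start is None:
            gap_start = t                         # a gap opens here
        elif value is not None and gap_start is not None:
            gaps.append((gap_start, t - step))    # the gap closed one step earlier
            gap_start = None
        t += step
    if gap_start is not None:                     # gap running to the end of the data
        gaps.append((gap_start, stop))
    return series, gaps


# Hypothetical hourly readings, deliberately out of order and incomplete.
t0 = datetime(2020, 1, 1, 0, 0)
hour = timedelta(hours=1)
raw = [(t0 + 3 * hour, 410.0), (t0, 400.0),
       (t0 + 4 * hour, None), (t0 + 5 * hour, 420.0)]
series, gaps = mark_missing(raw, hour)
for gap_start, gap_stop in gaps:
    print("missing from", gap_start, "to", gap_stop)
```

A timestamp that is absent from the input and a timestamp recorded with an empty value are treated the same way, since both leave a hole in the time series.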
Step two, supplementing data with the multiple interpolation algorithm: the algorithm fills in the marked data in stages and generally comprises the following parts:
the mice function starts from a data set containing missing data and returns an object containing several (by default, 5) complete data sets. Each complete data set is generated by interpolating the missing data in the original data frame, and the data sets differ slightly because the interpolation has a random component;
the with function applies a statistical model (e.g., a linear model or a generalized linear model) to each complete data set in turn;
the pool function integrates these individual analysis results into one set of results. The standard error and the p-value of the final model reflect the uncertainty due to both the missing values and the multiple interpolations.
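The three parts above can be sketched in plain Python. This is a simplified stand-in under stated assumptions, not the chained-equations algorithm of a MICE library: only one column (load) is incomplete, a fully observed predictor (a hypothetical temperature column) is available, and each missing load is drawn from a linear regression prediction plus random noise. All function names and sample values are illustrative.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, residual_sd, var_of_b)."""
    n = len(xs)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    sigma2 = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    return a, b, sigma2 ** 0.5, sigma2 / sxx


def impute_once(xs, ys, rng):
    """Fill each missing y with a regression prediction plus random noise."""
    obs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b, sd, _ = fit_line([x for x, _ in obs], [y for _, y in obs])
    return [y if y is not None else a + b * x + rng.gauss(0.0, sd)
            for x, y in zip(xs, ys)]


def multiple_imputation(xs, ys, m=5, seed=0):
    """Role of mice: return m complete, slightly different data sets."""
    rng = random.Random(seed)
    return [impute_once(xs, ys, rng) for _ in range(m)]


def pool(estimates, variances):
    """Role of pool: combine per-data-set results with Rubin's rules."""
    m = len(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates)      # uses the m - 1 denominator
    return statistics.fmean(estimates), within + (1 + 1 / m) * between


# Hypothetical data: temperature is complete, load has two missing values.
temps = [10, 12, 14, 16, 18, 20, 22, 24]
loads = [300.0, 318.0, None, 365.0, 379.0, None, 421.0, 440.0]

completed = multiple_imputation(temps, loads, m=5, seed=42)
fits = [fit_line(temps, ys) for ys in completed]  # role of with: one model per data set
slope, total_var = pool([f[1] for f in fits], [f[3] for f in fits])
print("pooled slope:", slope, "pooled standard error:", total_var ** 0.5)
```

Because the noise draw differs between the five data sets, the pooled variance includes a between-data-set term, which is how the uncertainty of the filled values reaches the final standard error.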
Step three, data validity verification: the original historical power load data set and the data set supplemented by the multiple interpolation algorithm are tested for a statistical difference to ensure that the data are valid. A one-way analysis of variance (one-way ANOVA) is performed on the two groups of data to calculate the significance of the difference between them, and the two groups must not differ significantly. If they do, the number of data sets used by the with function is adjusted, the filling procedure is revised, or the missing values are removed after all, so that the processed data do not differ significantly from the original data and remain accurate and valid.
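A minimal sketch of the verification step, assuming the two groups are the observed values of the original data set and all values of the filled data set. The function name, the sample values and the threshold F_CRITICAL are hypothetical; the threshold shown is roughly the 5% critical value for (1, 12) degrees of freedom and would in practice be set by the data quality control requirements.

```python
import statistics


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = statistics.fmean([v for g in groups for v in g])
    means = [statistics.fmean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mu - grand_mean) ** 2
                     for g, mu in zip(groups, means))
    # Variation of the values around their own group mean.
    ss_within = sum((v - mu) ** 2 for g, mu in zip(groups, means) for v in g)

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical values: observed loads only, and the same series after filling.
original = [300.0, 318.0, 365.0, 379.0, 421.0, 440.0]
filled = [300.0, 318.0, 342.0, 365.0, 379.0, 401.0, 421.0, 440.0]

F_CRITICAL = 4.75  # assumed threshold, about the 5% point of F(1, 12)
f_stat, df_b, df_w = one_way_anova_f(original, filled)
no_significant_difference = f_stat < F_CRITICAL
print(f_stat, df_b, df_w, no_significant_difference)
```

If the check fails, the filling step is repeated with adjusted settings before the data are passed to the prediction model.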
The invention provides a method that uses a multiple interpolation algorithm model to fill in values that are missing, for various reasons, from the historical data used for power load prediction. It is characterized in that the multiple interpolation algorithm is introduced into the processing of power load prediction data to supplement the missing values, and that the number of linear fitting/linear programming models in the with function is adjusted according to the validity comparison of the data sets before and after filling. The historical load data used for power load prediction thereby become more complete, and the prediction performance of the power load model improves significantly.
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (1)

1. A method for supplementing missing values by applying multiple interpolation in power load prediction, characterized in that the method comprises the following steps:
step one, data preprocessing: arranging the collected original historical power load data in time order, determining the start and stop time of the data set, checking the time series for missing data, marking each missing value and recording the start and stop time of each gap;
step two, supplementing data with the multiple interpolation algorithm: the algorithm fills in the marked data in stages and generally comprises the following parts:
1) the mice function starts from a data set containing missing data and returns an object containing several (by default, 5) complete data sets;
each complete data set is generated by interpolating missing data in the original data frame;
each complete data set is slightly different due to the random components interpolated;
2) the with function may apply a statistical model (e.g., a linear model or a generalized linear model) to each complete data set in turn;
3) the pool function integrates these individual analysis results into a set of results;
the standard error and the p value of the final model accurately reflect the uncertainty generated by the missing value and multiple interpolation;
step three, data validity verification: the original historical power load data set and the data set supplemented by the multiple interpolation algorithm are tested for a statistical difference to ensure data validity; a one-way analysis of variance (one-way ANOVA) is performed on the two groups of data to calculate the significance of the difference between them, and the two groups must not differ significantly; if they do, the number of data sets used by the with function is adjusted, the filling procedure is revised, or the missing values are removed after all, so that the processed data do not differ significantly from the original data and remain accurate and valid.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010555226.3A CN111667123A (en) 2020-06-17 2020-06-17 Method for supplementing missing value by applying multiple interpolation in power load prediction

Publications (1)

Publication Number Publication Date
CN111667123A true CN111667123A (en) 2020-09-15

Family

ID=72388471

Country Status (1)

Country Link
CN (1) CN111667123A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination