CN115081840A - Daily electric quantity abnormal value detection and correction system based on first-order difference and ARIMA method - Google Patents

Publication number
CN115081840A
Authority
CN
China
Prior art keywords
abnormal
point
value
data
order difference
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210650951.8A
Other languages
Chinese (zh)
Other versions
CN115081840B (en)
Inventor
涂钊颖
文明
唐敬军
肖振锋
唐军
廖菁
李文英
冯晨爽
曾艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingshi Wanfang Information Technology Co ltd
State Grid Corp of China SGCC
State Grid Hunan Electric Power Co Ltd
Economic and Technological Research Institute of State Grid Hunan Electric Power Co Ltd
Original Assignee
Beijing Jingshi Wanfang Information Technology Co ltd
State Grid Corp of China SGCC
State Grid Hunan Electric Power Co Ltd
Economic and Technological Research Institute of State Grid Hunan Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingshi Wanfang Information Technology Co ltd, State Grid Corp of China SGCC, State Grid Hunan Electric Power Co Ltd, Economic and Technological Research Institute of State Grid Hunan Electric Power Co Ltd
Priority to CN202210650951.8A
Publication of CN115081840A
Application granted
Publication of CN115081840B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J13/00 Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuit-breaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network
    • H02J13/00002 Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuit-breaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network, characterised by monitoring
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J13/00 Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuit-breaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network
    • H02J13/00006 Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuit-breaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network, characterised by information or instructions transport means between the monitoring, controlling or managing units and monitored, controlled or operated power network element or electrical equipment
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/80 Management or planning
    • Y02P90/82 Energy audits or management systems therefor

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Educational Administration (AREA)
  • Power Engineering (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Testing And Monitoring For Control Systems (AREA)
  • Locating Faults (AREA)

Abstract

The invention relates to a system for detecting and correcting abnormal daily electricity values based on first-order differencing and the ARIMA method, comprising: sorting the historical raw daily electricity data in chronological order; for the sorted data, filling each missing value with the mean of the 5 non-empty values before and after it as a preliminary fitted value, yielding a new time series; detecting abnormal values separately with a first-order difference model and an ARIMA model, and identifying the abnormal points of the whole series by combining the two judgments; and processing the abnormal points to obtain the final complete time series data set. By combining first-order differencing with the ARIMA model to detect and correct abnormal values, the invention can greatly improve the accuracy of abnormal value correction and thereby better support the power company's internal analysis of power supply and demand and its fine-grained operations.

Description

Daily electric quantity abnormal value detection and correction system based on first-order difference and ARIMA method
Technical Field
The invention relates to a system for detecting and correcting abnormal daily electricity values based on a first-order difference method and an ARIMA method, and belongs to the field of power consumption data analysis in the power industry.
Background
An abnormal value is an individual value, among a series of repeated measurements of some quantity, that deviates far from the other values and does not follow statistical regularities, owing to unexpected changes in conditions, environmental disturbances, human reading or recording errors, and the like. This phenomenon is particularly common in the electric power field: equipment faults, transmission errors, statistical deviations and similar factors cause the power data at a given time point to deviate from the actual situation, so that the power company cannot accurately grasp how electricity demand is developing, and the economic analysis and assessment work of the relevant government departments is also disturbed. Existing techniques for removing abnormal values mainly include EEMD-neural-network methods and cloud computing. However, daily electricity consumption has pronounced seasonal characteristics and the consumption patterns of different industries differ, so anomalies cannot be identified by a single criterion or by a full-time-domain criterion alone. These methods are therefore of limited reusability in the power industry.
Given these factors, accurately detecting the abnormal points in specific time series data and correcting the abnormal values is of great significance to the daily work of power-related departments. A system for detecting and correcting abnormal daily electricity values based on first-order differencing and the ARIMA method is therefore needed, one that automatically uses the characteristics of the data adjacent to a given time point, together with full-time-domain characteristics and industry attributes, to detect and correct abnormal values comprehensively. Combining first-order differencing with the ARIMA model can greatly improve the accuracy of abnormal value correction, and thereby better support the power company's internal analysis of power supply and demand and its fine-grained operations.
Disclosure of Invention
The invention aims to provide a daily electricity abnormal value detection and correction system based on first-order differencing and the ARIMA method. Working from time-series daily electricity data of various statistical calibers, it locates the abnormal points and establishes an abnormal value removal mechanism, so that the electricity data are thoroughly cleaned, truly reflect their underlying meaning, and allow social and economic development trends to be better judged from the perspective of electricity data.
A system for detecting and correcting abnormal daily electricity values based on first-order differencing and the ARIMA method comprises the following steps:
Step 1, sort the historical raw daily electricity data in chronological order, and mark the data position p corresponding to each missing value;
Step 2, for the sorted data, fill the value at each point p with the mean of the 5 non-empty values before and after it as a preliminary fitted value, obtaining the initially sorted historical data set X, denoted $X = \{X_1, X_2, X_3, \ldots, X_n\}$;
Step 3, detect abnormal values in the sorted initial historical data with a first-order difference model, namely method a, denoting the data positions of the identified abnormal values by $A_i$; apply the "safe point" averaging method to correct the abnormal points a first time, giving the corrected complete data set $YA = \{YA_1, YA_2, YA_3, \ldots, YA_n\}$;
Step 4, detect abnormal values in the sorted initial historical data with an ARIMA model, namely method b, denoting the data positions of the identified abnormal values by $B_i$; apply the ARIMA method to correct the abnormal points a second time, giving the corrected complete data set $YB = \{YB_1, YB_2, YB_3, \ldots, YB_n\}$;
Step 5, perform the final abnormal value detection on the fitted results $YA_i$ and $YB_i$, combining the above results to determine the data positions of the final abnormal values, marked $AB_i$;
Step 6, mark the values at the confirmed abnormal points $AB_i$ as null, and fit the null values with the ARIMA model to obtain the final complete time series data set $Y = \{Y_1, Y_2, Y_3, \ldots, Y_n\}$.
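Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustrative reading of the text, not the patented implementation: the function name `impute_missing`, the use of `None` for a missing value, and the interpretation of "the 5 non-empty values before and after" as up to 5 original non-empty values on each side are all assumptions.

```python
from typing import List, Optional, Tuple


def impute_missing(values: List[Optional[float]],
                   window: int = 5) -> Tuple[List[float], List[int]]:
    """Fill each missing value (None) with the mean of up to `window`
    non-empty original values before it and up to `window` after it.

    Returns the filled series X and the list of missing positions p.
    """
    missing = [i for i, v in enumerate(values) if v is None]
    filled = list(values)
    for p in missing:
        # Nearest non-empty neighbours, taken from the original series only
        # (assumption: previously imputed values are not reused).
        before = [v for v in reversed(values[:p]) if v is not None][:window]
        after = [v for v in values[p + 1:] if v is not None][:window]
        neighbours = before + after
        if not neighbours:
            raise ValueError("series contains no non-empty values")
        filled[p] = sum(neighbours) / len(neighbours)
    return filled, missing
```

The returned positions correspond to the points p that step 5 later treats as abnormal by definition.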
In step 3, abnormal points in the initially sorted historical data are detected and corrected a first time with the first-order difference model, as follows:
Step 301, calculate the standard deviation $\sigma_X$ of the data set X, that is,
[Equation image not recovered: standard deviation $\sigma_X$ of the data set X]
Step 302, calculate the first-order difference (First Difference, hereinafter FD) vector of X, that is, $XFD_i = X_i - X_{i-1}$; the corresponding first-order difference data set is $XFD = \{XFD_1, XFD_2, XFD_3, \ldots, XFD_n\}$;
Step 303, based on steps 301 and 302, correct the abnormal points detected by the first-order difference model as follows:
Step 303A, when detecting abnormal values in the XFD data set, if the first-order difference at point i exceeds m times the standard deviation, that is, $XFD_i > m \cdot \sigma_X$, where m is a positive number chosen by the user (in principle, the larger m is, the more data deviation the user tolerates), the point is judged to be an abnormal point detected by the first-order difference model and the judgment result is recorded as 1; otherwise the point is not abnormal and the judgment result is recorded as 0;
Step 303B, based on step 303A, when the judgment result is 1, correct the value of the detected abnormal point with the "safe point" averaging method, that is,
[Equation image not recovered: $YA_i$ as the average over the safe points k and j]
where point i is an abnormal point, k is the first non-abnormal point before i, and j is the first non-abnormal point after i; when the judgment result is 0, the original value is used, that is, $YA_i = X_i$;
Step 303C, based on the result of step 303B, repeat steps 303A and 303B until every abnormal point detection result is 0, then terminate the loop;
Step 304, based on step 303, obtain the complete data set corrected by the first-order difference model, $YA = \{YA_1, YA_2, YA_3, \ldots, YA_n\}$.
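The loop in steps 301 to 304 might look like the sketch below. Several details are assumptions because the original formula images are not available: the population form of the standard deviation, the averaging of exactly the two values at safe points k and j, the fallback to a single safe point at the end of the series, and the `max_passes` guard that forces termination. The function name is hypothetical.

```python
from statistics import pstdev
from typing import List, Set, Tuple


def first_difference_correct(x: List[float], m: float = 1.25,
                             max_passes: int = 100) -> Tuple[List[float], Set[int]]:
    """Flag points whose first-order difference exceeds m * sigma_X and
    replace them with the mean of the nearest safe (non-flagged) points."""
    sigma = pstdev(x)  # step 301 (population standard deviation is an assumption)
    ya = list(x)
    ever_flagged: Set[int] = set()
    for _ in range(max_passes):  # step 303C, with a guard against endless loops
        # Steps 302 and 303A: signed first difference compared with m * sigma,
        # exactly as the condition is written in the text.
        flagged = {i for i in range(1, len(ya)) if ya[i] - ya[i - 1] > m * sigma}
        if not flagged:
            break
        ever_flagged |= flagged
        corrected = list(ya)
        for i in flagged:  # step 303B
            # k: first safe point before i (index 0 is never flagged, so k exists).
            k = next((t for t in range(i - 1, -1, -1) if t not in flagged), None)
            # j: first safe point after i (may be absent at the end of the series).
            j = next((t for t in range(i + 1, len(ya)) if t not in flagged), None)
            safe = [ya[t] for t in (k, j) if t is not None]
            corrected[i] = sum(safe) / len(safe)
        ya = corrected
    return ya, ever_flagged
```

Because the condition is signed, this sketch reacts only to upward jumps; comparing the absolute difference instead would be a design variation, not something the text states.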
In step 4, abnormal points in the initially sorted historical data are detected and corrected a second time with the ARIMA model, as follows:
Step 401, establish the ARIMA fitting model as
[Equation image not recovered: ARIMA fitting model]
with explained variable
[Equation image not recovered: definition of the explained variable]
where $X_t$ depends on the value of d: when d = 0, $X_t$ is the original sequence data; when d = n, $X_t$ is the corresponding n-th order difference sequence;
Step 402, determine the best-fitting ARIMA model according to the Bayesian information criterion (BIC), as follows:
Step 402A, restrict p, q and d to integers in the range [0, 2], excluding the case p = q = d = 0;
Step 402B, for the p, q and d values selected in step 402A, fit the ARIMA regression model, giving the set of parameter combinations $pqd = \{pqd_1, pqd_2, pqd_3, \ldots, pqd_{26}\}$;
Step 402C, based on step 402B, calculate the BIC values, $BIC = k \ln(n) - 2 \ln(L)$, where L is the maximized likelihood of the current model and n and k are the sample size and the number of parameters respectively, giving $BIC = \{BIC_1, BIC_2, BIC_3, \ldots, BIC_{26}\}$;
Step 402D, based on steps 402B and 402C, select the pqd parameter combination corresponding to $\min\{BIC_1, BIC_2, BIC_3, \ldots, BIC_{26}\}$ as the ARIMA fitting parameters for the next step;
Step 403, based on step 402, let $\{X_1, X_2, X_3, \ldots, X_n\}$ be the original time series and $\{YB_1, YB_2, YB_3, \ldots, YB_n\}$ the ARIMA fitted series; the residual sequence is then $\varepsilon = \{YB_1 - X_1, YB_2 - X_2, YB_3 - X_3, \ldots, YB_n - X_n\}$. If
[Equation image not recovered: threshold condition on the residual]
holds, the abnormal value judgment is 1 and $X_i$ is an abnormal point; otherwise $X_i$ is a non-abnormal point.
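The order search in steps 402A to 402D can be illustrated without committing to a particular ARIMA library. In the sketch below, `fit` is a hypothetical callback that returns the log-likelihood and parameter count for one (p, q, d) combination; in practice it would wrap whatever ARIMA fitting routine is available. The function names and the tuple ordering are assumptions made for illustration.

```python
import math
from itertools import product
from typing import Callable, List, Tuple

Order = Tuple[int, int, int]  # (p, q, d), in the order the text lists them


def candidate_orders(max_order: int = 2) -> List[Order]:
    """Step 402A: every (p, q, d) with entries in 0..max_order, except all zeros."""
    return [o for o in product(range(max_order + 1), repeat=3) if o != (0, 0, 0)]


def bic(n: int, k: int, log_likelihood: float) -> float:
    """Step 402C: BIC = k * ln(n) - 2 * ln(L), with ln(L) passed in directly."""
    return k * math.log(n) - 2.0 * log_likelihood


def select_order(n: int, fit: Callable[[Order], Tuple[float, int]]) -> Order:
    """Step 402D: return the candidate order with the smallest BIC.

    `fit` is a hypothetical callback returning (log-likelihood, parameter count).
    """
    def score(order: Order) -> float:
        log_likelihood, k = fit(order)
        return bic(n, k, log_likelihood)

    return min(candidate_orders(), key=score)
```

With orders restricted to 0, 1 and 2, there are 3 × 3 × 3 − 1 = 26 candidates, which matches the 26 entries of the pqd and BIC sets in the text.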
In step 5, a data point i of the data set X is handled differently depending on whether it is a missing value from step 1 and on its abnormal value detection results, as follows:
Step 501, if point i is a missing point from step 1, that is, i = p, the data point is treated as an abnormal point requiring abnormal value removal, and is marked $AB_i = 1$;
Step 502, if methods a and b both judge point i to be normal, that is, $YA_i = X_i$ and $YB_i = X_i$ both hold, the data point is treated as a normal point requiring no abnormal value removal, and is marked $AB_i = 0$, where AB denotes the final judgment result sequence;
Step 503, if methods a and b both judge point i to be abnormal, that is, $YA_i \neq X_i$ and $YB_i \neq X_i$ both hold, the data point is treated as an abnormal point requiring abnormal value removal, and is marked $AB_i = 1$;
Step 504, if the judgments of methods a and b for point i disagree, that is, in cases other than the two above, the data point is treated as a suspected abnormal point and handled further:
Step 504A, if method a judges point i to be abnormal and the absolute value of the relative difference between the corrected $YA_i$ and the original value $X_i$ exceeds n, that is,
[Equation image not recovered: absolute relative difference between $YA_i$ and $X_i$ compared with n]
where n is a positive number chosen by the user (in principle, the larger n is, the more data deviation the user tolerates), then point i is judged abnormal and marked $AB_i = 1$; otherwise it is marked $AB_i = 0$;
Step 504B, if method b judges point i to be abnormal and the absolute value of the relative difference between the corrected $YB_i$ and the original value $X_i$ exceeds n, that is,
[Equation image not recovered: absolute relative difference between $YB_i$ and $X_i$ compared with n]
then point i is judged abnormal and marked $AB_i = 1$; otherwise it is marked $AB_i = 0$.
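The decision rules of steps 501 to 504 can be condensed into a single function. This is a sketch under stated assumptions: the "rate of difference" is read as |(corrected − original) / original|, which is a guess because the formula image is missing, and the names `relative_gap` and `combine_flags` are hypothetical.

```python
import math
from typing import List, Set


def relative_gap(corrected: float, original: float) -> float:
    """Assumed reading of the 'rate of difference': |(corrected - original) / original|."""
    if original == 0:
        return math.inf if corrected != 0 else 0.0
    return abs((corrected - original) / original)


def combine_flags(x: List[float], ya: List[float], yb: List[float],
                  missing: Set[int], n: float) -> List[int]:
    """Return the final judgment sequence AB (1 = abnormal, 0 = normal)."""
    ab = []
    for i in range(len(x)):
        a_abnormal = ya[i] != x[i]  # method a changed the value
        b_abnormal = yb[i] != x[i]  # method b changed the value
        if i in missing:                          # step 501
            ab.append(1)
        elif not a_abnormal and not b_abnormal:   # step 502
            ab.append(0)
        elif a_abnormal and b_abnormal:           # step 503
            ab.append(1)
        elif a_abnormal:                          # step 504A
            ab.append(1 if relative_gap(ya[i], x[i]) > n else 0)
        else:                                     # step 504B
            ab.append(1 if relative_gap(yb[i], x[i]) > n else 0)
    return ab
```

Exact float comparison is used here only because the text defines "normal" as $YA_i = X_i$; a tolerance-based comparison may be preferable when the corrected series has passed through floating-point arithmetic.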
Compared with the prior art, the invention has the beneficial effects that:
1. By combining first-order differencing with the ARIMA model to detect and correct abnormal values, the method identifies anomalies using both local and full-time-domain characteristics, giving higher accuracy and wider applicability;
2. Taken as a whole, the technical scheme can accurately detect abnormal values in time series data with seasonal characteristics, which makes it well suited to adoption within State Grid;
3. Compared with the prior art, the method retains the complete historical sequence data, providing a stable data source that supports reliable abnormal value correction;
4. The technique can remove abnormal values from multiple indicators at the same time, which can greatly improve users' working efficiency;
5. The determination of an "anomaly" is relative rather than absolute. The invention provides a parameter adjustment function, so that users can raise or lower the criterion for judging anomalies according to their own needs and the actual situation;
6. The computational load is relatively small, and the computer program runs efficiently.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a comparison graph of the original data and the data with the outliers removed.
Detailed Description
Taking the daily electricity consumption of Hunan Province as an example: first, the raw data are sorted in chronological order and the data position of each missing value is marked; second, for the sorted sequence, the mean of the 5 non-empty values before and after each missing value is taken as its preliminary fitted value; third, abnormal values in the resulting initial historical data are detected and corrected with the first-order difference model and the ARIMA model respectively, giving corrected complete data sets; fourth, the fitted results of the two methods at the abnormal values are compared to confirm the positions of the final abnormal points; finally, the values at the confirmed abnormal points are marked as null and fitted with the ARIMA model to obtain the final complete time series data set. The steps are as follows:
Step 1, sort the raw daily electricity data of Hunan Province in chronological order and mark the data positions of the missing values, as shown in FIG. 2;
Step 2, take the mean of the 5 non-empty values before and after each missing value as its preliminary fitted value, obtaining new time series data;
Step 3, on the basis of step 2, detect abnormal values in the sorted initial historical data with the first-order difference model, namely method a, denoting the data positions of the identified abnormal values by $A_i$, and apply the "safe point" averaging method to correct the abnormal points a first time, as follows:
Step 301, calculate the standard deviation of the Hunan daily electricity data set X;
Step 302, calculate the first-order difference vector of the Hunan daily electricity data set X, $XFD_i = X_i - X_{i-1}$;
Step 303, based on steps 301 and 302, and assuming the user's tolerance for deviation in Hunan daily electricity is 1.25 times the standard deviation, correct the abnormal points detected by the first-order difference model as follows:
Step 303A, when detecting abnormal values in the first-order difference data set XFD of Hunan daily electricity, if the first-order difference at point i exceeds 1.25 times the standard deviation, record the abnormal value judgment result as 1; otherwise record it as 0;
Step 303B, based on step 303A, if the judgment result is 1, correct the value of the detected abnormal point with the "safe point" averaging method; if the judgment result is 0, use the original value;
Step 303C, repeat steps 303A and 303B until every abnormal point detection result is 0, then terminate the loop;
Step 304, based on step 303, obtain the complete data set of Hunan daily electricity corrected by method a, $YA = \{YA_1, YA_2, YA_3, \ldots, YA_n\}$.
Step 4, on the basis of step 2, detect abnormal values in the sorted initial historical data with the ARIMA method, namely method b, denoting the data positions of the identified abnormal values by $B_i$, and apply the ARIMA method to correct the abnormal points a second time, as follows:
Step 401, determine the best-fitting ARIMA model according to the Bayesian information criterion; applying the criterion to the Hunan daily electricity data set X gives the three ARIMA parameters p, q and d as 1, 1 and 0 respectively;
Step 402, based on the parameters determined in step 401, obtain the fitted Hunan daily electricity data set Y with method b;
Step 403, calculate the residuals between data sets Y and X, with the user's tolerance for deviation in Hunan daily electricity set to 1.25 times the standard deviation; if
[Equation image not recovered: threshold condition on the residual]
holds, the abnormal value judgment is 1 and $X_i$ is an abnormal point; otherwise $X_i$ is a non-abnormal point;
Step 5, perform the final abnormal value detection based on steps 3 and 4 and determine the data positions of the final abnormal values, as follows:
Step 501, if point i is a missing point from step 1, that is, i = p, the data point is treated as an abnormal point requiring abnormal value removal, and is marked $AB_i = 1$;
Step 502, if methods a and b both judge point i to be normal, that is, $YA_i = X_i$ and $YB_i = X_i$ both hold, the data point is treated as a normal point requiring no abnormal value removal, and is marked $AB_i = 0$;
Step 503, if methods a and b both judge point i to be abnormal, that is, $YA_i \neq X_i$ and $YB_i \neq X_i$ both hold, the data point is treated as an abnormal point requiring abnormal value removal, and is marked $AB_i = 1$;
Step 504, if the judgments of methods a and b for point i disagree, that is, in cases other than the two above, the data point is treated as a suspected abnormal point and handled further:
Step 504A, if method a judges point i to be abnormal and the absolute value of the relative difference between the corrected $YA_i$ and the original value $X_i$ exceeds the threshold of 1.25, point i is judged abnormal and marked $AB_i = 1$; otherwise it is marked $AB_i = 0$;
Step 504B, similarly, if method b judges point i to be abnormal and the absolute value of the relative difference between the corrected $YB_i$ and the original value $X_i$ exceeds the threshold of 1.25, point i is judged abnormal and marked $AB_i = 1$; otherwise it is marked $AB_i = 0$.
Step 6, mark the values at the confirmed abnormal points as null, and fit the null values with the ARIMA (autoregressive integrated moving average) model to obtain the final complete time series data set of Hunan daily electricity by industry, $Y = \{Y_1, Y_2, Y_3, \ldots, Y_n\}$, as shown in FIG. 2.

Claims (4)

1. A system for detecting and correcting abnormal daily electricity values based on first-order differencing and the ARIMA method, characterized by comprising the following steps:
Step 1, sort the historical raw daily electricity data in chronological order, and mark the data position p corresponding to each missing value;
Step 2, for the sorted data, fill the value at each point p with the mean of the 5 non-empty values before and after it as a preliminary fitted value, obtaining the initially sorted historical data set X, denoted $X = \{X_1, X_2, X_3, \ldots, X_n\}$;
Step 3, detect abnormal values in the sorted initial historical data with a first-order difference model, namely method a, denoting the data positions of the identified abnormal values by $A_i$; apply the "safe point" averaging method to correct the abnormal points a first time, giving the corrected complete data set $YA = \{YA_1, YA_2, YA_3, \ldots, YA_n\}$;
Step 4, detect abnormal values in the sorted initial historical data with an ARIMA model, namely method b, denoting the data positions of the identified abnormal values by $B_i$; apply the ARIMA method to correct the abnormal points a second time, giving the corrected complete data set $YB = \{YB_1, YB_2, YB_3, \ldots, YB_n\}$;
Step 5, perform the final abnormal value detection on the fitted results $YA_i$ and $YB_i$, combining the above results to determine the data positions of the final abnormal values, marked $AB_i$;
Step 6, mark the values at the confirmed abnormal points $AB_i$ as null, and fit the null values with the ARIMA model to obtain the final complete time series data set $Y = \{Y_1, Y_2, Y_3, \ldots, Y_n\}$.
2. The system for detecting and correcting the abnormal value of the daily electricity quantity based on the first-order difference and ARIMA method as claimed in claim 1, wherein the step 3 is to detect and correct the abnormal point for the first time by the first-order difference model for the initially sorted historical data, and the specific steps are as follows:
step 301, calculate the standard deviation of the X data set, i.e.
Figure FDA0003686103640000011
Step 302, calculating a First order Difference vector First Difference of X, hereinafter referred to as FD, that is, XFD j =X i -X i-1 Then the first order difference data set XFD corresponding thereto is { XFD ═ XFD 1 ,XFD 2 ,XFD 3 .......XFD n };
Step 303, based on steps 301 and 302, the specific steps of correcting the data of the outliers detected by the first-order difference model are as follows:
Step 303A, in the first-order difference model abnormal value detection on the XFD data set, if the first-order difference of the i-th point is greater than m times the standard deviation, that is, $XFD_i > m \cdot \sigma_X$, where m is a positive number chosen by the user (in theory, the larger the value of m, the more data deviation the user tolerates), the point is judged to be an abnormal point detected by the first-order difference model and the judgment result is recorded as 1; otherwise the point is a non-abnormal point and the judgment result is recorded as 0;
Step 303B, based on step 303A, when the judgment result is 1, the detected abnormal value is corrected by the safe-point averaging method, i.e.
[formula image FDA0003686103640000012: $YA_i$ obtained by averaging the values at the safe points k and j]
where point i is an abnormal point, k is the first non-abnormal point before point i, and j is the first non-abnormal point after point i; when the judgment result is 0, the original value is used, namely $YA_i = X_i$;
Step 303C, repeating steps 303A and 303B on the result of step 303B, and terminating the loop once all abnormal point detection results are 0;
Step 304, based on step 303, obtaining the complete data set corrected by the first-order difference model, $YA = \{YA_1, YA_2, YA_3, \ldots, YA_n\}$.
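Steps 301 to 304 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `first_difference_correct`, the `max_rounds` guard, the population (1/n) form of the standard deviation, and the fallback to a single neighbour when no safe point exists on one side are all assumptions, since the formula images are not reproduced in this text. The test is one-sided, as in the wording of step 303A.

```python
import math

def first_difference_correct(x, m=3.0, max_rounds=100):
    """Flag points whose first-order difference exceeds m * sigma and
    replace them with the mean of the nearest unflagged neighbours."""
    n = len(x)
    mean = sum(x) / n
    # Step 301: standard deviation of the original series (assumed 1/n form).
    sigma = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    ya = list(x)
    flagged = [0] * n
    for _ in range(max_rounds):
        # Steps 302 and 303A: one-sided test on the first-order difference.
        # Index 0 has no predecessor, so it is never flagged.
        flags = [0] + [1 if ya[i] - ya[i - 1] > m * sigma else 0
                       for i in range(1, n)]
        if not any(flags):
            break  # Step 303C: stop once every judgment result is 0.
        # Step 303B: average the nearest safe points k (before) and j (after).
        for i in range(1, n):
            if not flags[i]:
                continue
            k = i - 1
            while flags[k]:
                k -= 1
            j = i + 1
            while j < n and flags[j]:
                j += 1
            neighbours = [ya[k]] + ([ya[j]] if j < n else [])
            ya[i] = sum(neighbours) / len(neighbours)
            flagged[i] = 1
    return ya, flagged  # Step 304: corrected series YA plus the flags.
```

For example, with the series `[10, 11, 10, 50, 11, 10, 11]` and `m=2.0`, only the spike at index 3 is flagged and it is replaced by the mean of its two neighbours.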
3. The system for detecting and correcting abnormal values of daily electricity quantity based on the first-order difference and ARIMA method as claimed in claim 1, wherein step 4 detects and corrects abnormal points in the initially sorted historical data for the second time through the ARIMA model, with the following specific steps:
Step 401, establishing the ARIMA fitting model as
[formula image FDA0003686103640000013]
with the explained variable
[formula images FDA0003686103640000014 and FDA0003686103640000015]
where $X_t$ depends on the value of d: when d = 0, $X_t$ is the original sequence data; when d = n, $X_t$ is the corresponding n-th order difference data sequence;
Step 402, determining the best-fitting ARIMA model according to the Bayesian information criterion (BIC), with the following specific steps:
Step 402A, limiting the value ranges of p, q and d to [0, 2], excluding the case where p, q and d are all 0;
Step 402B, based on the p, q and d parameter values selected in step 402A, running the ARIMA regression model to obtain the parameter set of p, q and d combinations $pqd = \{pqd_1, pqd_2, pqd_3, \ldots, pqd_{26}\}$;
Step 402C, based on step 402B, calculating the BIC value, $BIC = \ln(n)\,k - 2\ln L$, where L is the maximum likelihood function value of the current model, and n and k are the sample size and the number of parameters respectively, to obtain $BIC = \{BIC_1, BIC_2, BIC_3, \ldots, BIC_{26}\}$;
Step 402D, based on steps 402B and 402C, selecting the pqd parameter set corresponding to $\min\{BIC_1, BIC_2, BIC_3, \ldots, BIC_{26}\}$ for the subsequent ARIMA model fitting;
Step 403, based on step 402, assuming $\{X_1, X_2, X_3, \ldots, X_n\}$ is the original time series and $\{YB_1, YB_2, YB_3, \ldots, YB_n\}$ is the ARIMA fitted sequence, the residual sequence is $\varepsilon = \{YB_1 - X_1, YB_2 - X_2, YB_3 - X_3, \ldots, YB_n - X_n\}$, from which the following is obtained:
[formula image FDA0003686103640000021]
[formula image FDA0003686103640000022]
If the judgment result is 1, $X_i$ is identified as an abnormal value; otherwise it is identified as a non-abnormal value.
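The order search of steps 402A to 402D can be sketched as below. The sketch only covers the enumeration of the 26 candidate orders and the BIC comparison; the model fitting itself is left to a hypothetical callable `fit`, which is assumed to return the log-likelihood and parameter count for a given order. The names `candidate_orders`, `bic` and `select_order` are illustrative, and orders are written as (p, d, q) tuples.

```python
import math
from itertools import product

def candidate_orders(max_order=2):
    """Step 402A: every (p, d, q) with values in 0..max_order except (0, 0, 0)."""
    return [o for o in product(range(max_order + 1), repeat=3) if o != (0, 0, 0)]

def bic(log_likelihood, n_obs, n_params):
    """Step 402C: BIC = ln(n) * k - 2 * ln(L), with ln(L) passed in directly."""
    return math.log(n_obs) * n_params - 2.0 * log_likelihood

def select_order(fit, n_obs):
    """Steps 402B and 402D: score every candidate and keep the smallest BIC.

    `fit` is a hypothetical callable: (p, d, q) -> (log_likelihood, n_params).
    """
    scores = {}
    for order in candidate_orders():
        log_likelihood, n_params = fit(order)
        scores[order] = bic(log_likelihood, n_obs, n_params)
    best = min(scores, key=scores.get)
    return best, scores
```

In practice `fit` would wrap whichever ARIMA implementation is in use; keeping it as a parameter leaves the selection logic independent of that choice.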
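A residual-based judgment in the spirit of step 403 might look like the sketch below. The two formula images that define the actual rule are not available in this text, so the threshold used here (a residual more than `c` population standard deviations away from the mean residual) is purely an assumption, as are the names `residual_flags` and `c`.

```python
import math

def residual_flags(x, yb, c=3.0):
    """Flag X_i when the residual YB_i - X_i lies far from the mean residual.

    The c * sigma threshold is an assumed rule, not the formula from the claim.
    """
    residuals = [fitted - observed for fitted, observed in zip(yb, x)]
    n = len(residuals)
    mean = sum(residuals) / n
    sigma = math.sqrt(sum((e - mean) ** 2 for e in residuals) / n)
    flags = [1 if abs(e - mean) > c * sigma else 0 for e in residuals]
    return residuals, flags
```

Here `x` is the original series and `yb` the ARIMA fitted series of the same length; a flag of 1 corresponds to an abnormal value judgment.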
4. The system for detecting and correcting abnormal values of daily electricity quantity based on the first-order difference and ARIMA method as claimed in claim 1, wherein the missing values of step 1 and the data points i of the X data set whose abnormal value detection result in step 5 is 1 are processed in different ways, specifically as follows:
Step 501, if point i is a missing point mentioned in step 1, that is, i = p, the data point is considered an abnormal point, abnormal value elimination is required, and $AB_i = 1$ is marked;
Step 502, if methods a and b both detect point i as a normal point, that is, $YA_i = X_i$ and $YB_i = X_i$ both hold, the data point is considered a normal point, no abnormal value elimination is needed, and $AB_i = 0$ is marked, where AB denotes the final judgment sequence;
Step 503, if methods a and b both detect point i as an abnormal point, that is, $YA_i \neq X_i$ and $YB_i \neq X_i$ both hold, the data point is considered an abnormal point, abnormal value elimination is required, and $AB_i = 1$ is marked;
Step 504, if the detection results of methods a and b for point i are inconsistent, that is, in any case other than the two above, the data point is considered a suspected abnormal point and the following operations are performed:
Step 504A, if method a judges point i to be abnormal and the absolute value of the rate of difference between the processed $YA_i$ and the original value $X_i$ exceeds n, i.e.
[formula image FDA0003686103640000023]
where n is a positive number chosen by the user (in theory, the larger the value of n, the more data deviation the user tolerates), then point i is judged abnormal and $AB_i = 1$ is marked; otherwise $AB_i = 0$ is marked;
Step 504B, if method b judges point i to be abnormal and the absolute value of the rate of difference between the processed $YB_i$ and the original value $X_i$ exceeds n, i.e.
[formula image FDA0003686103640000024]
then point i is judged abnormal and $AB_i = 1$ is marked; otherwise $AB_i = 0$ is marked.
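The decision rules of steps 501 to 504 can be sketched as a single pass over the three series. This is an illustration under stated assumptions: the rate of difference is taken to be $|(Y_i - X_i)/X_i|$ because the formula images are not reproduced here, `rate_limit` stands for the user-chosen n, `missing` is a hypothetical set of missing indices from step 1, and $YA_i$ and $YB_i$ are assumed to equal $X_i$ wherever the respective method found nothing abnormal.

```python
def combine_flags(x, ya, yb, missing, rate_limit=0.2):
    """Return the final 0/1 sequence AB from the two detection results."""
    ab = []
    for i, (xi, yai, ybi) in enumerate(zip(x, ya, yb)):
        if i in missing:
            ab.append(1)  # Step 501: missing points are always eliminated.
            continue
        a_abnormal = yai != xi
        b_abnormal = ybi != xi
        if not a_abnormal and not b_abnormal:
            ab.append(0)  # Step 502: both methods agree on normal.
        elif a_abnormal and b_abnormal:
            ab.append(1)  # Step 503: both methods agree on abnormal.
        else:
            # Step 504: suspected point, decided by the assumed relative change.
            corrected = yai if a_abnormal else ybi
            rate = abs((corrected - xi) / xi) if xi != 0 else float("inf")
            ab.append(1 if rate > rate_limit else 0)
    return ab
```

Points marked 1 are the ones that step 6 would set to null and refit with the ARIMA model.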
CN202210650951.8A 2022-06-09 2022-06-09 Daily electric quantity abnormal value detection and correction system based on first-order difference and ARIMA method Active CN115081840B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210650951.8A CN115081840B (en) 2022-06-09 2022-06-09 Solar electricity abnormal value detection and correction system based on first-order difference and ARIMA method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210650951.8A CN115081840B (en) 2022-06-09 2022-06-09 Daily electric quantity abnormal value detection and correction system based on first-order difference and ARIMA method

Publications (2)

Publication Number Publication Date
CN115081840A true CN115081840A (en) 2022-09-20
CN115081840B CN115081840B (en) 2024-07-02

Family

ID=83252053

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210650951.8A Active CN115081840B (en) 2022-06-09 2022-06-09 Daily electric quantity abnormal value detection and correction system based on first-order difference and ARIMA method

Country Status (1)

Country Link
CN (1) CN115081840B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117220416A (en) * 2023-11-03 2023-12-12 国网湖北省电力有限公司武汉供电公司 Smart power grid electric power information safety transmission system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016101690A1 (en) * 2014-12-22 2016-06-30 国家电网公司 Time sequence analysis-based state monitoring data cleaning method for power transmission and transformation device
CN110083803A (en) * 2019-04-22 2019-08-02 水利部信息中心 Water intake abnormality detection method and system based on time series ARIMA model
WO2020019403A1 (en) * 2018-07-26 2020-01-30 平安科技(深圳)有限公司 Electricity consumption abnormality detection method, apparatus and device, and readable storage medium
CN111144435A (en) * 2019-11-11 2020-05-12 国电南瑞科技股份有限公司 Electric energy abnormal data monitoring method based on LOF and verification filtering framework
CN111563776A (en) * 2020-05-08 2020-08-21 国网江苏省电力有限公司扬州供电分公司 Electric quantity decomposition and prediction method based on K neighbor anomaly detection and Prophet model

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016101690A1 (en) * 2014-12-22 2016-06-30 国家电网公司 Time sequence analysis-based state monitoring data cleaning method for power transmission and transformation device
WO2020019403A1 (en) * 2018-07-26 2020-01-30 平安科技(深圳)有限公司 Electricity consumption abnormality detection method, apparatus and device, and readable storage medium
CN110083803A (en) * 2019-04-22 2019-08-02 水利部信息中心 Water intake abnormality detection method and system based on time series ARIMA model
CN111144435A (en) * 2019-11-11 2020-05-12 国电南瑞科技股份有限公司 Electric energy abnormal data monitoring method based on LOF and verification filtering framework
CN111563776A (en) * 2020-05-08 2020-08-21 国网江苏省电力有限公司扬州供电分公司 Electric quantity decomposition and prediction method based on K neighbor anomaly detection and Prophet model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
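方海泉;薛惠锋;蒋云钟;周铁军;万毅;王海宁: "Abnormal value detection and correction of water resources monitoring data based on EEMD" (基于EEMD的水资源监测数据异常值检测与校正), 农业机械学报, no. 09, 31 December 2017 (2017-12-31) *
韩锋;杨飞;燕重阳;李国亮: "Design of an automatic monitoring system for abnormal user power supply and consumption based on time series analysis" (基于时间序列分析的用户异常供用电自动监测系统设计), 自动化与仪器仪表, no. 03, 25 March 2020 (2020-03-25) *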

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117220416A (en) * 2023-11-03 2023-12-12 国网湖北省电力有限公司武汉供电公司 Smart power grid electric power information safety transmission system
CN117220416B (en) * 2023-11-03 2024-01-16 国网湖北省电力有限公司武汉供电公司 Smart power grid electric power information safety transmission system

Also Published As

Publication number Publication date
CN115081840B (en) 2024-07-02

Similar Documents

Publication Publication Date Title
CN106682079B (en) User electricity consumption behavior detection method based on cluster analysis
CN112699913A (en) Transformer area household variable relation abnormity diagnosis method and device
CN116243097B (en) Electric energy quality detection method based on big data
CN111401460B (en) Abnormal electric quantity data identification method based on limit value learning
CN111784093B (en) Enterprise reworking auxiliary judging method based on power big data analysis
CN111144435A (en) Electric energy abnormal data monitoring method based on LOF and verification filtering framework
CN108074015B (en) Ultra-short-term prediction method and system for wind power
CN111539657B (en) Typical power industry load characteristic classification and synthesis method combined with user daily electricity quantity curve
CN111046913A (en) Load abnormal value identification method
CN115081840A (en) Daily electric quantity abnormal value detection and correction system based on first-order difference and ARIMA method
CN114114039A (en) Method and device for evaluating consistency of single battery cells of battery system
CN114330583A (en) Abnormal electricity utilization identification method and abnormal electricity utilization identification system
CN117131022B (en) Heterogeneous data migration method of electric power information system
CN108537249B (en) Industrial process data clustering method for density peak clustering
CN107274025B (en) System and method for realizing intelligent identification and management of power consumption mode
CN114266457A (en) Method for detecting different loss inducement of distribution line
CN117951619A (en) User electricity behavior analysis method and system based on outlier detection and k-means combination
CN112149052B (en) Daily load curve clustering method based on PLR-DTW
CN113554079A (en) Electric power load abnormal data detection method and system based on secondary detection method
CN117131341A (en) Electric meter operation error estimation method, and out-of-tolerance electric meter judgment method and system
CN110399926B (en) Street lamp fault diagnosis method and device
CN110045250B (en) Method and system for judging insulation state of power cable
CN110334125A (en) A kind of power distribution network measurement anomalous data identification method and device
CN106814608B (en) Predictive control adaptive filtering algorithm based on posterior probability distribution
CN115081511A (en) Screw tightening quality judgment method based on curve similarity and clustering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant