CN116304949A - Calibration method for energy consumption historical data - Google Patents

Publication number
CN116304949A
Authority
CN
China
Prior art keywords
data
distance
point
energy consumption
points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310329227.XA
Other languages
Chinese (zh)
Inventor
汪红亮
刘钢
罗鹏鑫
李沁贇
谭灿
祝娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Yuanchuang Intelligent Control Technology Co ltd
Original Assignee
Zhejiang Yuanchuang Intelligent Control Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Yuanchuang Intelligent Control Technology Co ltd
Priority to CN202310329227.XA
Publication of CN116304949A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a calibration method for historical energy consumption data. The energy consumption data is first converted into a time series; unreliable data points are then screened out using the local outlier factor (LOF) principle; the valid information contained in the abnormal data is then substituted in, and the intermediate calibration data is simulated by a model prediction algorithm. The method improves the accuracy with which abnormal data is identified and makes full use of the valid information in the abnormal data.

Description

Calibration method for energy consumption historical data
Technical Field
The invention relates to the field of real-time historical data of an energy management system, in particular to a calibration method of energy consumption historical data.
Background
An energy management system interfaces with several common types of meter, including electricity meters, water meters, gas meters and energy meters, and, by collecting meter data, performs monitoring, management, statistics, analysis, execution of energy-saving strategies and so on. Depending on the requirements of the actual project, the data acquisition period is generally between 1 minute and 15 minutes. To meet business requirements, the collected real-time data is stored as historical data. In actual project implementation and use, several problems are likely to occur:
(1) The collector cannot be guaranteed to provide service 24 hours a day, 365 days a year; data loss caused by equipment damage, power failure, network disconnection and the like cannot be ruled out.
(2) Zero errors cannot be guaranteed in engineering configuration or in use by the user. Configuration and usage problems may cause the collected data to be abnormal; because the data accumulates continuously over time, the abnormal data eventually becomes difficult to find and, once found, difficult to repair.
To solve these technical problems, researchers have tried various methods. One example is the patent entitled "A method, a system and a device for filling missing values of electricity collection data". That method uses the mean-variance method to detect abnormal values in the electricity collection data and deletes the abnormal data; it then trains a denoising autoencoder model on the electricity collection data, reconstructs the original electricity sample data with the trained denoising autoencoder network model, and fills in the missing sample data with the reconstructed data. To prevent the model from overfitting, it proposes a new Determination-FourOrder regularization term; to obtain a better noise attenuation ratio, it reduces the noise level according to the number of units in each network layer. Finally, it combines k-means clustering, the average distance from neighboring data points to the cluster center, and the standard deviation of the data to correct the filled values. That method has the following defects:
1) The patent detects abnormal values in the electricity collection data by the mean-variance method and deletes them directly, but in practice this can misjudge data as abnormal.
2) The abnormal data is discarded directly, so the valid information it contains is neither extracted nor used.
The prior art therefore has two problems: misjudgment is likely, and the valid information in abnormal data cannot be used.
Disclosure of Invention
The invention aims to provide a calibration method of energy consumption historical data. The method has the characteristics of effectively improving the accuracy of judging the abnormal data and fully utilizing the effective information in the abnormal data.
The technical scheme of the invention is as follows: the energy consumption data is first converted into a time series; unreliable data points are then screened out using the local outlier factor (LOF) principle; the valid information in the abnormal data is then substituted in, and the intermediate calibration data is simulated by a model prediction algorithm.
The foregoing calibration method comprises the following specific steps:
A. acquiring the historical energy consumption data of a single meter, converting it into a time series, and calculating the reading at each moment and the increment over the corresponding time interval, to obtain a data point for each moment of the meter;
B. quantifying the degree of abnormality of each data point using the time-series-based LOF algorithm;
C. the user confirming the degree of abnormality of the data points and backfilling the valid information;
D. simulating the calibration data from the trend of the historical data for the corresponding period;
E. the calibration taking effect after the user confirms it.
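Step A can be illustrated with a minimal sketch. It assumes the raw samples are (timestamp, cumulative reading) pairs; the function name `to_data_points` and the dictionary keys are hypothetical and are not taken from the patent.

```python
from datetime import datetime, timedelta

def to_data_points(samples):
    """Turn raw (timestamp, cumulative_reading) samples into time-ordered data points.

    Each data point holds the reading at that moment and the increment
    accumulated since the previous sample (names are illustrative only).
    """
    ordered = sorted(samples, key=lambda s: s[0])  # enforce time order
    points = []
    for (t_prev, r_prev), (t_cur, r_cur) in zip(ordered, ordered[1:]):
        points.append({
            "time": t_cur,
            "reading": r_cur,
            "increment": r_cur - r_prev,   # consumption over the interval
            "interval": t_cur - t_prev,    # width of the interval
        })
    return points

start = datetime(2023, 3, 30, 0, 0)
raw = [  # hypothetical samples, deliberately out of order
    (start + timedelta(minutes=30), 103.5),
    (start, 100.0),
    (start + timedelta(minutes=15), 101.5),
]
points = to_data_points(raw)
```

The first sample only serves as the baseline for the first increment, so three raw samples yield two data points.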
In the foregoing calibration method, step B comprises the following specific steps:
B1. for each data point, calculating its distance to every other data point and sorting the distances from nearest to farthest;
B2. for each data point, finding its k nearest neighbors according to that ordering and calculating the LOF score.
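The sorting and neighbor search in steps B1 and B2 might look like the sketch below. The patent does not state which distance metric is used; Euclidean distance over (reading, increment) pairs is an assumption here, and the function names are hypothetical.

```python
import math

def sorted_neighbors(points, i):
    """Return (distance, index) pairs for every other point, nearest first."""
    return sorted(
        (math.dist(points[i], q), j)   # Euclidean distance (assumed metric)
        for j, q in enumerate(points)
        if j != i
    )

def k_nearest(points, i, k):
    """Indices of the k nearest neighbors of points[i]; ties are broken by index."""
    return [j for _, j in sorted_neighbors(points, i)[:k]]

# Hypothetical (reading, increment) data points; the last increment looks suspicious.
pts = [(100.0, 1.5), (101.5, 1.5), (103.0, 1.5), (104.6, 1.6), (124.6, 20.0)]
nearest_to_first = k_nearest(pts, 0, 2)
nearest_to_last = k_nearest(pts, 4, 1)
```

Readings and increments have different scales, so a real deployment would probably need to normalize the features before computing distances; that choice is outside what the patent text specifies.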
In the foregoing calibration method, the LOF score, i.e. the local outlier factor, is calculated as follows.
The local outlier factor of point p is expressed as:
$$\mathrm{LOF}_k(p) = \frac{\sum_{o \in N_k(p)} \frac{\mathrm{lrd}_k(o)}{\mathrm{lrd}_k(p)}}{|N_k(p)|}$$
The local reachable density of point p is expressed as:
$$\mathrm{lrd}_k(p) = 1 \Big/ \left( \frac{\sum_{o \in N_k(p)} \text{reach-distance}_k(p, o)}{|N_k(p)|} \right)$$
where the k-th distance neighborhood $N_k(p)$ of point p is the set of all points whose distance from p does not exceed the k-th distance of p, including points at exactly the k-th distance, so that $|N_k(p)| \ge k$;
the k-th reachable distance from point o to point p is defined as:
$$\text{reach-distance}_k(p, o) = \max\{k\text{-distance}(o),\ d(p, o)\}$$
where d(p, o) denotes the distance between two points p and o, and k-distance denotes the k-th nearest distance: the distance between point p and its k-th nearest point is the k-distance of p, written k-distance(p);
the k-th distance $d_k(p)$ of point p is defined as $d_k(p) = d(p, o)$, where o is the k-th nearest point to p.
In the foregoing calibration method, in step C the valid information includes the reading at a given moment and the increment over a given time span.
Compared with the prior art, the invention adds a time factor to the classical density-based Local Outlier Factor algorithm. The energy consumption data of each meter is arranged in time order, and the data point at each moment is assigned a local outlier factor (LOF) that depends on its neighborhood density; this factor is used to judge whether the data point at that moment is an outlier. The valid historical data is then modeled and, by bringing in the valid information from the abnormal data, the missing data is predicted and the abnormal data is reassigned. The advantages are as follows:
1) The degree of abnormality (outlierness) of each data point can be quantified;
2) The method only flags data that may be abnormal, together with its degree of abnormality, so that the user can locate it quickly; manual confirmation is allowed, which avoids over-correction;
3) The valid information in the abnormal data is fully used: it is brought into the model, and the intermediate missing data and abnormal data are filled in along the time series by predictive analysis.
In summary, the invention improves the accuracy with which abnormal data is identified and makes full use of the valid information in the abnormal data.
Drawings
FIG. 1 is a flowchart of the specific steps of the invention;
FIG. 2 is a trend graph of a meter's energy consumption increment data containing abnormal values;
FIG. 3 is a trend graph of the abnormal energy consumption increment data after calibration;
FIG. 4 is a schematic diagram of the k-th reachable distance from point o to point p.
Detailed Description
The invention is further illustrated by the following figures and examples, which are not intended to be limiting.
Example. The invention provides a method for detecting abnormalities in, and calibrating, historical water, electricity, gas and energy consumption data. As shown in the flowchart of FIG. 1, the method mainly comprises the following steps:
S1: The historical energy consumption data of a single meter is obtained and converted into a time series, and the reading at each moment and the increment over the corresponding time interval are calculated.
S2: For each data point, the distances to all other points are calculated and sorted from nearest to farthest.
S3: For each data point, its k nearest neighbors are found and the LOF score is calculated.
The LOF score, i.e. the local outlier factor, is calculated from the following definitions:
(1) d(p, o): the distance between two points p and o.
(2) k-distance: the k-th distance.
Among the points nearest to data point p, the distance between the k-th nearest point and p is the k-distance of p, written k-distance(p).
The k-th distance $d_k(p)$ of point p is defined as $d_k(p) = d(p, o)$, where o satisfies the following conditions:
(a) there are at least k points $o' \in C \setminus \{p\}$ in the set, not counting p, such that $d(p, o') \le d(p, o)$;
(b) there are at most k-1 points $o' \in C \setminus \{p\}$ in the set, not counting p, such that $d(p, o') < d(p, o)$.
In other words, the k-th distance of p is the distance from p to its k-th nearest point, not counting p itself.
(3) k-distance neighborhood of p: the k-th distance neighborhood
The k-th distance neighborhood $N_k(p)$ of point p is the set of all points whose distance from p does not exceed the k-th distance of p, including points at exactly the k-th distance.
The number of points in the k-th distance neighborhood of p therefore satisfies $|N_k(p)| \ge k$.
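A small sketch shows why the neighborhood can contain more than k points. It assumes one-dimensional data (increments only) with absolute difference as the distance; the function names and sample values are illustrative only.

```python
def k_distance(values, i, k):
    """Distance from values[i] to its k-th nearest other value."""
    dists = sorted(abs(values[i] - v) for j, v in enumerate(values) if j != i)
    return dists[k - 1]

def k_neighborhood(values, i, k):
    """Indices of all other points within the k-distance of values[i], ties included."""
    kd = k_distance(values, i, k)
    return [j for j, v in enumerate(values) if j != i and abs(values[i] - v) <= kd]

# Hypothetical increments; points 0 and 2 are equally far from point 1.
increments = [1.0, 2.0, 3.0, 10.0]
print(k_distance(increments, 1, 1))      # 1.0
print(k_neighborhood(increments, 1, 1))  # [0, 2]
```

With k = 1 the neighborhood of point 1 still holds two points, because both lie at exactly the k-distance.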
(4) reach-distance: reachable distance
Reachable distance (reachability distance): the reachable distance is defined in terms of the k-distance. Given a parameter k, the reachable distance reach-distance_k(p, o) from data point p to data point o is the larger of the k-distance of o and the direct distance between p and o.
The k-th reachable distance from point o to point p is defined as:
$$\text{reach-distance}_k(p, o) = \max\{k\text{-distance}(o),\ d(p, o)\}$$
That is, the k-th reachable distance from point o to point p is at least the k-th distance of o; otherwise it is the true distance between o and p. This also means that the reachable distances from o to its k nearest points are treated as equal, all being $d_k(o)$. As shown in FIG. 4, the 5th reachable distance from $o_1$ to p is $d(p, o_1)$, and the 5th reachable distance from $o_2$ to p is $d_5(o_2)$.
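The two cases of the maximum can be checked numerically. The sketch below again assumes one-dimensional increments and absolute difference as the distance; the names and data are hypothetical.

```python
def k_distance(values, i, k):
    """Distance from values[i] to its k-th nearest other value."""
    return sorted(abs(values[i] - v) for j, v in enumerate(values) if j != i)[k - 1]

def reach_distance(values, p, o, k):
    """reach-distance_k(p, o) = max(k-distance(o), d(p, o))."""
    return max(k_distance(values, o, k), abs(values[p] - values[o]))

increments = [10.0, 11.0, 12.0, 13.0, 50.0]  # hypothetical increments

# Point 0 lies inside the 3-distance of point 1, so k-distance(o) is used.
near = reach_distance(increments, 0, 1, 3)
# Point 4 lies far outside it, so the true distance d(p, o) is used.
far = reach_distance(increments, 4, 1, 3)
```

Here `near` evaluates to the 3-distance of point 1 and `far` to the direct distance between points 4 and 1, matching the two situations drawn in FIG. 4.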
(5) local reachability density: local reachable density
Local reachable density (local reachability density): the local reachable density is defined in terms of the reachable distance. For data point p, the data points whose distance from p is less than or equal to k-distance(p) are called its k nearest neighbors, written $N_k(p)$; the local reachable density of p is the inverse of its average reachable distance to those neighboring data points.
The local reachable density of point p is expressed as:
$$\mathrm{lrd}_k(p) = 1 \Big/ \left( \frac{\sum_{o \in N_k(p)} \text{reach-distance}_k(p, o)}{|N_k(p)|} \right)$$
It is the inverse of the average reachable distance from the points in the k-th neighborhood of p to p.
Note: this is the reachable distance from the neighbors $N_k(p)$ of p to p, not from p to $N_k(p)$; the two must not be confused. In addition, if there are duplicate points, the sum of reachable distances in the denominator may be 0, which makes lrd infinite.
This value can be understood as a density: the higher the density, the more likely the point belongs to a cluster; the lower the density, the more likely it is an outlier. If p and its surrounding neighborhood points belong to the same cluster, the reachable distances are more likely to take the smaller value $d_k(o)$, giving a smaller sum of reachable distances and a higher density. If p is far from its surrounding neighbors, the reachable distances are likely to take the larger value d(p, o), giving a lower density, so p is more likely to be an outlier.
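The density calculation, including the duplicate-point case noted above, can be sketched as follows. Returning infinity when the sum of reachable distances is zero is one possible convention and is an assumption here, as are the one-dimensional data, the absolute-difference distance and all function names.

```python
import math

def k_distance(values, i, k):
    """Distance from values[i] to its k-th nearest other value."""
    return sorted(abs(values[i] - v) for j, v in enumerate(values) if j != i)[k - 1]

def neighborhood(values, i, k):
    """Indices of all other points within the k-distance of values[i], ties included."""
    kd = k_distance(values, i, k)
    return [j for j, v in enumerate(values) if j != i and abs(values[i] - v) <= kd]

def lrd(values, p, k):
    """Local reachable density: inverse of the mean reachable distance over N_k(p)."""
    nbrs = neighborhood(values, p, k)
    total = sum(max(k_distance(values, o, k), abs(values[p] - values[o])) for o in nbrs)
    if total == 0.0:       # duplicates: every neighbor coincides with p
        return math.inf    # assumed convention, not specified in the source
    return len(nbrs) / total

increments = [10.0, 11.0, 12.0, 13.0, 50.0]  # hypothetical increments
dense = lrd(increments, 1, 2)    # point inside the cluster
sparse = lrd(increments, 4, 2)   # isolated point
duplicated = lrd([5.0, 5.0, 5.0, 9.0], 0, 2)
```

The clustered point gets a much higher density than the isolated one, and the duplicated values trigger the infinite-density case.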
(6) local outlier factor: local outlier factor
Local outlier factor: by the definition of local reachable density, a data point that lies far from the other points clearly has a small local reachable density. However, the LOF algorithm measures the degree of abnormality of a data point not by its absolute local density but by its density relative to the surrounding neighboring data points. The advantage is that non-uniform data distributions and regions of differing density are accommodated. The local outlier factor is thus defined as a local relative density: the local relative density (local outlier factor) of a data point p is the ratio of the average local reachable density of the neighbors of p to the local reachable density of p itself.
The local outlier factor of point p is expressed as:
$$\mathrm{LOF}_k(p) = \frac{\sum_{o \in N_k(p)} \frac{\mathrm{lrd}_k(o)}{\mathrm{lrd}_k(p)}}{|N_k(p)|}$$
It is the average of the ratios of the local reachable density of each neighborhood point in $N_k(p)$ to the local reachable density of p.
LOF expresses the degree of abnormality of a sample as a numerical score. The score compares the average density at the locations of the points surrounding a sample point with the density at the location of the sample point itself. If the ratio is close to 1, the density of p is about the same as that of its neighborhood points, and p probably belongs to the same cluster as its neighborhood; if the ratio is less than 1, the density of p is higher than that of its neighborhood points, and p is a dense point; if the ratio is greater than 1, the density of p is lower than that of its neighborhood points, and p is more likely to be an outlier.
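Putting definitions (1) to (6) together gives a compact end-to-end sketch. It assumes Euclidean distance over feature tuples and data without enough coincident points to make two densities infinite at once; `lof_scores`, the sample data and the review threshold of 1.5 are all hypothetical, since the patent leaves the final judgment to the user.

```python
import math

def lof_scores(points, k):
    """Local outlier factor of every point, following definitions (1) to (6)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # k-distance and tie-inclusive k-distance neighborhood of each point
    kdist, nbrs = [], []
    for i in range(n):
        others = sorted(dist[i][j] for j in range(n) if j != i)
        kdist.append(others[k - 1])
        nbrs.append([j for j in range(n) if j != i and dist[i][j] <= kdist[i]])

    # local reachable density: inverse of the mean reachable distance
    lrd = []
    for i in range(n):
        total = sum(max(kdist[o], dist[i][o]) for o in nbrs[i])
        lrd.append(math.inf if total == 0 else len(nbrs[i]) / total)

    # LOF: mean ratio of each neighbor's density to the point's own density
    return [sum(lrd[o] / lrd[i] for o in nbrs[i]) / len(nbrs[i]) for i in range(n)]

# Hypothetical increments wrapped as 1-tuples; the last one is far from the rest.
pts = [(10.0,), (11.0,), (12.0,), (13.0,), (50.0,)]
scores = lof_scores(pts, 2)
flagged = [i for i, s in enumerate(scores) if s > 1.5]  # 1.5 is an assumed review threshold
```

The four clustered points score about 1 and the isolated point scores about 25, so only the last index is flagged for the user to review.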
S4: The user confirms the flagged data and backfills the valid information: the reading at a given moment and the increment over a given time span.
S5: The calibration data is simulated from the trend of the historical data for the corresponding period.
S6: The calibration takes effect after confirmation.
In the invention, the energy consumption data is first converted into a time series; unreliable data points are then screened out using the local outlier factor (LOF) principle; the valid information in the abnormal data is then substituted in, and the intermediate calibration data is simulated by a model prediction algorithm. As a result, the degree of abnormality of each data point can be quantified; the method only flags data that may be abnormal, together with its degree of abnormality, so that the user can locate it quickly, manual confirmation is allowed, and over-correction is avoided; and the valid information in the abnormal data is fully used, being brought into the model so that the intermediate missing data and abnormal data are filled in along the time series by predictive analysis.
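The source does not specify the prediction model used in S5. As one possible illustration only, the sketch below distributes a user-confirmed increment across a gap in proportion to the increments seen in the same time slots of a comparable past period. The function `simulate_gap`, its parameters and the proportional-allocation rule are all assumptions, not the patented model.

```python
def simulate_gap(start_reading, end_reading, profile):
    """Distribute a confirmed increment over a gap in proportion to a historical profile.

    start_reading, end_reading: user-confirmed valid readings bounding the gap (S4).
    profile: increments observed in the same time slots of a comparable past period.
    Returns the simulated reading at the end of each slot.
    """
    total = end_reading - start_reading
    weight = sum(profile)
    if weight <= 0:
        # No usable trend: fall back to an even split across the slots.
        shares = [total / len(profile)] * len(profile)
    else:
        shares = [total * w / weight for w in profile]

    readings, current = [], start_reading
    for inc in shares:
        current += inc
        readings.append(current)
    return readings

# Hypothetical gap of three slots between confirmed readings 100.0 and 108.0.
calibrated = simulate_gap(100.0, 108.0, [1.0, 2.0, 1.0])
print(calibrated)  # [102.0, 106.0, 108.0]
```

Anchoring the simulated series to the confirmed readings keeps the total consumption over the gap consistent with the backfilled information, while the historical profile only shapes how it is spread over time.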

Claims (5)

1. A calibration method for historical energy consumption data, characterized in that: the energy consumption data is first converted into a time series; unreliable data points are then screened out using the local outlier factor (LOF) principle; the valid information in the abnormal data is then substituted in, and the intermediate calibration data is simulated by a model prediction algorithm.
2. The calibration method for historical energy consumption data according to claim 1, comprising the following specific steps:
A. acquiring the historical energy consumption data of a single meter, converting it into a time series, and calculating the reading at each moment and the increment over the corresponding time interval, to obtain a data point for each moment of the meter;
B. quantifying the degree of abnormality of each data point using the time-series-based LOF algorithm;
C. the user confirming the degree of abnormality of the data points and backfilling the valid information;
D. simulating the calibration data from the trend of the historical data for the corresponding period;
E. the calibration taking effect after the user confirms it.
3. The calibration method for historical energy consumption data according to claim 2, wherein step B comprises the following specific steps:
B1. for each data point, calculating its distance to every other data point and sorting the distances from nearest to farthest;
B2. for each data point, finding its k nearest neighbors according to that ordering and calculating the LOF score.
4. The calibration method for historical energy consumption data according to claim 3, wherein the LOF score, i.e. the local outlier factor, is calculated as follows:
the local outlier factor of point p is expressed as:
$$\mathrm{LOF}_k(p) = \frac{\sum_{o \in N_k(p)} \frac{\mathrm{lrd}_k(o)}{\mathrm{lrd}_k(p)}}{|N_k(p)|}$$
the local reachable density of point p is expressed as:
$$\mathrm{lrd}_k(p) = 1 \Big/ \left( \frac{\sum_{o \in N_k(p)} \text{reach-distance}_k(p, o)}{|N_k(p)|} \right)$$
wherein the k-th distance neighborhood $N_k(p)$ of point p is the set of all points whose distance from p does not exceed the k-th distance of p, including points at exactly the k-th distance, so that $|N_k(p)| \ge k$;
the k-th reachable distance from point o to point p is defined as:
$$\text{reach-distance}_k(p, o) = \max\{k\text{-distance}(o),\ d(p, o)\}$$
wherein d(p, o) denotes the distance between two points p and o, and k-distance denotes the k-th nearest distance: the distance between point p and its k-th nearest point is the k-distance of p, written k-distance(p);
the k-th distance $d_k(p)$ of point p is defined as $d_k(p) = d(p, o)$, where o is the k-th nearest point to p.
5. The calibration method for historical energy consumption data according to claim 2, wherein in step C the valid information includes the reading at a given moment and the increment over a given time span.
CN202310329227.XA 2023-03-30 2023-03-30 Calibration method for energy consumption historical data Pending CN116304949A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310329227.XA CN116304949A (en) 2023-03-30 2023-03-30 Calibration method for energy consumption historical data

Publications (1)

Publication Number Publication Date
CN116304949A true CN116304949A (en) 2023-06-23

Family

ID=86822281

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310329227.XA Pending CN116304949A (en) 2023-03-30 2023-03-30 Calibration method for energy consumption historical data

Country Status (1)

Country Link
CN (1) CN116304949A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116660667A (en) * 2023-07-26 2023-08-29 山东金科电气股份有限公司 Transformer abnormality monitoring method and system
CN116660667B (en) * 2023-07-26 2023-10-24 山东金科电气股份有限公司 Transformer abnormality monitoring method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination