CN111813766A - Detection and processing method for abnormal data of gas quantity - Google Patents

Info

Publication number
CN111813766A
Authority
CN
China
Prior art keywords
data
gas
abnormal
window
gas quantity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010593482.1A
Other languages
Chinese (zh)
Inventor
栗风永
孙猛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai University of Electric Power
Shanghai Electric Power University
University of Shanghai for Science and Technology
Original Assignee
Shanghai Electric Power University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Electric Power University
Priority to CN202010593482.1A
Publication of CN111813766A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Quality & Reliability (AREA)
  • Artificial Intelligence (AREA)
  • Examining Or Testing Airtightness (AREA)

Abstract

The invention relates to a method for detecting and processing abnormal gas quantity data, comprising the following steps: acquiring a gas quantity data set and performing integrity detection on it to screen out primary normal data and vacant data; performing data interpolation on the vacant data, and then forming a first gas quantity data set together with the primary normal data; constructing a window function and sliding it over the first gas quantity data set to screen out secondary normal data and abnormal fluctuation data; and performing data interpolation on the abnormal fluctuation data, and then forming a second gas quantity data set together with the secondary normal data, thereby obtaining the gas quantity data after abnormal-data detection and processing. Compared with the prior art, the method detects abnormal data accurately and reliably through two rounds of screening, marking and data interpolation, while preserving the integrity and continuity of the gas quantity data as a whole.

Description

Detection and processing method for abnormal data of gas quantity
Technical Field
The invention relates to the technical field of gas quantity statistics, in particular to a method for detecting and processing abnormal gas quantity data.
Background
Gas quantity data is a gas enterprise's record of the gas consumption of each user, and is the basis for user behavior analysis and gas consumption prediction. As natural gas is used more and more widely, gas consumption increases year by year, and more and more problems appear in the gas quantity statistics of gas enterprises. The main problems are as follows:
1. During statistics, the recorded gas quantity data deviates considerably from the actual gas quantity because of factors such as maintenance, equipment failure, policy adjustment and equipment replacement;
2. The gas quantity data fluctuates widely, and the irregular gas consumption of some users makes its distribution irregular, so abnormal values are difficult to detect;
3. Few methods currently exist for processing abnormal values in gas quantity data, and enterprises pay little attention to such processing.
Most gas enterprises process abnormal values in gas quantity data with traditional, general-purpose methods such as descriptive statistical analysis or box plot analysis. These methods can detect some abnormal data, but at the cost of removing a large amount of normal data, and they fail to detect abnormal data that has weak regularity and deviates only slightly from the overall data. Because a large amount of normal data is removed while some abnormal data remains, the overall gas quantity data is incomplete, which strongly affects user gas quantity prediction and user behavior analysis and reduces their accuracy.
Disclosure of Invention
The present invention aims to overcome the above-mentioned drawbacks of the prior art and provide a method for detecting and processing abnormal gas quantity data, so as to accurately detect the abnormal data and correspondingly process the abnormal data, thereby ensuring the integrity and continuity of the gas quantity data.
The purpose of the invention can be achieved by the following technical solution: a method for detecting and processing abnormal gas quantity data, comprising the following steps:
S1, acquiring a gas quantity data set, and performing integrity detection on it to screen out primary normal data and vacant data;
S2, performing data interpolation on the vacant data, and then forming a first gas quantity data set together with the primary normal data;
S3, constructing a window function, and sliding it over the first gas quantity data set to screen out secondary normal data and abnormal fluctuation data;
S4, performing data interpolation on the abnormal fluctuation data, and then forming a second gas quantity data set together with the secondary normal data, thereby obtaining the gas quantity data after abnormal-data detection and processing.
Further, step S1 specifically includes the following steps:
S11, acquiring a gas quantity data set, and performing integrity detection on it to screen out primary normal data and to mark vacant data and data with a value less than 0, wherein the acquired gas quantity data set is:
$$D = [d_1, d_2, \ldots, d_i, \ldots, d_n]$$
wherein D is the acquired gas quantity data set, $d_i$ is the i-th gas quantity data, and n is the total number of acquired gas quantity data;
S12, deleting the data with a value less than 0 and marking it as vacant data.
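The integrity detection of S11 and S12 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `mark_vacant`, the use of `None` for a missing reading and the use of NaN as the "vacant" marker are all assumptions made for the example.

```python
import math

def mark_vacant(raw):
    """Return a copy of raw in which missing or negative readings become NaN.

    Assumption: a missing reading arrives as None (or already as NaN);
    NaN is used here as the 'vacant data' marker.
    """
    cleaned = []
    for value in raw:
        is_missing = value is None or (isinstance(value, float) and math.isnan(value))
        if is_missing or value < 0:
            cleaned.append(math.nan)      # S11: mark vacant; S12: drop negatives
        else:
            cleaned.append(float(value))  # primary normal data (zero is kept)
    return cleaned

# Hypothetical readings: one gap and one negative value.
readings = [120.0, None, 135.5, -4.0, 128.0]
flags = [math.isnan(v) for v in mark_vacant(readings)]
```

Note that only values strictly less than 0 are treated as invalid, following the text; a reading of exactly 0 is kept as normal data.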
Further, step S2 specifically includes the following steps:
S21, performing data interpolation on each vacant data item by the adjacent mean interpolation method, using the primary normal data adjacent to it on both sides;
S22, combining the data interpolated in step S21 with all the primary normal data into a first gas quantity data set.
Further, the specific process of interpolating the vacant data in step S21 is as follows: for the a-th gas quantity data $d_a$ marked as vacant data, the first primary normal data $d_{a-q}$ preceding it is selected by searching from right to left, and the first primary normal data $d_{a+q}$ following it is selected by searching from left to right; the interpolated data value is then:
$$C_a = \frac{d_{a-q} + d_{a+q}}{2}$$
wherein $C_a$ is the data value interpolated for the gas quantity data $d_a$.
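A short sketch of adjacent mean interpolation is given below, assuming vacant data is represented as NaN. The function name `fill_adjacent_mean` is hypothetical, and the handling of a gap at the very start or end of the series (copying the only available neighbour) is an assumption, since the text does not cover that case.

```python
import math

def fill_adjacent_mean(data):
    """Fill NaN entries with the mean of the nearest normal neighbours."""
    filled = list(data)
    n = len(data)
    for a, value in enumerate(data):
        if not math.isnan(value):
            continue
        # Search the ORIGINAL data outward, so only normal data is used.
        left = next((data[i] for i in range(a - 1, -1, -1)
                     if not math.isnan(data[i])), None)
        right = next((data[i] for i in range(a + 1, n)
                      if not math.isnan(data[i])), None)
        if left is not None and right is not None:
            filled[a] = (left + right) / 2
        elif left is not None:       # assumption: trailing gap copies left value
            filled[a] = left
        elif right is not None:      # assumption: leading gap copies right value
            filled[a] = right
    return filled

example = fill_adjacent_mean([10.0, math.nan, math.nan, 16.0, math.nan])
# example == [10.0, 13.0, 13.0, 16.0, 16.0]
```

Because the search always reads the original series, two consecutive gaps receive the same value, which keeps the result independent of the order in which gaps are filled.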
Further, step S3 specifically includes the following steps:
S31, establishing a window function model by setting the window size, the number of windows, the window function decision condition and the window sliding mode;
S32, judging each data item in the first gas quantity data set in turn, according to the sliding mode and the window function decision condition, to be either secondary normal data or abnormal fluctuation data.
Further, the number of windows is n-m, where n is the total number of gas quantity data and m is the window size, that is, the total number of data in a window. The window is defined as:
$$W_j = [d_j, d_{j+1}, \ldots, d_{j+m-1}], \quad j \in [1, n-m]$$
wherein $W_j$ is the j-th window, $d_j$ is the j-th gas quantity data, i.e. the first data of the j-th window, $d_{j+1}$ is the second data of the j-th window, and so on, and $d_{j+m-1}$ is the m-th data of the j-th window.
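The window definition can be made concrete with a small generator. This is only an illustration of the indexing; the name `windows` is hypothetical, and the 1-based index j of the text is mapped onto Python's 0-based lists.

```python
def windows(data, m):
    """Yield (j, W_j, d_{j+m}) for j = 1 .. n-m, with j 1-based as in the text."""
    n = len(data)
    for j in range(1, n - m + 1):
        start = j - 1                          # 0-based position of d_j
        yield j, data[start:start + m], data[start + m]

# Five data points with m = 3 give n - m = 2 windows.
listing = list(windows([1, 2, 3, 4, 5], 3))
# listing == [(1, [1, 2, 3], 4), (2, [2, 3, 4], 5)]
```

The last window is followed by the last data point, which is why the number of windows is n-m rather than n-m+1.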
Further, the data in the window are all non-null data.
Further, the window function decision condition is specifically:
$$W_{j\_mean} = \frac{1}{m}\sum_{k=j}^{j+m-1} d_k$$
[Equation image BDA0002556618940000032 not recovered: decision condition comparing the deviation of $d_{j+m}$ from $W_{j\_mean}$ with E]
wherein $W_{j\_mean}$ is the average of all data in the j-th window, $d_{j+m}$ is the (j+m)-th gas quantity data, i.e. the first data outside the j-th window, and E is a preset fluctuation deviation.
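The decision for a single window might look like the sketch below. The exact condition is given only as an equation image in the source, so the relative form |d_{j+m} - W_j_mean| / W_j_mean > E used here is an assumption, chosen because E is later expressed as a percentage. The function name `is_abnormal` and the treatment of a zero mean are also assumptions.

```python
def is_abnormal(window, next_value, E):
    """Return True if next_value deviates from the window mean by more than E.

    Assumed form: relative deviation |next - mean| / mean compared with E.
    """
    mean = sum(window) / len(window)
    if mean == 0:
        # Assumption: with a zero mean, any non-zero value counts as abnormal.
        return next_value != 0
    return abs(next_value - mean) / abs(mean) > E

# Window mean is 100; with E = 0.3 a value of 140 is flagged, 120 is not.
flag_high = is_abnormal([100, 110, 90], 140, 0.3)   # True
flag_ok = is_abnormal([100, 110, 90], 120, 0.3)     # False
```

Comparing against a local mean rather than a global statistic is what lets the check follow a series whose level changes over time.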
Further, the window sliding manner is specifically that the window sequentially slides from the first data to the last data of the data set.
Further, step S4 specifically includes the following steps:
S41, deleting the abnormal fluctuation data and marking it as vacant data;
S42, performing data interpolation on each vacant data item by the adjacent mean interpolation method, using the secondary normal data adjacent to it on both sides;
S43, combining the data interpolated in step S42 with all the secondary normal data into a second gas quantity data set.
Compared with the prior art, the invention has the following advantages:
First, the method uses the window function idea to segment the gas quantity data set so that each sub-segment is judged independently. Compared with anomaly detection over the gas quantity data as a whole, this improves detection accuracy; the method can also handle irregular gas quantity data, so its adaptability is markedly better.
Second, in view of the characteristics of gas quantity data, the integrity detection method first performs a primary screening for vacant data and data with a value less than 0 and marks all of it as vacant data; the window function method then performs a secondary screening for abnormal fluctuation data and likewise marks it as vacant data. Multiple rounds of screening ensure the accuracy of abnormal value detection while reducing the probability of misjudging normal data, improving the reliability of abnormal data detection.
Third, the gas quantity data marked as vacant data is interpolated by the adjacent mean interpolation method, which ensures the integrity and continuity of the whole gas quantity data set and benefits subsequent user gas quantity prediction and user behavior analysis.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention;
FIG. 2 is a schematic diagram of an embodiment of an application process;
FIG. 3a is a schematic diagram of the actual abnormal data in the sample one gas quantity data in the embodiment;
FIG. 3b is a schematic diagram of the result of detecting and processing the sample one gas quantity data by the method of the present invention in the embodiment;
FIG. 4a is a schematic diagram of the actual abnormal data in the sample two gas quantity data in the embodiment;
FIG. 4b is a schematic diagram of the result of detecting and processing the sample two gas quantity data by the method of the present invention in the embodiment;
FIG. 5a is a schematic diagram of the actual abnormal data in the sample three gas quantity data in the embodiment;
FIG. 5b is a schematic diagram of the result of detecting and processing the sample three gas quantity data by the method of the present invention in the embodiment.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments.
Examples
As shown in FIG. 1, a method for detecting and processing abnormal gas quantity data includes the following steps:
S1, acquiring a gas quantity data set, and performing integrity detection on it to screen out primary normal data and vacant data;
S2, performing data interpolation on the vacant data, and then forming a first gas quantity data set together with the primary normal data;
S3, constructing a window function, and sliding it over the first gas quantity data set to screen out secondary normal data and abnormal fluctuation data;
S4, performing data interpolation on the abnormal fluctuation data, and then forming a second gas quantity data set together with the secondary normal data, thereby obtaining the gas quantity data after abnormal-data detection and processing.
The method of the invention is applied in practice, and the specific process is shown in FIG. 2:
(a) performing integrity detection on the input gas quantity data;
(b) establishing a window function model;
(c) performing abnormal value detection on the input gas quantity data;
(d) processing the abnormal values.
Step (a) is further detailed below:
(a1) performing data integrity detection on the input data and marking vacant data;
(a2) performing data integrity detection on the input data and marking data smaller than 0;
(a3) deleting the data smaller than 0 and marking it as vacant data;
(a4) performing adjacent mean interpolation on the vacant data;
(a5) storing all the gas quantity data, including the interpolated data and the normal data.
Step (a1) is further detailed below:
(a11) the input gas quantity data is the set D containing n data items, $D = [d_1, d_2, \ldots, d_n]$;
(a12) detecting the vacant data and marking it as null;
Step (a4) is further detailed below:
(a41) for a data item marked as vacant, denoted $null_a$, the adjacent data on its left and right are $d_{a-1}$ and $d_{a+1}$;
(a42) according to the formula:
$$C_a = \frac{d_{a-1} + d_{a+1}}{2}$$
wherein $C_a$ is the value interpolated for $null_a$, with which $null_a$ is updated. If $d_{a-1}$ is marked as vacant data, continue taking $d_{a-2}$ and so on to the left until normal data appears; if $d_{a+1}$ is marked as vacant data, continue taking $d_{a+2}$ and so on to the right until normal data appears.
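As a worked illustration of steps (a41) and (a42) with made-up numbers, take $D = [12, null, null, 18]$. The example assumes that the search for a neighbour skips every item that was originally marked as vacant, even one that has already been filled; the text does not state this explicitly.

```latex
% a = 2: d_1 = 12 is normal; d_3 is vacant, so the search continues right to d_4 = 18
C_2 = \frac{d_1 + d_4}{2} = \frac{12 + 18}{2} = 15
% a = 3: d_2 was marked vacant, so the search continues left to d_1 = 12; d_4 = 18 is normal
C_3 = \frac{d_1 + d_4}{2} = \frac{12 + 18}{2} = 15
```

Under that assumption both gaps receive 15, and the completed set is $[12, 15, 15, 18]$.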
Step (b) is further detailed below:
(b1) setting the size m of the window, namely setting the total number of data in the window as m;
(b2) setting a judgment condition of the window function;
(b3) the sliding mode of the window function is set as follows: sliding from the head end to the tail end of the data in sequence;
(b4) the window function is saved.
Step (b2) is further detailed below:
(b21) setting the window $W_j$: the number of windows is n-m, and the j-th window ($j \in [1, n-m]$) is defined as $W_j = [d_j, d_{j+1}, \ldots, d_{j+m-1}]$. According to the formula:
$$W_{j\_mean} = \frac{1}{m}\sum_{k=j}^{j+m-1} d_k$$
the mean value $W_{j\_mean}$ of all data in window $W_j$ is found. If null exists in the data of $W_j$, the null is deleted and a non-null value is taken from the left, to ensure that $W_j$ contains m data items and no null value;
(b22) setting the decision formula of the window function, which calculates the deviation between the window and the next data $d_{j+m}$ outside the window:
[Equation image BDA0002556618940000062 not recovered: deviation of $d_{j+m}$ from $W_{j\_mean}$]
Step (c) is further detailed below:
(c1) setting the value of the preset deviation E (the appropriate deviation value differs between gas quantity data sets);
(c2) starting from j = 1, judging in turn, according to the decision condition of the window function, whether the data $d_{j+m}$ outside the window is an abnormal fluctuation value:
[Equation image BDA0002556618940000063 not recovered: condition under which $d_{j+m}$ is judged abnormal with respect to E]
(c3) clearing the data that meets the abnormal value decision condition, marking it as vacant data, and likewise setting it to null;
(c4) letting j = j + 1 and continuing to slide the window backward until $d_{j+m} = d_n$.
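Steps (b21) and (c1) to (c4) can be sketched together as a single loop. The sketch rests on several assumptions: null is represented as NaN, the decision condition is the relative deviation from the window mean (the source gives it only as an image), and when fewer than m non-null values exist to the left the available ones are used. The names `window_values` and `detect_fluctuations` are hypothetical.

```python
import math

def window_values(data, j0, m):
    """Collect up to m non-NaN values ending at index j0+m-1, reaching left past NaNs."""
    values = []
    i = j0 + m - 1
    while i >= 0 and len(values) < m:
        if not math.isnan(data[i]):
            values.append(data[i])
        i -= 1
    return values

def detect_fluctuations(data, m, E):
    """Slide the window from the head to the tail; return (data with NaN marks, flagged indices)."""
    data = list(data)
    flagged = []
    for j0 in range(len(data) - m):            # j0 = j - 1, so j runs 1 .. n-m
        target = j0 + m                        # index of d_{j+m}
        ref = window_values(data, j0, m)
        if not ref or math.isnan(data[target]):
            continue
        mean = sum(ref) / len(ref)
        # Assumed condition: relative deviation from the window mean exceeds E.
        if mean > 0 and abs(data[target] - mean) / mean > E:
            flagged.append(target)
            data[target] = math.nan            # (c3): mark as vacant / null
    return data, flagged

# Hypothetical series with a spike at index 3 and a drop at index 6.
series = [100.0, 102.0, 98.0, 300.0, 101.0, 99.0, 20.0, 100.0]
cleaned, flagged = detect_fluctuations(series, m=3, E=0.3)
# flagged == [3, 6]
```

Setting a flagged value to null before the window moves on keeps the spike out of later window means, so one abnormal point does not cause its normal successors to be flagged. The first m points are never tested in this sketch, since no full window precedes them.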
Step (d) is further detailed below:
(d1) performing data completion on the data marked as null by using an adjacent mean interpolation method;
(d2) storing all the gas quantity data, including the interpolated data and the normal data.
To verify the effectiveness of the method, this embodiment uses gas quantity data from 2017 to 2019 for a certain area of Guizhou province. Three samples (sample one to sample three) are selected from the data; the abnormal data in each sample is identified in advance by consulting maintenance records and equipment logs, and the method is then applied to detect and process the abnormal data.
Because the characteristics of gas quantity data differ between regions, this embodiment runs experiments on the choice of the deviation value E and the window size m. E starts at 10% and increases in steps of 10%; m starts at 1 and increases in steps of 1. Table 1 lists, for each value of E, the number of abnormal points detected by the algorithm and the number of normal samples misjudged; Table 2 lists the same for each value of m (m and E are independent of each other, so the choice of E is not affected by the value of m):
TABLE 1
Figure BDA0002556618940000071
TABLE 2
Figure BDA0002556618940000072
As can be seen from Table 1, the anomaly detection performance of the algorithm is best when E is 30%, so the preset deviation E in this embodiment is 30%.
As can be seen from Table 2, the anomaly detection performance of the algorithm is best when m is 3, so the window size m in this embodiment is 3.
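A parameter sweep of this kind could be scripted roughly as below. The series and the set of known abnormal indices are invented for illustration and are not the patent's data; the relative-deviation condition is again an assumption. The helper `detect` keeps a list of accepted values, which is equivalent to dropping flagged points from the window and reaching left for replacements.

```python
def detect(data, m, E):
    """Return indices flagged as abnormal; the first m points seed the window."""
    accepted = list(data[:m])
    hits = []
    for idx in range(m, len(data)):
        ref = accepted[-m:]                    # last m non-flagged values
        mean = sum(ref) / len(ref)
        if mean > 0 and abs(data[idx] - mean) / mean > E:
            hits.append(idx)                   # flagged: excluded from later windows
        else:
            accepted.append(data[idx])
    return hits

def score(data, known, m, E):
    """Return (known anomalies found, normal points misjudged) for one setting."""
    hits = set(detect(data, m, E))
    return len(hits & known), len(hits - known)

# Hypothetical series; indices 3 and 6 play the role of logged anomalies.
series = [100, 102, 98, 300, 101, 99, 20, 100, 125, 103]
known = {3, 6}
by_E = [(E, *score(series, known, 3, E)) for E in (0.1, 0.3, 0.9)]
# by_E == [(0.1, 2, 1), (0.3, 2, 0), (0.9, 1, 0)]
```

On this toy series a small E flags a normal point as well, while a large E misses one of the two anomalies, which mirrors the trade-off the tables are used to resolve.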
Using the established window function model, the three samples are tested. FIG. 3a, FIG. 4a and FIG. 5a show the actual abnormal data of the three samples, and FIG. 3b, FIG. 4b and FIG. 5b show the gas quantity data after detection and processing by the method of the present invention. Abnormal data is marked with boxes in the figures, and a single box may contain several abnormal data points.

Claims (10)

1. A method for detecting and processing abnormal data of gas quantity is characterized by comprising the following steps:
S1, acquiring a gas quantity data set, and performing integrity detection on it to screen out primary normal data and vacant data;
S2, performing data interpolation on the vacant data, and then forming a first gas quantity data set together with the primary normal data;
S3, constructing a window function, and sliding it over the first gas quantity data set to screen out secondary normal data and abnormal fluctuation data;
S4, performing data interpolation on the abnormal fluctuation data, and then forming a second gas quantity data set together with the secondary normal data, thereby obtaining the gas quantity data after abnormal-data detection and processing.
2. The method for detecting and processing the abnormal data of the amount of gas as claimed in claim 1, wherein the step S1 specifically comprises the following steps:
S11, acquiring a gas quantity data set, and performing integrity detection on it to screen out primary normal data and to mark vacant data and data with a value less than 0, wherein the acquired gas quantity data set is:
$$D = [d_1, d_2, \ldots, d_i, \ldots, d_n]$$
wherein D is the acquired gas quantity data set, $d_i$ is the i-th gas quantity data, and n is the total number of acquired gas quantity data;
S12, deleting the data with a value less than 0 and marking it as vacant data.
3. The method for detecting and processing the abnormal data of the amount of gas as claimed in claim 2, wherein the step S2 specifically comprises the following steps:
S21, performing data interpolation on each vacant data item by the adjacent mean interpolation method, using the primary normal data adjacent to it on both sides;
S22, combining the data interpolated in step S21 with all the primary normal data into a first gas quantity data set.
4. The method for detecting and processing abnormal gas quantity data according to claim 3, wherein the specific process of interpolating the vacant data in step S21 is as follows: for the a-th gas quantity data $d_a$ marked as vacant data, the first primary normal data $d_{a-q}$ preceding it is selected by searching from right to left, and the first primary normal data $d_{a+q}$ following it is selected by searching from left to right; the interpolated data value is then:
$$C_a = \frac{d_{a-q} + d_{a+q}}{2}$$
wherein $C_a$ is the data value interpolated for the gas quantity data $d_a$.
5. The method for detecting and processing the abnormal data of the amount of gas as claimed in claim 1, wherein the step S3 specifically comprises the following steps:
S31, establishing a window function model by setting the window size, the number of windows, the window function decision condition and the window sliding mode;
S32, judging each data item in the first gas quantity data set in turn, according to the sliding mode and the window function decision condition, to be either secondary normal data or abnormal fluctuation data.
6. The method for detecting and processing abnormal gas quantity data according to claim 5, wherein the number of windows is n-m, where n is the total number of acquired gas quantity data and m is the window size, i.e. the total number of data in a window, and the window is defined as:
$$W_j = [d_j, d_{j+1}, \ldots, d_{j+m-1}], \quad j \in [1, n-m]$$
wherein $W_j$ is the j-th window, $d_j$ is the j-th gas quantity data, i.e. the first data of the j-th window, $d_{j+1}$ is the second data of the j-th window, and so on, and $d_{j+m-1}$ is the m-th data of the j-th window.
7. The method for detecting and processing the abnormal data of the quantity of the gas as claimed in claim 6, wherein the data in the window are all non-blank data.
8. The method for detecting and processing abnormal data of gas quantity according to claim 7, wherein the window function determination condition is specifically:
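$$W_{j\_mean} = \frac{1}{m}\sum_{k=j}^{j+m-1} d_k$$
[Equation image FDA0002556618930000023 not recovered: decision condition comparing the deviation of $d_{j+m}$ from $W_{j\_mean}$ with E]
wherein $W_{j\_mean}$ is the average of all data in the j-th window, $d_{j+m}$ is the (j+m)-th gas quantity data, i.e. the first data outside the j-th window, and E is a preset fluctuation deviation.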
9. The method for detecting and processing abnormal data of gas amount according to claim 5, wherein the window sliding mode is that the window slides from the first data to the last data of the data set in sequence.
10. The method for detecting and processing the abnormal data of the amount of gas as claimed in claim 4, wherein the step S4 specifically comprises the following steps:
S41, deleting the abnormal fluctuation data and marking it as vacant data;
S42, performing data interpolation on each vacant data item by the adjacent mean interpolation method, using the secondary normal data adjacent to it on both sides;
S43, combining the data interpolated in step S42 with all the secondary normal data into a second gas quantity data set.
CN202010593482.1A 2020-06-27 2020-06-27 Detection and processing method for abnormal data of gas quantity Pending CN111813766A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010593482.1A CN111813766A (en) 2020-06-27 2020-06-27 Detection and processing method for abnormal data of gas quantity

Publications (1)

Publication Number Publication Date
CN111813766A true CN111813766A (en) 2020-10-23

Family

ID=72855006

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010593482.1A Pending CN111813766A (en) 2020-06-27 2020-06-27 Detection and processing method for abnormal data of gas quantity

Country Status (1)

Country Link
CN (1) CN111813766A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination