CN111813766A - Detection and processing method for abnormal data of gas quantity
- Publication number
- CN111813766A (application CN202010593482.1A)
- Authority
- CN
- China
- Prior art keywords
- data
- gas
- abnormal
- window
- gas quantity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/10—Pre-processing; Data cleansing
Abstract
The invention relates to a method for detecting and processing abnormal gas quantity data, comprising the following steps: acquiring a gas quantity data set and performing integrity detection on it to screen out primary normal data and vacant data; interpolating the vacant data and combining the result with the primary normal data to form a first gas quantity data set; constructing a window function and sliding it over the first gas quantity data set to screen out secondary normal data and abnormal fluctuation data; and interpolating the abnormal fluctuation data and combining the result with the secondary normal data to form a second gas quantity data set, which is the gas quantity data after abnormal data detection and processing. Compared with the prior art, the method detects abnormal data accurately and reliably through two rounds of screening, marking and interpolation, while preserving the integrity and continuity of the gas quantity data as a whole.
Description
Technical Field
The invention relates to the technical field of gas quantity statistics, in particular to a method for detecting and processing abnormal gas quantity data.
Background
Gas quantity data are a gas enterprise's records of each user's gas consumption, and they form the basis for user behavior analysis and gas consumption prediction. As natural gas becomes more widely used and consumption grows year by year, more and more problems appear in the gas quantity statistics of gas enterprises. The main problems are as follows:
1. During collection, factors such as maintenance, equipment failure, policy adjustment and equipment replacement cause the recorded gas quantity data to deviate substantially from actual consumption;
2. The gas quantity data fluctuate strongly, and irregular gas use by some users makes the distribution of the data irregular, so abnormal values are difficult to detect;
3. Few methods currently exist for handling abnormal values in gas quantity data, and enterprises pay little attention to handling them.
Most gas enterprises handle abnormal values with traditional, general-purpose methods such as descriptive statistical analysis or box plot analysis. These methods detect part of the abnormal data, but at the cost of discarding a large amount of normal data, and they fail to detect abnormal data that are weakly regular and deviate only slightly from the overall data. Because a large amount of normal data is removed while some abnormal data remain, the gas quantity data as a whole are degraded, which substantially reduces the accuracy of gas quantity prediction and user behavior analysis.
Disclosure of Invention
The present invention aims to overcome the above drawbacks of the prior art by providing a method for detecting and processing abnormal gas quantity data that detects abnormal data accurately and processes them accordingly, thereby ensuring the integrity and continuity of the gas quantity data.
The purpose of the invention is achieved by the following technical solution: a method for detecting and processing abnormal gas quantity data, comprising the following steps:
S1, acquiring a gas quantity data set and performing integrity detection on it to screen out primary normal data and vacant data;
S2, interpolating the vacant data and combining the result with the primary normal data to form a first gas quantity data set;
S3, constructing a window function and sliding it over the first gas quantity data set to screen out secondary normal data and abnormal fluctuation data;
S4, interpolating the abnormal fluctuation data and combining the result with the secondary normal data to form a second gas quantity data set, which is the gas quantity data after abnormal data detection and processing.
Further, the step S1 specifically includes the following steps:
S11, acquiring a gas quantity data set and performing integrity detection on it to screen out primary normal data and to mark vacant data and data with a value smaller than 0, wherein the acquired gas quantity data set is:
$$D = [d_1, d_2, \ldots, d_i, \ldots, d_n]$$
wherein $D$ is the acquired gas quantity data set, $d_i$ is the $i$-th gas quantity data point, and $n$ is the total number of acquired gas quantity data points;
S12, deleting the data with a value smaller than 0 and marking them as vacant data.
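Steps S11 and S12 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function name `integrity_check`, the use of `None` as the vacancy marker, and the treatment of NaN as vacant are all assumptions made for the example.

```python
import math
from typing import List, Optional, Sequence


def integrity_check(raw: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Return a copy of raw in which vacant and negative readings are None.

    Assumption: a reading counts as vacant when it is None or NaN.
    """
    checked: List[Optional[float]] = []
    for value in raw:
        is_vacant = value is None or (isinstance(value, float) and math.isnan(value))
        if is_vacant or value < 0:
            # S11/S12: vacant data and values below 0 are both marked as vacant
            checked.append(None)
        else:
            # Primary normal data are kept unchanged
            checked.append(float(value))
    return checked
```

The positions of the data are preserved, so the vacancy markers can later be filled by interpolation from their neighbors.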
Further, the step S2 specifically includes the following steps:
S21, interpolating each vacant data point from the adjacent primary normal data before and after it, using an adjacent mean interpolation method;
S22, combining the data interpolated in step S21 with all the primary normal data to form a first gas quantity data set.
Further, the vacant data are interpolated in step S21 as follows: for the $a$-th gas quantity data point $d_a$, marked as vacant data, the first primary normal data point $d_{a-q}$ before it is selected by searching from right to left, and the first primary normal data point $d_{a+q}$ after it is selected by searching from left to right; the interpolated data value is then:
$$C_a = \frac{d_{a-q} + d_{a+q}}{2}$$
wherein $C_a$ is the data value interpolated for the gas quantity data point $d_a$.
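The adjacent mean interpolation of step S21 might look like the sketch below. The name `adjacent_mean_interpolate` and the `None` vacancy marker are hypothetical. The text does not say what to do when a vacancy has a normal neighbor on only one side (at the start or end of the series), so copying the single available neighbor is an assumption of this example.

```python
from typing import List, Optional, Sequence


def adjacent_mean_interpolate(data: Sequence[Optional[float]]) -> List[float]:
    """Fill each None with the mean of the nearest normal values on both sides."""
    filled = list(data)
    for a, value in enumerate(data):
        if value is not None:
            continue
        # Search right-to-left for the first normal value before position a
        left = next((data[i] for i in range(a - 1, -1, -1) if data[i] is not None), None)
        # Search left-to-right for the first normal value after position a
        right = next((data[i] for i in range(a + 1, len(data)) if data[i] is not None), None)
        if left is not None and right is not None:
            filled[a] = (left + right) / 2
        elif left is not None or right is not None:
            # Assumption (not in the source): at a boundary, copy the only neighbor
            filled[a] = left if left is not None else right
        else:
            raise ValueError("no normal data available for interpolation")
    return filled
```

Neighbors are looked up in the original series rather than in the partially filled one, so consecutive vacancies all receive the mean of the same two normal values, matching the rule of skipping past vacant neighbors.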
Further, the step S3 specifically includes the following steps:
S31, establishing a window function model by setting the window size, the number of windows, the window function determination condition and the window sliding mode;
S32, judging each data point in the first gas quantity data set in turn to be secondary normal data or abnormal fluctuation data, according to the sliding mode and the window function determination condition.
Further, the number of windows is $n-m$, where $n$ is the total number of gas quantity data points and $m$ is the size of the window, that is, the number of data points in the window. The window is defined as:
$$W_j = [d_j, d_{j+1}, \ldots, d_{j+m-1}], \quad j \in [1, n-m]$$
wherein $W_j$ is the $j$-th window, $d_j$ is the $j$-th gas quantity data point, i.e. the first data point of the $j$-th window, $d_{j+1}$ is the second data point of the $j$-th window, and so on up to $d_{j+m-1}$, the $m$-th data point of the $j$-th window.
Further, the data in the window are all non-null data.
Further, the window function determination condition compares the deviation between $W_{j\_mean}$ and $d_{j+m}$ with $E$, wherein $W_{j\_mean}$ is the average of all data in the $j$-th window, $d_{j+m}$ is the $(j+m)$-th gas quantity data point, i.e. the first data point outside the $j$-th window, and $E$ is a preset fluctuation deviation.
Further, the window sliding manner is specifically that the window sequentially slides from the first data to the last data of the data set.
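The window definition, determination condition and sliding mode above can be combined into a short detection routine. The sketch below is illustrative only: the function name `detect_abnormal` is hypothetical, indices are 0-based, and the deviation is taken as the relative difference between $d_{j+m}$ and the window mean, which is an assumption suggested by $E$ being stated as a percentage elsewhere in the document.

```python
from typing import List, Sequence


def detect_abnormal(data: Sequence[float], m: int, e: float) -> List[int]:
    """Return 0-based indices of points flagged as abnormal fluctuation data.

    data: first gas quantity data set (no vacant values)
    m:    window size
    e:    preset fluctuation deviation as a fraction, e.g. 0.3 for 30%
    """
    flagged: List[int] = []
    n = len(data)
    for j in range(n - m):               # n - m windows in total
        window = data[j:j + m]           # W_j = [d_j, ..., d_{j+m-1}]
        mean = sum(window) / m           # W_j_mean
        candidate = data[j + m]          # first data point outside the window
        if mean == 0:
            # Assumption: with a zero mean, any nonzero candidate is abnormal
            deviation = 0.0 if candidate == 0 else float("inf")
        else:
            deviation = abs(candidate - mean) / abs(mean)
        if deviation > e:
            flagged.append(j + m)
    return flagged
```

Note that the first $m$ data points are never candidates, because every judged point needs a full window before it.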
Further, the step S4 specifically includes the following steps:
S41, deleting the abnormal fluctuation data and marking them as vacant data;
S42, interpolating each vacant data point from the adjacent secondary normal data before and after it, using the adjacent mean interpolation method;
S43, combining the data interpolated in step S42 with all the secondary normal data to form a second gas quantity data set.
Compared with the prior art, the invention has the following advantages:
First, the method uses the window function idea to segment the gas quantity data set so that each sub-segment is judged independently. Compared with detecting abnormalities over the gas quantity data as a whole, this improves the accuracy of abnormality detection; it also copes with irregular gas quantity data, which markedly improves adaptability.
Second, in view of the characteristics of gas quantity data, an integrity detection method first screens out vacant data and data with a value smaller than 0 and marks them all as vacant data; a window function method then screens out abnormal fluctuation data and likewise marks them as vacant data. This repeated screening ensures the accuracy of abnormal value detection, reduces the probability of misjudging normal data, and improves the reliability of abnormal data detection.
Third, the gas quantity data marked as vacant are interpolated using the adjacent mean interpolation method, which ensures the integrity and continuity of the whole gas quantity data set and benefits subsequent gas quantity prediction and user behavior analysis.
Drawings
FIG. 1 is a schematic flow diagram of the method of the present invention;
FIG. 2 is a schematic diagram of the application process in the embodiment;
FIG. 3a is a schematic diagram of the actual abnormal data in the gas quantity data of sample one in the embodiment;
FIG. 3b is a schematic diagram of the result of detecting and processing the gas quantity data of sample one by the method of the present invention in the embodiment;
FIG. 4a is a schematic diagram of the actual abnormal data in the gas quantity data of sample two in the embodiment;
FIG. 4b is a schematic diagram of the result of detecting and processing the gas quantity data of sample two by the method of the present invention in the embodiment;
FIG. 5a is a schematic diagram of the actual abnormal data in the gas quantity data of sample three in the embodiment;
FIG. 5b is a schematic diagram of the result of detecting and processing the gas quantity data of sample three by the method of the present invention in the embodiment.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments.
Examples
As shown in fig. 1, a method for detecting and processing abnormal gas quantity data includes the following steps:
S1, acquiring a gas quantity data set and performing integrity detection on it to screen out primary normal data and vacant data;
S2, interpolating the vacant data and combining the result with the primary normal data to form a first gas quantity data set;
S3, constructing a window function and sliding it over the first gas quantity data set to screen out secondary normal data and abnormal fluctuation data;
S4, interpolating the abnormal fluctuation data and combining the result with the secondary normal data to form a second gas quantity data set, which is the gas quantity data after abnormal data detection and processing.
The method of the invention is applied in practice, and the specific process is shown in FIG. 2:
(a) performing integrity detection on the input gas quantity data;
(b) establishing a window function model;
(c) performing abnormal value detection on the input gas quantity data;
(d) processing the abnormal values.
Step (a) is further detailed below:
(a1) performing data integrity detection on the input data and marking vacant data;
(a2) performing data integrity detection on the input data and marking data smaller than 0;
(a3) deleting the data smaller than 0 and marking them as vacant data;
(a4) performing adjacent mean interpolation on the vacant data;
(a5) storing all the gas quantity data, including the interpolated data and the normal data.
Step (a1) is further detailed below:
(a11) the input gas quantity data form a set $D$ of $n$ data points, $D = [d_1, d_2, \ldots, d_n]$;
(a12) detecting the vacant data and marking them as null.
Step (a4) is further detailed below:
(a41) denoting the vacant data point as $null_a$, with adjacent data $d_{a-1}$ and $d_{a+1}$ on its left and right;
(a42) according to the formula:
$$C_a = \frac{d_{a-1} + d_{a+1}}{2}$$
wherein $C_a$ is the value interpolated for $null_a$, which is used to update $null_a$. If $d_{a-1}$ is marked as vacant data, continue to the left to $d_{a-2}$ and so on until normal data appear; if $d_{a+1}$ is marked as vacant data, continue to the right to $d_{a+2}$ and so on until normal data appear.
Step (b) is further detailed below:
(b1) setting the size m of the window, namely setting the total number of data in the window as m;
(b2) setting a judgment condition of the window function;
(b3) the sliding mode of the window function is set as follows: sliding from the head end to the tail end of the data in sequence;
(b4) the window function is saved.
Step (b2) is further detailed below:
(b21) setting the window $W_j$: the number of windows is $n-m$, and the $j$-th window ($j \in [1, n-m]$) is defined as $W_j = [d_j, d_{j+1}, \ldots, d_{j+m-1}]$. According to the formula:
$$W_{j\_mean} = \frac{1}{m} \sum_{k=j}^{j+m-1} d_k$$
the mean value $W_{j\_mean}$ of all data in the window $W_j$ is found. If $W_j$ contains null data, the null data are deleted and non-null values are taken further to the left, so that $W_j$ contains $m$ data points and no null values;
(b22) setting the decision formula of the window function, which calculates the deviation between the window mean $W_{j\_mean}$ and the next data point outside the window, $d_{j+m}$.
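As a worked illustration of (b21) and (b22), suppose $m = 3$, the window holds the hypothetical values 100, 102 and 98, and the next data point is 180. The relative form of the deviation used here is an assumption, chosen because $E$ is expressed as a percentage in this embodiment; the numbers are invented for the example.

```latex
W_{j\_mean} = \frac{100 + 102 + 98}{3} = 100, \qquad
\frac{\lvert d_{j+m} - W_{j\_mean} \rvert}{W_{j\_mean}}
  = \frac{\lvert 180 - 100 \rvert}{100} = 0.8 > E = 0.3
```

Under this assumption the point 180 would be judged an abnormal fluctuation value, whereas a next point of 101 would give a deviation of 0.01 and be kept as normal data.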
Step (c) is further detailed below:
(c1) setting the value of the preset deviation $E$ (the appropriate deviation value differs between gas quantity data sets);
(c2) starting from $j = 1$, judging in turn, according to the determination condition of the window function, whether the data point $d_{j+m}$ outside the window is an abnormal fluctuation value;
(c3) clearing the data that meet the abnormal value determination condition and marking them as vacant data, likewise set to null;
(c4) letting $j = j + 1$ and continuing to slide the window backward until $d_{j+m} = d_n$.
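Steps (c2) to (c4), together with the rule that a window containing null reaches further left for non-null values, can be sketched as below. The name `sliding_detect`, the `None` marker for null and the relative form of the deviation are assumptions of this example. A point that has just been set to null is excluded from the windows that follow it, which is how the sketch interprets the window rule.

```python
from typing import List, Optional, Sequence


def sliding_detect(data: Sequence[Optional[float]], m: int, e: float) -> List[Optional[float]]:
    """Return a copy of data with abnormal fluctuation values replaced by None."""
    result = list(data)
    for k in range(m, len(result)):      # k plays the role of j + m
        candidate = result[k]
        if candidate is None:
            continue
        # Take the m nearest non-null values to the left of the candidate
        window = [v for v in reversed(result[:k]) if v is not None][:m]
        if len(window) < m:
            continue  # Assumption: skip until m non-null values are available
        mean = sum(window) / m
        if mean == 0:
            deviation = 0.0 if candidate == 0 else float("inf")
        else:
            deviation = abs(candidate - mean) / abs(mean)
        if deviation > e:
            result[k] = None             # (c3): clear and mark as null
    return result
```

Because flagged points are skipped when building later windows, a run of consecutive abnormal values is compared against the last $m$ normal values rather than against each other.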
Step (d) is further detailed below:
(d1) completing the data marked as null using the adjacent mean interpolation method;
(d2) storing all the gas quantity data, including the interpolated data and the normal data.
To verify the effectiveness of the method, this embodiment uses gas quantity data from 2017 to 2019 for a certain area of Guizhou Province. Three samples (sample one to sample three) are selected from the data; the abnormal data in each sample are identified in advance by consulting maintenance records and equipment logs, and the method is then applied to detect and process the abnormal data.
Because the characteristics of gas quantity data differ between regions, this embodiment experiments with the choice of the deviation value $E$ and the window size $m$. $E$ starts at 10% and increases in steps of 10%; $m$ starts at 1 and increases in steps of 1. Table 1 lists, for each value of $E$, the number of abnormal points detected by the algorithm and the number of normal samples it misjudged; Table 2 lists the same quantities for each value of $m$ (since $m$ and $E$ are independent of each other, the choice of $E$ is not affected by the value of $m$):
TABLE 1
TABLE 2
As can be seen from Table 1, the anomaly detection performance of the algorithm is best when $E$ is 30%, so the preset deviation $E$ in this embodiment is 30%.
As can be seen from Table 2, the anomaly detection performance of the algorithm is best when $m$ is 3, so the window size $m$ in this embodiment is 3.
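The two quantities compared when choosing $E$ and $m$, the number of abnormal points detected and the number of normal samples misjudged, could be counted as in the sketch below. The function name `score_detection` and the representation of data points by integer indices are assumptions; how the reference abnormal points are stored is not specified in the text.

```python
from typing import Iterable, Tuple


def score_detection(flagged: Iterable[int], known_abnormal: Iterable[int]) -> Tuple[int, int]:
    """Return (correctly detected abnormal points, misjudged normal points).

    flagged:        indices the detector marked as abnormal
    known_abnormal: indices known to be abnormal, e.g. from maintenance records
    """
    flagged_set, known_set = set(flagged), set(known_abnormal)
    detected = len(flagged_set & known_set)    # true abnormal points that were found
    misjudged = len(flagged_set - known_set)   # normal points wrongly flagged
    return detected, misjudged
```

Running a detector once per candidate value of $E$ (or $m$) and scoring each run this way yields one row of such a comparison per setting.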
The three samples are tested with the established window function model. FIG. 3a, FIG. 4a and FIG. 5a show the actual abnormal data of the three samples, and FIG. 3b, FIG. 4b and FIG. 5b show the gas quantity data after detection and processing by the method of the present invention. Abnormal data are marked with boxes in the figures, and some boxes contain several abnormal data points.
Claims (10)
1. A method for detecting and processing abnormal data of gas quantity is characterized by comprising the following steps:
S1, acquiring a gas quantity data set and performing integrity detection on it to screen out primary normal data and vacant data;
S2, interpolating the vacant data and combining the result with the primary normal data to form a first gas quantity data set;
S3, constructing a window function and sliding it over the first gas quantity data set to screen out secondary normal data and abnormal fluctuation data;
S4, interpolating the abnormal fluctuation data and combining the result with the secondary normal data to form a second gas quantity data set, which is the gas quantity data after abnormal data detection and processing.
2. The method for detecting and processing the abnormal data of the amount of gas as claimed in claim 1, wherein the step S1 specifically comprises the following steps:
S11, acquiring a gas quantity data set and performing integrity detection on it to screen out primary normal data and to mark vacant data and data with a value smaller than 0, wherein the acquired gas quantity data set is:
$$D = [d_1, d_2, \ldots, d_i, \ldots, d_n]$$
wherein $D$ is the acquired gas quantity data set, $d_i$ is the $i$-th gas quantity data point, and $n$ is the total number of acquired gas quantity data points;
S12, deleting the data with a value smaller than 0 and marking them as vacant data.
3. The method for detecting and processing the abnormal data of the amount of gas as claimed in claim 2, wherein the step S2 specifically comprises the following steps:
S21, interpolating each vacant data point from the adjacent primary normal data before and after it, using an adjacent mean interpolation method;
S22, combining the data interpolated in step S21 with all the primary normal data to form a first gas quantity data set.
4. The method for detecting and processing abnormal gas quantity data according to claim 3, wherein the vacant data are interpolated in step S21 as follows: for the $a$-th gas quantity data point $d_a$, marked as vacant data, the first primary normal data point $d_{a-q}$ before it is selected by searching from right to left, and the first primary normal data point $d_{a+q}$ after it is selected by searching from left to right; the interpolated data value is then:
$$C_a = \frac{d_{a-q} + d_{a+q}}{2}$$
wherein $C_a$ is the data value interpolated for the gas quantity data point $d_a$.
5. The method for detecting and processing the abnormal data of the amount of gas as claimed in claim 1, wherein the step S3 specifically comprises the following steps:
S31, establishing a window function model by setting the window size, the number of windows, the window function determination condition and the window sliding mode;
S32, judging each data point in the first gas quantity data set in turn to be secondary normal data or abnormal fluctuation data, according to the sliding mode and the window function determination condition.
6. The method for detecting and processing abnormal gas quantity data according to claim 5, wherein the number of windows is $n-m$, where $n$ is the total number of acquired gas quantity data points and $m$ is the size of the window, i.e. the number of data points in the window, and the window is defined as:
$$W_j = [d_j, d_{j+1}, \ldots, d_{j+m-1}], \quad j \in [1, n-m]$$
wherein $W_j$ is the $j$-th window, $d_j$ is the $j$-th gas quantity data point, i.e. the first data point of the $j$-th window, $d_{j+1}$ is the second data point of the $j$-th window, and so on up to $d_{j+m-1}$, the $m$-th data point of the $j$-th window.
7. The method for detecting and processing the abnormal data of the quantity of the gas as claimed in claim 6, wherein the data in the window are all non-blank data.
8. The method for detecting and processing abnormal gas quantity data according to claim 7, wherein the window function determination condition compares the deviation between $W_{j\_mean}$ and $d_{j+m}$ with $E$, wherein $W_{j\_mean}$ is the average of all data in the $j$-th window, $d_{j+m}$ is the $(j+m)$-th gas quantity data point, i.e. the first data point outside the $j$-th window, and $E$ is a preset fluctuation deviation.
9. The method for detecting and processing abnormal data of gas amount according to claim 5, wherein the window sliding mode is that the window slides from the first data to the last data of the data set in sequence.
10. The method for detecting and processing the abnormal data of the amount of gas as claimed in claim 4, wherein the step S4 specifically comprises the following steps:
S41, deleting the abnormal fluctuation data and marking them as vacant data;
S42, interpolating each vacant data point from the adjacent secondary normal data before and after it, using the adjacent mean interpolation method;
S43, combining the data interpolated in step S42 with all the secondary normal data to form a second gas quantity data set.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010593482.1A CN111813766A (en) | 2020-06-27 | 2020-06-27 | Detection and processing method for abnormal data of gas quantity |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111813766A true CN111813766A (en) | 2020-10-23 |
Family
ID=72855006
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010593482.1A Pending CN111813766A (en) | 2020-06-27 | 2020-06-27 | Detection and processing method for abnormal data of gas quantity |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111813766A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11809517B1 (en) * | 2022-09-21 | 2023-11-07 | Southwest Jiaotong University | Adaptive method of cleaning structural health monitoring data based on local outlier factor |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103646167A (en) * | 2013-11-22 | 2014-03-19 | 北京空间飞行器总体设计部 | Satellite abnormal condition detection system based on telemeasuring data |
CN107944464A (en) * | 2017-10-12 | 2018-04-20 | 华南理工大学 | A kind of office building by when energy consumption abnormal data online recognition and complementing method |
CN109307811A (en) * | 2018-08-06 | 2019-02-05 | 国网浙江省电力有限公司宁波供电公司 | A kind of user's dedicated transformer electricity consumption monitoring method excavated based on big data |
CN109727446A (en) * | 2019-01-15 | 2019-05-07 | 华北电力大学(保定) | A kind of identification and processing method of electricity consumption data exceptional value |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||