CN116150239B - Data mining method for gas stealing behavior - Google Patents

Info

Publication number
CN116150239B
Authority
CN
China
Prior art keywords
user
day
gas
abnormal
gas consumption
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211621446.7A
Other languages
Chinese (zh)
Other versions
CN116150239A (en
Inventor
张正有
花磊
阳博
周航
吴叶
鲜忠亚
范忠義
孙瑞
周光君
徐佳辉
周业沛
王宇
郭俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Pengzhou China Resources Gas Co ltd
Original Assignee
Pengzhou China Resources Gas Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pengzhou China Resources Gas Co ltd filed Critical Pengzhou China Resources Gas Co ltd
Priority to CN202211621446.7A priority Critical patent/CN116150239B/en
Publication of CN116150239A publication Critical patent/CN116150239A/en
Application granted granted Critical
Publication of CN116150239B publication Critical patent/CN116150239B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06Electricity, gas or water supply
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to a data mining method for gas stealing behavior, comprising the following steps: acquiring gas consumption data of a user, analyzing the gas consumption data, and judging whether the user is an abnormal user. Abnormal users fall into four types: negative anomaly users, zero anomaly users, small gas consumption users, and users inconsistent with recent rules. The invention classifies abnormal gas data into these four types, mines and analyzes the gas consumption data of gas users, locates users with abnormal gas usage, and automatically identifies and narrows the range of abnormal users, making on-site inspection easier and reducing the inspection workload of staff.

Description

Data mining method for gas stealing behavior
Technical Field
The invention relates to the technical field of data mining processing, in particular to a data mining method aiming at gas stealing behavior.
Background
For a user with a normal gas usage pattern, the gas usage curve constructed with the historical cumulative gas volume over about one year as the ordinate and the gas usage time as the abscissa is substantially monotonic and linear.
For a user who steals gas, the picture is different. On the one hand, gas stealing takes many forms; common means include rolling back the meter's counter wheels, installing a bypass pipe around the metering equipment, and damaging the metering device. Whatever the means, the resulting gas consumption data show one of four conditions: a negative gas consumption anomaly, a zero gas consumption anomaly, a small gas consumption anomaly, or an anomaly that does not match the user's own gas usage pattern. Classifying the gas consumption data by these characteristics and extracting a targeted data mining model for each helps locate abnormal users.
On the other hand, inferring gas theft from the user's gas consumption alone is, mathematically, a problem with multiple solutions: gas theft necessarily changes the gas consumption curve, but a change in the gas consumption curve is not necessarily caused by gas theft. It may also reflect a change in the user's gas usage pattern, for example reduced customer traffic due to road construction, seasonal factors, or poor business performance. With only gas consumption data available, further mining analysis is needed to locate users with abnormal gas usage so that staff can check the anomalies on site.
Disclosure of Invention
The invention aims to mine and analyze the gas consumption data of gas users, locate users with abnormal gas usage, and automatically identify and narrow the range of abnormal users, so that staff can conveniently check anomalies on site and their workload is reduced. To this end, a data mining method for gas stealing behavior is provided.
In order to achieve the above object, the embodiment of the present invention provides the following technical solutions:
a data mining method for gas stealing behavior comprises the following steps:
acquiring gas consumption data of a user, analyzing the gas consumption data, and judging whether the user is an abnormal user;
the judgment of abnormal users covers:
negative anomaly users: the user's gas consumption data are formed into a sequence in units of days, and if the number of non-monotonic points in the sequence is greater than a set threshold, the user is judged to be an abnormal user;
zero anomaly users: the days of continuous zero gas consumption are screened out from the user's gas consumption data, the Poisson cumulative probability for the user is calculated, and if the cumulative probability is greater than a set probability threshold, the user is judged to be an abnormal user;
small gas consumption users: fractal dimension processing is performed on the user's gas consumption data by number of days, the fractal distance sequence over all days is computed, and the skewness and loss of the fractal distance sequence are calculated to judge whether the user is an abnormal user;
users inconsistent with recent rules: a gray-scale image is constructed from the user's gas consumption data by hour and day, and abnormal users are found from the gray-scale image.
Further, the specific judging step of the negative abnormal user includes:
taking the day as the unit for the user's gas consumption data, and connecting the first reading and the last reading of each day to form a sequence, the total number of sequences being n; arranging the n sequences in time order, counting the number m of non-monotonic points among the n sequences, and calculating the proportion of non-monotonic points P = m/n;
setting a threshold T1, and if P < T1, listing the user's gas equipment as equipment to be inspected;
setting thresholds T2 and T3, and calculating the negative anomaly size G of the equipment to be inspected; if G < T2, the user is judged to be a first-level abnormal user; if T2 ≤ G ≤ T3, a second-level abnormal user; if G > T3, a third-level abnormal user.
Further, the step of arranging n sequences in time sequence and counting the number m of non-monotonic points in the n sequences includes:
extracting a first reading x1 and a last reading x2 of the same day, and extracting a minimum reading y1 and a maximum reading y2 of the same day;
if the minimum reading y1 is smaller than the first reading x1 or the maximum reading y2 is larger than the last reading x2, judging that the sequence on the current day is a non-monotonic point;
or the last reading x2 of the day is smaller than the first reading x1, and the sequence of the day is also judged to be a non-monotonic point.
Further, the specific judging step of the zero anomaly user includes:
screening the number of days of continuous zero gas consumption from the gas consumption data of the users, taking the starting time to the ending time of the zero gas consumption as a section, and counting the number of sections of the zero gas consumption of each user gas equipment;
establishing a Poisson distribution model from the numbers of zero gas consumption segments of all users' gas equipment, calculating the cumulative probability of the Poisson distribution model, and setting a probability threshold; if the cumulative probability is greater than the probability threshold, the user is judged to be an abnormal user.
Further, the specific judging step of the small gas consumption user comprises the following steps:
recording the user's gas consumption data as pairs of date and cumulative gas consumption, converting the dates into day numbers counted from the start date so that each pair becomes a day number and a cumulative gas consumption, and calculating the correlation coefficient $r$:
$$r = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2}\,\sqrt{\sum_{i=1}^{n}(Y_i-\bar{Y})^2}}$$
where $X$ is the converted day number, $\bar{X}$ is the mean of $X$, $Y$ is the cumulative gas consumption, $\bar{Y}$ is the mean of $Y$, $n$ is the total number of days, and $i$ denotes the i-th day;
setting a coefficient threshold, and for users whose correlation coefficient is less than the coefficient threshold, performing fractal dimension processing on the gas consumption data to obtain an n-day gas consumption curve, whose abscissa is the number of days and whose ordinate is the cumulative gas consumption;
calculating the distance $d_{11}$ between day 1 and day 2, the distance $d_{12}$ between day 2 and day 3, and so on, to obtain $d_1 = d_{11} + d_{12} + \ldots$;
calculating the distance $d_{21}$ between day 1 and day 3, the distance $d_{22}$ between day 3 and day 5, and so on, to obtain $d_2 = d_{21} + d_{22} + \ldots$;
calculating the distance $d_{31}$ between day 1 and day 4, the distance $d_{32}$ between day 4 and day 7, and so on, to obtain $d_3 = d_{31} + d_{32} + \ldots$;
and continuing until the distance between day 1 and day n is calculated, giving $d_{n-1}$;
calculating the skewness Skew of the sequence $d = \{d_1, d_2, \ldots, d_{n-1}\}$ from the mean and the standard deviation of the sequence d, where n is the total number of days;
calculating the Loss of the sequence d from max(d), the maximum value in the sequence d, and min(d), the minimum value in the sequence d;
and setting a skewness threshold and a Loss threshold; if the calculated skewness Skew is greater than the skewness threshold, or the calculated Loss is greater than the Loss threshold, the user is judged to be an abnormal small gas consumption user.
Further, the specific judging step of the user inconsistent with the recent rule includes:
acquiring the user's hourly gas consumption data in units of days, and constructing from it a two-dimensional matrix Mat(i, j), where Mat(i, j) is the gas consumption in the j-th hour of the i-th day; the abscissa of the matrix is the hour, the ordinate is the date, and each element represents the gas consumption of one hour;
normalizing the two-dimensional matrix Mat(i, j) and quantizing it to integers between 0 and 255 to form a gray image Gray(i, j);
counting the number of pixels h(t) at each gray level t in the gray image Gray(i, j);
converting the number of occurrences of gray level t into a probability distribution p(t):
$$p(t) = \frac{h(t)}{\text{image width} \times \text{image height}}$$
calculating the cumulative gray-level probability distribution hp(t):
$$hp(t) = \sum_{k=0}^{t} p(k)$$
where t denotes a gray level, $t \in [0, 255]$, and k denotes the k-th gray level;
contrast of the Gray image Gray (i, j) is enhanced:
Gray(i,j) = 255 * hp(Gray(i,j))
performing k-means clustering on the rows of the gray image Gray(i, j) with the number of classes set to 2, the Euclidean distance between the data of two days being:
$$\mathrm{dist}(s_1, s_2) = \sqrt{\sum_{j}\bigl(\mathrm{Gray}(s_1, j) - \mathrm{Gray}(s_2, j)\bigr)^2}$$
where s1 and s2 denote any two days and j runs over the hours of a day;
obtaining two cluster centers C1 and C2 from the k-means clustering, setting a long-term range and a near-term range, and judging whether C1 and C2 fall in the long-term range and the near-term range respectively; if so, the user is judged to be a user inconsistent with recent rules.
Compared with the prior art, the invention has the beneficial effects that:
The invention classifies abnormal gas data into four types, mines and analyzes the gas consumption data of gas users, locates users with abnormal gas usage, and automatically identifies and narrows the range of abnormal users, making on-site inspection easier and reducing the inspection workload of staff.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a graph of gas consumption for an example negative anomaly user;
FIG. 2 is a non-monotonic point visualization of an embodiment negative anomaly user;
FIG. 3 is a gas column diagram for an embodiment zero anomaly user;
FIG. 4 is a poisson distribution diagram of gas consumption for an example zero anomaly user;
FIG. 5 is a graph of gas usage for an example small gas usage user;
FIG. 6 is a gray scale image of the air consumption of a user with inconsistent recent regularity according to the embodiment.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present invention.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures. Also, in the description of the present invention, the terms "first," "second," and the like are used merely to distinguish one from another, and are not to be construed as indicating or implying a relative importance or implying any actual such relationship or order between such entities or operations. In addition, the terms "connected," "coupled," and the like may be used to denote a direct connection between elements, or an indirect connection via other elements.
The invention is realized by the following technical scheme: a data mining method for gas stealing behavior, in which the gas consumption data of a user is acquired and analyzed to judge whether the user is an abnormal user. Abnormal users are divided into four types: negative anomaly users, zero anomaly users, small gas consumption users, and users inconsistent with recent rules.
(I) Negative anomaly user
A negative anomaly is a negative increment in the gas consumption data. It may result from the user stealing gas by rolling back the meter's counter wheels, or from an abnormal reading or engineering work on the gas equipment, so the ambiguity of this phenomenon is low. Referring to fig. 1, a negative anomaly caused by gas stealing has two characteristics in the gas consumption curve: first, a red column appears on the graph (the portion outlined by the dashed line in fig. 1); second, the data on both sides of the negative anomaly are normal gas consumption data.
When judging a negative anomaly user, the user's gas consumption data are first organized in units of days, and the first and last readings of each day are connected to form a sequence. For example, given the user's gas consumption data from January 1 to December 31, the first and last readings of January 1 form sequence 1, and there are 365 sequences in total (assuming 365 days from January 1 to December 31). The non-monotonic points among sequences 1 to 365 are then counted: for each day, the first reading x1 and the last reading x2 are extracted, together with the minimum reading y1 and the maximum reading y2 of that day. If the minimum reading y1 is smaller than the first reading x1, or the maximum reading y2 is larger than the last reading x2, the sequence for that day is judged to be a non-monotonic point; if the last reading x2 is smaller than the first reading x1, the sequence for that day is likewise judged to be a non-monotonic point. For example, if the first reading at 0:00 is x1 = 10 and the last reading x2 at 24:00 is smaller than 10, the sequence for that day is a non-monotonic point; likewise, if the first reading at 0:00 is x1 = 10 and the last reading at 24:00 is x2 = 20, but the minimum reading y1 is less than x1 or the maximum reading y2 is greater than x2, the sequence is also a non-monotonic point. In visual form, fig. 2 shows the gas consumption data from 0:00 to 24:00 of one day in which the first reading x1 at 0:00 is smaller than the last reading x2 at 24:00; point A represents the minimum reading y1 of the day and point B the maximum reading y2. If the column has an upper and/or lower line, then y1 < x1 and/or y2 > x2, and the day is a non-monotonic point. Non-monotonic points can therefore be detected quickly from the visual chart.
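The non-monotonic point rule described above can be sketched as follows. This is a minimal illustration, assuming each day is given as a list of cumulative meter readings in time order; the function names `is_non_monotonic_day` and `count_non_monotonic_days` are hypothetical and do not come from the patent.

```python
def is_non_monotonic_day(readings):
    """Return True if one day's readings form a non-monotonic point.

    readings: cumulative meter readings of one day, in time order
    (an assumed input layout, not specified by the source).
    """
    x1, x2 = readings[0], readings[-1]      # first and last reading of the day
    y1, y2 = min(readings), max(readings)   # minimum and maximum reading of the day
    # Rule from the text: x2 < x1, or y1 < x1, or y2 > x2.
    return x2 < x1 or y1 < x1 or y2 > x2


def count_non_monotonic_days(days):
    """Count m, the number of non-monotonic points among the daily sequences."""
    return sum(1 for readings in days if is_non_monotonic_day(readings))
```

For instance, a day with readings 10, 12, 15, 20 is monotonic, while a day with readings 20, 18, 25, 30 is flagged because its minimum lies below its first reading.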
The total number of non-monotonic points is counted as m, and the proportion of non-monotonic points P = m/n is calculated. A threshold T1 is set; if P < T1, the user's gas equipment is listed as equipment to be inspected. If P > T1, the number of non-monotonic points is so large that the cause may be a fault in the gas equipment itself, and a worker likewise needs to go on site to check the gas equipment.
Thresholds T2 and T3 are set, and the negative anomaly size G of the equipment to be inspected is calculated; G is the height of the red column in fig. 1. If G < T2, the user is judged to be a first-level abnormal user; if T2 ≤ G ≤ T3, a second-level abnormal user; if G > T3, a third-level abnormal user.
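The screening by P and the grading by G can be sketched as below. Two points are assumptions added for the sketch and are not stated in the source: a device with no non-monotonic points at all (m = 0) is treated as normal, and the boundary case P = T1 is handled like P > T1. The function name, the return labels, and any threshold values passed in are hypothetical.

```python
def grade_negative_anomaly(m, n, g, t1, t2, t3):
    """Grade one device from m non-monotonic days out of n and anomaly size g."""
    if m == 0:
        # Assumption: no non-monotonic point means nothing to grade.
        return "normal"
    p = m / n  # proportion of non-monotonic points, P = m / n
    if p >= t1:
        # Too many non-monotonic points: possibly a fault of the device itself.
        return "device check"
    # P < T1: device is to be inspected; grade by negative anomaly size G.
    if g < t2:
        return "level 1"
    if g <= t3:
        return "level 2"
    return "level 3"
```

With illustrative thresholds T1 = 0.1, T2 = 10 and T3 = 50, a device with 3 non-monotonic days out of 100 and G = 60 would be graded as a third-level abnormal user.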
(II) zero anomaly user
Zero gas consumption is highly ambiguous: the user may be stealing gas through a bypass pipe around the metering equipment, or may genuinely not be using gas. Referring to fig. 3, a zero anomaly caused by gas theft has two characteristics: first, a column with no height appears on the K-line chart; second, the readings at the two ends of the zero gas consumption period are close. Here the K-line chart is a bar chart with the date as the abscissa and the gas consumption as the ordinate; the longer gas is used, the thicker the corresponding column.
When judging whether a user has a zero gas consumption anomaly, the reliability of the data from the user's gas equipment is judged first: if the proportion of non-monotonic points is larger than the set threshold, the equipment is not considered for this anomaly. The days of zero gas consumption within the statistical period are then screened out; for each piece of gas equipment, the span from the start to the end of continuous zero gas consumption is taken as one segment, the number of zero gas consumption segments is counted, and segments containing holidays are removed. A Poisson distribution model is established from the zero gas consumption segment counts of all gas equipment, and its cumulative probability is calculated (see fig. 4). A probability threshold is set, and a user whose cumulative probability is greater than the probability threshold is judged to be a zero gas consumption abnormal user.
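The segment counting and the Poisson test can be sketched as follows. The source does not say how the Poisson model is fitted, so this sketch assumes the rate is estimated as the mean segment count over all devices and that the cumulative probability is P(X ≤ k) for a device with k segments. Holiday removal is omitted, and all function names are hypothetical.

```python
import math


def zero_segments(daily_usage):
    """Count runs of consecutive zero-usage days in a list of daily usage values."""
    count, in_run = 0, False
    for usage in daily_usage:
        if usage == 0 and not in_run:
            count += 1  # a new zero-usage segment starts here
        in_run = (usage == 0)
    return count


def poisson_cdf(k, lam):
    """Cumulative probability P(X <= k) for X ~ Poisson(lam)."""
    return sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k + 1))


def flag_zero_anomalies(segment_counts, prob_threshold):
    """Return the devices whose cumulative probability exceeds the threshold.

    segment_counts: dict mapping a device id to its zero-usage segment count.
    """
    # Assumption: Poisson rate estimated as the mean count over all devices.
    lam = sum(segment_counts.values()) / len(segment_counts)
    return {dev for dev, k in segment_counts.items()
            if poisson_cdf(k, lam) > prob_threshold}
```

Under these assumptions, a device with many more zero-usage segments than the population average receives a cumulative probability close to 1 and is flagged.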
(III) Small gas consumption user
Small gas consumption is highly ambiguous: the user may be stealing gas, for example by tapping the pipeline to divert gas or by damaging the metering equipment, or the user's gas usage pattern may genuinely have changed. Because the ambiguity is strong and an exact judgment is not possible, the following strategy is adopted: first assume that such users are stealing gas, then locate the users whose gas consumption curves are complicated in shape and change sharply over time.
The user's gas consumption data are recorded as pairs of date and cumulative gas consumption, such as (2020-01, 10), (2020-01-02, 25), (2020-01-03, 20), and so on. Counting from the start date, the pairs are converted into day number and cumulative gas consumption, such as (1, 10), (2, 25), (3, 20), and so on. The correlation coefficient $r$ is then calculated:
$$r = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2}\,\sqrt{\sum_{i=1}^{n}(Y_i-\bar{Y})^2}}$$
where $X$ is the converted day number, $\bar{X}$ is the mean of $X$, $Y$ is the cumulative gas consumption, $\bar{Y}$ is the mean of $Y$, $n$ is the total number of days, and $i$ denotes the i-th day.
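The date conversion and the correlation step can be sketched as below, assuming the correlation coefficient is the Pearson coefficient between day number and cumulative gas consumption, which is what the symbol definitions suggest. The helper names `to_day_numbers` and `pearson` and the sample dates are illustrative only.

```python
import math
from datetime import date


def to_day_numbers(records):
    """Convert (date, cumulative usage) pairs into (day number, cumulative usage).

    Day numbers count from the first record's date, which becomes day 1.
    """
    start = records[0][0]
    return [((d - start).days + 1, usage) for d, usage in records]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Undefined (division by zero) when either sequence is constant.
    return cov / math.sqrt(var_x * var_y)
```

A coefficient close to 1 corresponds to the nearly linear cumulative curve of a normal user; lower values mark the users passed on to the fractal dimension step.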
A coefficient threshold is set. For users whose correlation coefficient is less than the coefficient threshold, fractal dimension processing is performed on the gas consumption data to obtain an n-day gas consumption curve, such as the curve shown in fig. 5, whose abscissa is the number of days and whose ordinate is the cumulative gas consumption.
Calculate the distance $d_{11}$ between day 1 and day 2, the distance $d_{12}$ between day 2 and day 3, and so on, to obtain $d_1 = d_{11} + d_{12} + \ldots$;
Calculate the distance $d_{21}$ between day 1 and day 3, the distance $d_{22}$ between day 3 and day 5, and so on, to obtain $d_2 = d_{21} + d_{22} + \ldots$;
Calculate the distance $d_{31}$ between day 1 and day 4, the distance $d_{32}$ between day 4 and day 7, and so on, to obtain $d_3 = d_{31} + d_{32} + \ldots$;
Continue until the distance between day 1 and day n is calculated, giving $d_{n-1}$.
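The stride-based distance sums can be sketched as follows. The source does not define "distance", so this sketch assumes the Euclidean distance between two points (day, cumulative gas consumption) on the curve. It also assumes that days left over at the end of a stride are ignored. The function name is hypothetical.

```python
import math


def fractal_distances(cum_usage):
    """Return [d_1, ..., d_(n-1)] for a curve of n daily cumulative values.

    cum_usage[i] is the cumulative gas consumption on day i + 1.
    d_k sums the distances between curve points taken k days apart,
    starting from day 1.
    """
    n = len(cum_usage)
    distances = []
    for k in range(1, n):                # stride of k days
        total = 0.0
        for a in range(0, n - k, k):     # segment from day a + 1 to day a + 1 + k
            b = a + k
            # Assumption: Euclidean distance in the (day, cumulative usage) plane.
            total += math.hypot(k, cum_usage[b] - cum_usage[a])
        distances.append(total)
    return distances
```

For a four-day curve the function returns three values: d_1 from three one-day steps, d_2 from the single step day 1 to day 3, and d_3 from the step day 1 to day 4.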
Calculate the skewness Skew of the sequence $d = \{d_1, d_2, \ldots, d_{n-1}\}$ from the mean and the standard deviation of the sequence d, where n is the total number of days.
Calculate the Loss of the sequence d from max(d), the maximum value in the sequence d, and min(d), the minimum value in the sequence d.
Set a skewness threshold and a Loss threshold; if the calculated skewness Skew is greater than the skewness threshold, or the calculated Loss is greater than the Loss threshold, the user is judged to be an abnormal small gas consumption user.
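The final decision can be sketched as below. The skewness here is assumed to be the standard third standardized moment with the population standard deviation, which may differ in normalization from the patent's formula. The Loss formula is not recoverable from the source, so the sketch takes the Loss value as an input instead of computing it. Function names are hypothetical.

```python
import math


def skewness(d):
    """Third standardized moment of the sequence d (population form, an assumption)."""
    m = len(d)
    mean = sum(d) / m
    std = math.sqrt(sum((v - mean) ** 2 for v in d) / m)
    # Undefined (division by zero) when all values of d are equal.
    return sum(((v - mean) / std) ** 3 for v in d) / m


def is_small_usage_anomaly(skew, loss, skew_threshold, loss_threshold):
    """Apply the rule from the text: abnormal if either value exceeds its threshold."""
    return skew > skew_threshold or loss > loss_threshold
```

A sequence with a long right tail, such as 1, 1, 1, 5, gives a positive skewness of about 1.15 under this definition.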
(IV) users with inconsistent recent laws
Hourly gas consumption data are acquired in units of days and arranged into a two-dimensional matrix whose abscissa is the hour and whose ordinate is the date; each element represents the gas consumption of one hour. For example, a 90-day statistical period corresponds to a 90 × 24 matrix, and Mat(i, j) represents the gas consumption in the j-th hour of the i-th day.
The two-dimensional matrix Mat(i, j) is normalized and quantized to integers between 0 and 255 to form a gray image Gray(i, j).
The number of pixels h(t), t = 0, ..., 255, at each gray level t in the gray image Gray(i, j) is counted. For example, h(0) = 9 means that 9 pixels in Gray(i, j) have gray value 0; h(0), h(1), ..., h(255) are calculated in turn.
The number of occurrences of gray level t is converted into a probability distribution p(t):
$$p(t) = \frac{h(t)}{\text{image width} \times \text{image height}}$$
The cumulative gray-level probability distribution hp(t) is calculated:
$$hp(t) = \sum_{k=0}^{t} p(k)$$
where t denotes a gray level, $t \in [0, 255]$, and k denotes the k-th gray level.
The contrast of the gray image Gray(i, j) is enhanced:
Gray(i,j) = 255 * hp(Gray(i,j))
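The gray image construction and contrast enhancement can be sketched as follows. The normalization formula is missing from the source, so min-max scaling over the whole matrix is assumed here, and rounding to the nearest integer is likewise an assumption. The function names `to_gray` and `equalize` are hypothetical.

```python
def to_gray(mat):
    """Quantize a usage matrix (list of rows) to integers in 0..255.

    Assumption: min-max normalization over the whole matrix.
    """
    lo = min(min(row) for row in mat)
    hi = max(max(row) for row in mat)
    span = (hi - lo) or 1  # avoid division by zero for a constant matrix
    return [[round(255 * (v - lo) / span) for v in row] for row in mat]


def equalize(gray):
    """Enhance contrast with Gray(i, j) = 255 * hp(Gray(i, j))."""
    h = [0] * 256                      # h(t): number of pixels at gray level t
    for row in gray:
        for g in row:
            h[g] += 1
    total = len(gray) * len(gray[0])   # image width * image height
    hp, acc = [], 0.0
    for count in h:                    # hp(t): cumulative sum of p(k) = h(k) / total
        acc += count / total
        hp.append(acc)
    return [[round(255 * hp[g]) for g in row] for row in gray]
```

For a 90-day period the input would be a list of 90 rows with 24 values each, matching the Mat(i, j) layout described above.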
K-means clustering is performed on the rows of the gray image Gray(i, j), with the number of classes set to 2. Taking the 90-day statistical period above as an example, the 90 rows of Gray(i, j) are clustered, and the Euclidean distance between two rows of data is:
$$\mathrm{dist}(s_1, s_2) = \sqrt{\sum_{j}\bigl(\mathrm{Gray}(s_1, j) - \mathrm{Gray}(s_2, j)\bigr)^2}$$
where s1 and s2 denote two of the 90 days and j runs over the 24 hours of a day. The k-means clustering yields two cluster centers C1 and C2; it is then judged whether C1 and C2 fall in the long-term range (the first 70 days) and the near-term range (the last 20 days) respectively, and if so, the user is judged to be a user inconsistent with recent rules.
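The row clustering and the long-term versus near-term check can be sketched as below. Several choices are assumptions made for the sketch: the two centers are seeded with the first and last day, and a cluster is said to "fall in" a range when at least a given share of that range's days (the hypothetical `purity` parameter) belongs to it. The source does not spell out either point.

```python
def dist2(a, b):
    """Squared Euclidean distance between two rows (days)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_means(rows, max_iter=100):
    """Plain 2-means on image rows; returns (labels, centers).

    Assumption: centers are seeded with the first and the last row, so
    label 0 starts as the long-term pattern and label 1 as the recent one.
    """
    centers = [list(rows[0]), list(rows[-1])]
    labels = None
    for _ in range(max_iter):
        new_labels = [0 if dist2(r, centers[0]) <= dist2(r, centers[1]) else 1
                      for r in rows]
        if new_labels == labels:
            break                      # assignments are stable
        labels = new_labels
        for c in (0, 1):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:                # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def recent_pattern_changed(labels, recent_days, purity=0.9):
    """True if cluster 0 dominates the long-term days and cluster 1 the recent days."""
    split = len(labels) - recent_days
    long_term, recent = labels[:split], labels[split:]
    long_share = long_term.count(0) / len(long_term)
    recent_share = recent.count(1) / len(recent)
    return long_share >= purity and recent_share >= purity
```

With a 90-day image and `recent_days=20`, the check mirrors the 70-day and 20-day split used in the example above.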
The gray image Gray(i, j) shown in fig. 6 is generated; the abscissa represents the 24 hours of a day, the ordinate represents the date, and the lighter the color, the larger the gas consumption. As fig. 6 shows, from January 1 to about November the periods of greatest gas usage are concentrated at about 11:00 to 13:00 and about 15:00 to 17:00, while the particularly dark periods indicate little or no gas usage. After November, the period of greatest gas usage is concentrated at about 4:00 to 6:00 in the morning, with little or no gas usage at other times. This differs markedly from the long-term gas usage habit and therefore indicates a recent anomaly.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (5)

1. A data mining method for gas stealing behavior, characterized in that the method comprises the following steps:
acquiring gas consumption data of a user, analyzing the gas consumption data, and judging whether the user is an abnormal user;
the judgment of abnormal users covers:
negative anomaly users: the user's gas consumption data are formed into a sequence in units of days, and if the number of non-monotonic points in the sequence is greater than a set threshold, the user is judged to be an abnormal user;
zero anomaly users: the days of continuous zero gas consumption are screened out from the user's gas consumption data, the Poisson cumulative probability for the user is calculated, and if the cumulative probability is greater than a set probability threshold, the user is judged to be an abnormal user;
small gas consumption users: fractal dimension processing is performed on the user's gas consumption data by number of days, the fractal distance sequence over all days is computed, and the skewness and loss of the fractal distance sequence are calculated to judge whether the user is an abnormal user;
the specific judging step of the small gas consumption user comprises the following steps:
recording the user's gas consumption data as pairs of date and cumulative gas consumption, converting the dates into day numbers counted from the start date so that each pair becomes a day number and a cumulative gas consumption, and calculating the correlation coefficient $r$:
$$r = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2}\,\sqrt{\sum_{i=1}^{n}(Y_i-\bar{Y})^2}}$$
where $X$ is the converted day number, $\bar{X}$ is the mean of $X$, $Y$ is the cumulative gas consumption, $\bar{Y}$ is the mean of $Y$, $n$ is the total number of days, and $i$ denotes the i-th day;
setting a coefficient threshold, and for users whose correlation coefficient is less than the coefficient threshold, performing fractal dimension processing on the gas consumption data to obtain an n-day gas consumption curve, whose abscissa is the number of days and whose ordinate is the cumulative gas consumption;
calculating the distance $d_{11}$ between day 1 and day 2, the distance $d_{12}$ between day 2 and day 3, and so on, to obtain $d_1 = d_{11} + d_{12} + \ldots$;
calculating the distance $d_{21}$ between day 1 and day 3, the distance $d_{22}$ between day 3 and day 5, and so on, to obtain $d_2 = d_{21} + d_{22} + \ldots$;
calculating the distance $d_{31}$ between day 1 and day 4, the distance $d_{32}$ between day 4 and day 7, and so on, to obtain $d_3 = d_{31} + d_{32} + \ldots$;
and continuing until the distance between day 1 and day n is calculated, giving $d_{n-1}$;
calculating the skewness Skew of the sequence $d = \{d_1, d_2, \ldots, d_{n-1}\}$ from the mean and the standard deviation of the sequence d, where n is the total number of days;
calculating the Loss of the sequence d from max(d), the maximum value in the sequence d, and min(d), the minimum value in the sequence d;
setting a skewness threshold and a Loss threshold, and if the calculated skewness Skew is greater than the skewness threshold, or the calculated Loss is greater than the Loss threshold, judging the user to be an abnormal small gas consumption user;
users inconsistent with recent rules: a gray-scale image is constructed from the user's gas consumption data by hour and day, and abnormal users are found from the gray-scale image.
2. The data mining method for gas theft according to claim 1, wherein: the specific judging step of the negative abnormal user comprises the following steps:
taking the gas consumption data of the user by day, and acquiring the readings from the first reading to the last reading of each day to form a sequence, n sequences in total; arranging the n sequences in chronological order, counting the number m of non-monotonic points among the n sequences, and calculating the proportion of non-monotonic points P = m/n;
setting a threshold T1, and if P < T1, listing the gas equipment of the user as equipment to be inspected;
setting thresholds T2 and T3, and calculating the negative anomaly magnitude G of the equipment to be inspected; if G < T2, judging the user to be a first-level abnormal user; if T2 ≤ G ≤ T3, judging the user to be a second-level abnormal user; if T3 < G, judging the user to be a third-level abnormal user.
3. The data mining method for gas theft according to claim 2, wherein: the step of arranging n sequences in time sequence and counting the number m of non-monotonic points in the n sequences comprises the following steps:
extracting a first reading x1 and a last reading x2 of the same day, and extracting a minimum reading y1 and a maximum reading y2 of the same day;
if the minimum reading y1 is smaller than the first reading x1, or the maximum reading y2 is larger than the last reading x2, judging that the sequence of that day is a non-monotonic point;
or, if the last reading x2 of that day is smaller than the first reading x1, likewise judging that the sequence of that day is a non-monotonic point.
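The non-monotonic-point test of claims 2 and 3 can be sketched in Python as follows. This is a minimal illustration with hypothetical function names; it assumes each day is supplied as a non-empty list of meter readings in time order, and it does not compute the negative anomaly magnitude G, whose definition is not given in this part of the text.

```python
def is_non_monotonic_day(readings):
    """Apply the three conditions of claim 3 to one day's readings."""
    x1, x2 = readings[0], readings[-1]      # first and last reading of the day
    y1, y2 = min(readings), max(readings)   # minimum and maximum of the day
    return y1 < x1 or y2 > x2 or x2 < x1


def non_monotonic_ratio(daily_readings):
    """Return P = m / n, where m counts the non-monotonic days among n days."""
    n = len(daily_readings)
    m = sum(1 for day in daily_readings if is_non_monotonic_day(day))
    return m / n


if __name__ == "__main__":
    # Hypothetical meter readings for four days.
    days = [
        [10, 11, 12],      # monotonic
        [12, 11, 13],      # minimum below the first reading
        [13, 14, 13.5],    # maximum above the last reading
        [14, 15, 16],      # monotonic
    ]
    print(non_monotonic_ratio(days))
```

Note that the third condition (last reading below first reading) is already implied by the second, since the maximum is never below the first reading; it is kept here only to mirror the wording of the claim.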
4. The data mining method for gas theft according to claim 1, wherein: the specific judging step of the zero-abnormity user comprises the following steps:
screening the number of days of continuous zero gas consumption from the gas consumption data of the users, taking the starting time to the ending time of the zero gas consumption as a section, and counting the number of sections of the zero gas consumption of each user gas equipment;
establishing a Poisson distribution model from the numbers of zero-gas-consumption segments of all user gas equipment, calculating the cumulative probability under the Poisson distribution model, and setting a probability threshold; if the cumulative probability is greater than the probability threshold, judging the user to be an abnormal user.
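A minimal Python sketch of the zero-consumption screening in claim 4 is given below. It rests on assumptions that the claim does not spell out: the Poisson rate is estimated as the mean segment count over all users, the cumulative probability is taken as P(X ≤ k) for a user with k segments, and the function names and sample data are hypothetical.

```python
import math


def poisson_cdf(k, lam):
    """Cumulative Poisson probability P(X <= k) for rate lam."""
    return sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k + 1))


def flag_zero_usage_users(segment_counts, prob_threshold):
    """Map each user to True when the cumulative probability exceeds the threshold.

    segment_counts: dict of user id -> number of zero-consumption segments.
    Assumption: the Poisson rate is the mean segment count across all users.
    """
    lam = sum(segment_counts.values()) / len(segment_counts)
    return {
        user: poisson_cdf(k, lam) > prob_threshold
        for user, k in segment_counts.items()
    }


if __name__ == "__main__":
    counts = {"a": 1, "b": 2, "c": 1, "d": 12}  # hypothetical segment counts
    print(flag_zero_usage_users(counts, prob_threshold=0.99))
```

Under these assumptions, a user is flagged when the number of zero-consumption segments is far above what the fleet-wide rate would make likely.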
5. The data mining method for gas theft according to claim 1, wherein: the specific judging step of the user with inconsistent recent rules comprises the following steps:
acquiring the hourly gas consumption data of the user by day, and constructing a two-dimensional matrix Mat(i, j) from the hourly gas consumption data, wherein Mat(i, j) represents the gas consumption of the j-th hour on the i-th day, the abscissa of the two-dimensional matrix is the hour, the ordinate is the date, and each element of the two-dimensional matrix represents the gas consumption of one hour;
normalizing the two-dimensional matrix Mat(i, j) and quantizing it to integers between 0 and 255 to obtain a grayscale image Gray(i, j):
counting the number h(t) of pixels at each gray level t in the grayscale image Gray(i, j);
converting the number of occurrences of the gray level t into a probability distribution p(t):
$$p(t) = \frac{h(t)}{\text{image width} \times \text{image height}}$$
calculating the cumulative gray-level probability distribution hp(t):
$$hp(t) = \sum_{k=0}^{t} p(k)$$
wherein t represents a gray level, t ∈ [0, 255], and k represents the k-th gray level;
enhancing the contrast of the grayscale image Gray(i, j):
Gray(i,j) = 255 * hp(Gray(i,j))
performing k-means clustering on the rows of the grayscale image Gray(i, j), with the number of classes set to 2, wherein the Euclidean distance between the data of two days is:
$$D(s_1, s_2) = \sqrt{\sum_{j}\bigl(\mathrm{Gray}(s_1, j) - \mathrm{Gray}(s_2, j)\bigr)^{2}}$$
wherein s1 and s2 represent any two days;
obtaining two cluster centers C1 and C2 from the k-means clustering, setting a long-term range and a near-term range, and judging whether C1 and C2 fall in the long-term range and the near-term range respectively; if so, judging the user to be a user with inconsistent recent rules.
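The image-based steps of claim 5 can be sketched in plain Python as below. This is an illustration under stated assumptions rather than the patented implementation: min-max scaling is assumed for the normalization step (its formula is not reproduced in this text), the k-means initialization (first and last day as starting centers) is a hypothetical choice made to keep the example deterministic, and all function names are invented for the sketch. The final test of whether C1 and C2 fall in the long-term and near-term ranges is not implemented, because the claim does not define those ranges precisely enough here.

```python
import math


def to_gray(mat):
    """Min-max scale a days-by-hours matrix to integers in 0..255 (assumed form)."""
    flat = [v for row in mat for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant matrix
    return [[round(255 * (v - lo) / span) for v in row] for row in mat]


def equalize(gray):
    """Histogram equalization: Gray(i, j) = 255 * hp(Gray(i, j))."""
    total = len(gray) * len(gray[0])  # image width * image height
    hist = [0] * 256
    for row in gray:
        for t in row:
            hist[t] += 1
    hp, running = [], 0
    for count in hist:
        running += count
        hp.append(running / total)  # cumulative probability up to level t
    return [[round(255 * hp[t]) for t in row] for row in gray]


def euclid(a, b):
    """Euclidean distance between the data of two days (two image rows)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans_rows(rows, max_iter=100):
    """Two-class k-means over image rows; returns (labels, centers)."""
    centers = [list(rows[0]), list(rows[-1])]  # hypothetical initialization
    labels = None
    for _ in range(max_iter):
        new_labels = [
            0 if euclid(row, centers[0]) <= euclid(row, centers[1]) else 1
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in (0, 1):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


if __name__ == "__main__":
    # Hypothetical data: 6 days x 4 hours (real data would have 24 hours per day).
    mat = [[0, 5, 5, 0]] * 3 + [[0, 1, 1, 0]] * 3
    labels, centers = kmeans_rows(equalize(to_gray(mat)))
    print(labels)
```

In the hypothetical data the earlier days and the recent days have different hourly profiles, so the two clusters separate along the time axis, which is the situation the claim is looking for.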
CN202211621446.7A 2022-12-16 2022-12-16 Data mining method for gas stealing behavior Active CN116150239B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211621446.7A CN116150239B (en) 2022-12-16 2022-12-16 Data mining method for gas stealing behavior

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211621446.7A CN116150239B (en) 2022-12-16 2022-12-16 Data mining method for gas stealing behavior

Publications (2)

Publication Number Publication Date
CN116150239A CN116150239A (en) 2023-05-23
CN116150239B true CN116150239B (en) 2023-09-22

Family

ID=86338077

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211621446.7A Active CN116150239B (en) 2022-12-16 2022-12-16 Data mining method for gas stealing behavior

Country Status (1)

Country Link
CN (1) CN116150239B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104391202A (en) * 2014-11-27 2015-03-04 国家电网公司 Abnormal electricity consumption judging method based on analysis of abnormal electric quantity
CN107742127A (en) * 2017-10-19 2018-02-27 国网辽宁省电力有限公司 A kind of improved anti-electricity-theft intelligent early-warning system and method
CN108256752A (en) * 2018-01-02 2018-07-06 北京市燃气集团有限责任公司 A kind of analysis method of gas user gas behavior
CN110458230A (en) * 2019-08-12 2019-11-15 江苏方天电力技术有限公司 A kind of distribution transforming based on the fusion of more criterions is with adopting data exception discriminating method
CN111738364A (en) * 2020-08-05 2020-10-02 国网江西省电力有限公司供电服务管理中心 Electricity stealing detection method based on combination of user load and electricity consumption parameter
CN113343056A (en) * 2021-05-21 2021-09-03 北京市燃气集团有限责任公司 Method and device for detecting abnormal gas consumption of user
CN113407797A (en) * 2021-08-18 2021-09-17 成都千嘉科技有限公司 Data mining method for gas stealing behavior by utilizing fractal calculation
CN114757270A (en) * 2022-03-30 2022-07-15 重庆合众慧燃科技股份有限公司 NB-IoT (NB-IoT) based gas intelligent equipment anomaly analysis method system and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9595006B2 (en) * 2013-06-04 2017-03-14 International Business Machines Corporation Detecting electricity theft via meter tampering using statistical methods
GB2561916B (en) * 2017-04-28 2021-09-22 Gb Gas Holdings Ltd Method and system for detecting anomalies in energy consumption

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
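Data-driven detection of abnormal gas usage by catering gas users (基于数据驱动的餐饮燃气用户异常用气检测); 杨筱都; China Master's Theses Full-text Database (《中国优秀硕士学位论文全文数据库》); full text *
Gas load anomaly detection based on cluster analysis and user profiling (基于聚类分析和用户画像的用气负荷异常检测); 胡殿涛; Gas & Heat (《煤气与热力》); Vol. 42 (No. 4); full text *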

Also Published As

Publication number Publication date
CN116150239A (en) 2023-05-23

Similar Documents

Publication Publication Date Title
CN111641519B (en) Abnormal root cause positioning method, device and storage medium
CN111475804B (en) Alarm prediction method and system
CN109583680B (en) Power stealing identification method based on support vector machine
WO2016029570A1 (en) Intelligent alert analysis method for power grid scheduling
JP5753286B1 (en) Information processing apparatus, diagnostic method, and program
CN111173565B (en) Mine monitoring data abnormal fluctuation early warning method and device
CN112257755B (en) Method and device for analyzing running state of spacecraft
CN108319649B (en) System and method for improving quality of water regime and water-diversion data
Zharkova et al. Solar feature catalogues in EGSO
CN107391515A (en) Power system index analysis method based on Association Rule Analysis
CN112257013A (en) Electricity stealing user identification and positioning method based on dynamic time warping algorithm for high-loss distribution area
CN115018343A (en) System and method for recognizing and processing abnormity of mass mine gas monitoring data
KR20190082715A (en) Data classification method based on correlation, and a computer-readable storege medium having program to perform the same
CN116150239B (en) Data mining method for gas stealing behavior
CN115457403A (en) Intelligent crop identification method based on multi-type remote sensing images
CN110164102B (en) Photovoltaic power station string abnormity alarm method and alarm device
Petitjean et al. Discovering significant evolution patterns from satellite image time series
CN112966017B (en) Abnormal subsequence detection method for indefinite length in time sequence
CN112888008B (en) Base station abnormality detection method, device, equipment and storage medium
KR101997580B1 (en) Data classification method based on correlation, and a computer-readable storege medium having program to perform the same
CN113343056A (en) Method and device for detecting abnormal gas consumption of user
CN116645329A (en) Abnormality monitoring method for instrument and meter cabinet
CN108319573B (en) Energy statistical data based abnormity judgment and restoration method
JP3082548B2 (en) Equipment management system
US20240160620A1 (en) Time-series data processing method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant