CN115330202B - Low-voltage distribution area electricity larceny analysis method based on data driving - Google Patents

Low-voltage distribution area electricity larceny analysis method based on data driving

Info

Publication number
CN115330202B
Authority
CN
China
Prior art keywords
line loss
data
electricity
electric quantity
zone
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210976257.5A
Other languages
Chinese (zh)
Other versions
CN115330202A (en
Inventor
吕家慧
谭伟
孙敬科
迟子悦
郑和稳
郑一鹏
孔健沣
刘海峰
张晓峰
黄良栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yantai Dongfang Wisdom Electric Co Ltd
Original Assignee
Yantai Dongfang Wisdom Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yantai Dongfang Wisdom Electric Co Ltd filed Critical Yantai Dongfang Wisdom Electric Co Ltd
Priority to CN202210976257.5A priority Critical patent/CN115330202B/en
Publication of CN115330202A publication Critical patent/CN115330202A/en
Application granted granted Critical
Publication of CN115330202B publication Critical patent/CN115330202B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/02Knowledge representation; Symbolic representation
    • G06N5/022Knowledge engineering; Knowledge acquisition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06Electricity, gas or water supply
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Abstract

The invention discloses a data-driven method for analyzing electricity theft in a low-voltage distribution area, comprising the following steps: calculating the data features of the distribution area to be analyzed; identifying distribution areas similar to the area to be analyzed, constructing a line loss identification model, and decomposing the line loss features of the area to be analyzed; constructing a knowledge base and defining its logical deduction rules; and combining the line loss features and data features of the users in the area to be analyzed into a group of sample vectors, then performing electricity theft analysis with the knowledge base. Because the method is data-driven, it does not need a large number of labeled electricity theft samples for training, which widens its practical value and makes it suitable for electricity theft analysis in low-voltage distribution areas with complex line loss conditions.

Description

Low-voltage distribution area electricity larceny analysis method based on data driving
Technical Field
The invention relates to the technical field of electric power monitoring, and in particular to a data-driven method for analyzing electricity theft in a low-voltage distribution area.
Background
Three approaches to electricity theft analysis in distribution areas are currently common.
First, traditional machine-learning methods, usually a multi-class classification model or a logistic regression model, are trained on collected electricity theft samples. These are classical machine learning algorithms with good fitting performance and running speed, but they need a label for every sample, that is, whether the user is stealing electricity. Electricity theft information is sensitive, and the requirements on data quantity and quality are high.
Second, methods based on constrained least squares need no sample labels. They map the theft pattern onto the coefficients of a regression algorithm, constrain and shrink the coefficients to a specified range, and judge whether a user is stealing electricity by analyzing the coefficients; Lasso regression and ridge regression are common examples. These methods have weak applicability: they are unsuitable when the correlation between user energy and the area's line loss is weak, or when the line loss fluctuates too much relative to its mean over the calculation period (for example, a line tapped ahead of the meter, or intermittent theft).
Third, branch monitoring units can be installed to collect the line energy, which is compared with the energy recorded by the end metering devices (smart switches and electricity meters). This is a hardware-based judgment that is accurate and has a short calculation period, but it involves manual site selection, installation, and testing; hardware and labor costs are high, so it is not suitable for wide deployment in low-voltage distribution areas.
Disclosure of Invention
The invention provides a data-driven method for analyzing electricity theft in a low-voltage distribution area. Its aim is to overcome the shortcomings of the prior art and provide a widely applicable electricity theft analysis method that needs neither a large number of labeled samples nor additional hardware cost.
The technical scheme of the invention is as follows:
A data-driven method for analyzing electricity theft in a low-voltage distribution area comprises the following steps:
S1: calculating the data features of the distribution area to be analyzed, Z*;
S2: identifying the areas Z_i similar to the area to be analyzed Z*, constructing a line loss identification model, and decomposing the line loss features of the area to be analyzed;
S3: constructing a knowledge base and defining the logical deduction rules of the knowledge base;
S4: combining the line loss features and data features of the users in the area to be analyzed Z* into a group of sample vectors, and performing electricity theft analysis with the knowledge base.
Further, the method for constructing the line loss identification model in step S2 includes:
S21: the line loss features of all types of electricity theft are abstracted into the following three types:
I: the energy consumption of the electricity-stealing user is correlated with the line loss energy;
J: the energy consumption of the electricity-stealing user is not correlated with the line loss energy;
K: the line loss energy is close to a constant value;
the line loss sequences are expressed as follows:
I = {a_1(X_1)_1 + … + a_n(X_n)_1, …, a_1(X_1)_m + … + a_n(X_n)_m}
J = {b_1, b_2, …, b_m}
Figure BDA0003798538540000021
where (X_n)_m is the meter energy of the area to be analyzed Z*, I is the sequence formed by summing the meter energies multiplied by their coefficients, J is the independent energy consumption sequence, and K is the constant energy sequence;
S22: constructing the line loss model of the distribution area:
Figure BDA0003798538540000031
Figure BDA0003798538540000032
a_j ≥ 0, b_j ≥ 0, c ≥ 0
where η_1, η_2, and η_3 are hyperparameters that constrain the solution search space, m is the energy data moment, D is the line loss energy of the area to be analyzed Z*, and d_LIP is the moving multi-line position distance.
J = {b_1, b_2, …, b_m} is split into subsequences J_sub, and the moving multi-line position distance d_LIP is calculated between each J_sub and the energy curve sequences of the similar areas Z_i, using the following formula:
Figure BDA0003798538540000033
Figure BDA0003798538540000034
wherein t is p Is J sub With zone Z i Intersection point of medium electric quantity curve sequence, area p Is J sub With zone Z i Polygonal area surrounded by medium electric quantity curve.
In turn by Z i The corresponding point of each electric quantity value on the middle electric quantity curve sequence is taken as a reference, and a certain J is slid in the vertical and horizontal directions sub Sequence, and calculate J sub And Z is i Is used to record the minimum distance matched during the sliding process.
All J' s sub And summing the matched minimum distances to be used as a constraint term of the line loss model.
Further, the method for decomposing the line loss features of the area to be analyzed in step S2 is as follows: a meta-heuristic algorithm is used to search the solution space of minLoss(a, b, c), and when the algorithm converges, the values of the I, J, and K line loss features are evaluated to obtain the line loss features of the distribution area.
Further, the method for constructing the knowledge base in step S3 is as follows: the data features of known and suspected electricity-stealing users in the system are compiled, knowledge is created by combining them with power business knowledge, and relations from data features and line loss features to electricity theft events are established to form the knowledge base. Line loss features are associated in the knowledge base according to the following principles:
A. when the line loss energy and the user energy are produced by the same group of electrical equipment in the same period, the line loss feature is I;
B. when the line loss energy is produced by completely independent electrical equipment, the line loss feature is J;
C. when the line loss energy can be controlled or intervened in manually, the line loss feature is K;
if an electricity theft event satisfies more than one of A, B, and C, multiple corresponding pieces of knowledge are formed for that event.
When judging a user's electricity theft sample, the logical deduction rules of the knowledge base yield one of the following three results:
first, the sample certainly belongs to the knowledge base set, that is, its line loss features can be formed from one piece of knowledge or a combination of several pieces of knowledge in the knowledge base, and the user is judged to be a suspected electricity-stealing user;
second, whether the sample belongs to the knowledge base set is described through approximate knowledge: the matching degree is calculated with the columns of the knowledge base as attributes, the knowledge combinations whose matching degree exceeds a preset value are selected, and the user is judged to be a suspected electricity-stealing user according to the dominant electricity theft event among those combinations;
third, the sample certainly does not belong to the knowledge base set, and electricity theft is ruled out for the user.
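The three outcomes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not give a formula for the matching degree, so the sketch assumes it is the share of knowledge base columns on which the sample agrees with an entry; the column names, event names, and the 0.7 threshold are hypothetical, and combinations of several pieces of knowledge are not modeled.

```python
from collections import Counter

def matching_degree(sample, entry):
    """Assumed definition: share of columns on which sample and entry agree."""
    keys = entry["features"].keys()
    return sum(sample.get(k) == entry["features"][k] for k in keys) / len(keys)

def judge(sample, knowledge_base, threshold=0.7):
    """Return (verdict, event) following the three-way rule."""
    degrees = [(matching_degree(sample, e), e["event"]) for e in knowledge_base]
    exact = [event for degree, event in degrees if degree == 1.0]
    if exact:                                  # result 1: certainly in the set
        return "suspected", exact[0]
    close = [event for degree, event in degrees if degree > threshold]
    if close:                                  # result 2: approximate match
        return "suspected", Counter(close).most_common(1)[0][0]
    return "excluded", None                    # result 3: certainly not in the set

# Hypothetical knowledge base with made-up columns and events.
kb = [
    {"features": {"loss_feature": "I", "voltage_loss": 1,
                  "cover_open": 0, "current_loss": 2},
     "event": "meter tampering"},
    {"features": {"loss_feature": "J", "voltage_loss": 0,
                  "cover_open": 0, "current_loss": 0},
     "event": "unmetered tap"},
]

exact_sample = dict(kb[0]["features"])
near_sample = {"loss_feature": "I", "voltage_loss": 1,
               "cover_open": 1, "current_loss": 2}
other_sample = {"loss_feature": "K"}

result_exact = judge(exact_sample, kb)
result_near = judge(near_sample, kb)
result_other = judge(other_sample, kb)
```

Here `near_sample` agrees with the first entry on three of four columns, so it falls into the second outcome, while `other_sample` matches nothing and is excluded.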
Further, the logical deduction rules of the knowledge base in step S3 also include superposition operators defined on the I, J, and K line loss features:
I ∩ J = J*
I ∩ K = I*
J ∩ K = J
where ∩ denotes that the two data sequences occur simultaneously; I ∩ J = J* indicates that when I and J are superposed, the energy consumption of the electricity-stealing user and the line loss energy form a new uncorrelated feature J*; I ∩ K = I* indicates that when I and K are superposed, the energy consumption of the electricity-stealing user and the line loss energy form a new correlated feature I*; and J ∩ K = J indicates that superposing J and K does not change the trend feature, so the result is J.
When judging a user's electricity theft sample, if its line loss features can be formed by combining several pieces of knowledge in the knowledge base, those pieces of knowledge are superposed as follows: the line loss features are combined according to the definition of the superposition operators, the data features are combined by letting the larger value dominate, and the corresponding electricity theft events are collected into an array set. The user is then judged to be a suspected electricity-stealing user exhibiting multiple theft behaviors.
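A minimal sketch of this superposition, assuming each piece of knowledge is stored as a dictionary with a line loss feature, numeric data features, and a list of events. The field names and sample events are hypothetical. The pair table follows the textual description of the operators (I with J gives J*, I with K gives I*, J with K gives J); returning the feature unchanged when both operands are equal is an assumption, since the text does not cover that case.

```python
# Pairwise results taken from the textual description of the operators.
SUPERPOSE = {
    frozenset(["I", "J"]): "J*",
    frozenset(["I", "K"]): "I*",
    frozenset(["J", "K"]): "J",
}

def superpose_loss(f1, f2):
    """Combine two basic line loss features (I, J or K)."""
    if f1 == f2:
        return f1  # assumption: identical features stay unchanged
    return SUPERPOSE[frozenset([f1, f2])]

def superpose_knowledge(e1, e2):
    """Combine two pieces of knowledge into one."""
    return {
        "loss_feature": superpose_loss(e1["loss_feature"], e2["loss_feature"]),
        # Data features: the larger value dominates.
        "data_features": {k: max(e1["data_features"][k], e2["data_features"][k])
                          for k in e1["data_features"]},
        # Events: collected into one array.
        "events": e1["events"] + e2["events"],
    }

e1 = {"loss_feature": "I",
      "data_features": {"voltage_loss": 1, "cover_open": 0},
      "events": ["meter tampering"]}
e2 = {"loss_feature": "K",
      "data_features": {"voltage_loss": 0, "cover_open": 2},
      "events": ["constant bypass load"]}

combined = superpose_knowledge(e1, e2)
```

The sketch only handles the basic features I, J, and K; superposing an already starred feature would need rules the text does not state.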
Further, in step S1, before the data features of the area to be analyzed Z* are calculated, data filling is first performed on Z* as follows:
S11: the data matrix of the area to be analyzed Z* has the form:
$$Z^* = \begin{bmatrix} (X_0)_1 & (X_0)_2 & \cdots & (X_0)_m \\ (X_1)_1 & (X_1)_2 & \cdots & (X_1)_m \\ \vdots & \vdots & \ddots & \vdots \\ (X_n)_1 & (X_n)_2 & \cdots & (X_n)_m \end{bmatrix}$$
where (X_n)_m denotes meter energy, n is the identifier of a meter in the distribution area, m is the energy data moment, and X_0 denotes the area's master meter;
along the vertical axis, (X_0)_i, (X_1)_i, …, (X_n)_i, if the number of missing values exceeds a set value, the energy data at that moment are deleted and the method proceeds to step S12; otherwise step S12 is executed directly.
S12: missing values in the matrix are filled by polynomial interpolation along the horizontal axis, (X_i)_1, (X_i)_2, …, (X_i)_m, as follows:
the energy data moments and energy values are treated as point coordinates on a two-dimensional plane, with the moment as the abscissa and the energy value as the ordinate. The energy data at k moments are written as (x_1, y_1), (x_2, y_2), …, (x_k, y_k), and a polynomial of degree k-1 is constructed by substituting the k known coordinates:
$$L(x) = \sum_{j=1}^{k} y_j L_j(x)$$
$$L_j(x) = \prod_{i=1,\, i \neq j}^{k} \frac{x - x_i}{x_j - x_i}$$
where L(x) is the interpolation polynomial, L_j(x) is the interpolation basis function, x is the data moment, y is the energy value, j indexes the summed terms, and i indexes the coordinates;
within the range [1, k], the (k+1)-th coordinate (x_{k+1}, y_{k+1}) is introduced, x_{k+1} is substituted into the polynomial of degree k-1, and the corresponding energy value y_{k+1} is obtained, which completes one polynomial interpolation fill.
S13: all missing values are filled in turn by the method of step S12, which completes the data update of the area to be analyzed Z*.
Further, when data filling is performed on the area to be analyzed Z* in step S1, k = 24.
Further, the method in step S2 for identifying the areas Z_i similar to the area to be analyzed Z* is as follows:
distribution area vectors are extracted in batches from the business system, and cosine similarity is used to calculate the distance between each area and the area to be analyzed Z*:
$$\cos(\alpha, \beta) = \frac{\sum_i \alpha_i \beta_i}{\sqrt{\sum_i \alpha_i^2}\,\sqrt{\sum_i \beta_i^2}}$$
where α is the vector of area Z*, β is a distribution area vector extracted in batches from the business system, and α_i and β_i are the components of α and β respectively;
the similarity between α and each area vector β in the business system is calculated in turn, and the L areas closest to the area to be analyzed Z* are selected as the similar areas Z_i.
Further, step S2 also includes performing data filling on the similar areas Z_i using the method applied to the area to be analyzed Z* in step S1.
Compared with the prior art, the invention has the following beneficial effects:
(1) The method analyzes electricity theft in a distribution area in a data-driven way. It calculates the data features of the area to be analyzed, constructs a line loss identification model to obtain the area's line loss features, and builds a knowledge base with its reasoning method. It does not need a large number of labeled electricity theft samples for training, which widens its practical value and makes it suitable for low-voltage distribution areas with complex line loss conditions;
(2) The method is the first to propose that line loss energy can be abstracted into three types of line loss features and that line loss energy can be identified. Used as a high-importance feature, this extends the feature dimensions of line loss analysis from the data perspective; because the feature is difficult for an electricity-stealing user to hide, the accuracy of line loss analysis is greatly improved;
(3) The method establishes a knowledge base, presents the characteristics of electricity-stealing users through superposition of knowledge, and defines how data features and line loss features are superposed. Compared with methods that handle a single type of theft, it flexibly accounts for the same user engaging in multiple theft behaviors, which improves applicability and widens the range of application.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a schematic diagram of the method for calculating the data features of a distribution area;
FIG. 3 is a schematic diagram of user data features in a distribution area;
FIG. 4 is a schematic diagram of distribution area encoding;
FIG. 5 is a schematic diagram of the abstraction of the line loss features of electricity theft behavior;
FIG. 6 is a schematic diagram of the moving multi-line position distance;
FIG. 7 is a schematic diagram of user line loss features in a distribution area;
FIG. 8 is a schematic diagram of the knowledge base;
FIG. 9 is a schematic diagram of user sample vectors in a distribution area;
FIG. 10 is a schematic diagram of the superposition of knowledge in the knowledge base.
Detailed Description
The technical scheme of the invention is described in detail below with reference to the accompanying drawings:
Referring to FIG. 1, a data-driven method for analyzing electricity theft in a low-voltage distribution area comprises the following steps:
S1: perform data filling on the area to be analyzed Z*, and calculate the data features of Z*.
Specifically, the hourly energy data of all meters in the area to be analyzed are taken and denoted Z*; the data period is at least 7 days.
Preferably, data filling is performed on the area to be analyzed Z* as follows:
S11: the data matrix of the area to be analyzed Z* has the form:
$$Z^* = \begin{bmatrix} (X_0)_1 & (X_0)_2 & \cdots & (X_0)_m \\ (X_1)_1 & (X_1)_2 & \cdots & (X_1)_m \\ \vdots & \vdots & \ddots & \vdots \\ (X_n)_1 & (X_n)_2 & \cdots & (X_n)_m \end{bmatrix}$$
where (X_n)_m denotes meter energy, n is the identifier of a meter in the distribution area, m is the energy data moment, and X_0 denotes the area's master meter;
along the vertical axis, (X_0)_i, (X_1)_i, …, (X_n)_i, if the number of missing values exceeds the set value (90%), the energy data at that moment are deleted and the method proceeds to step S12; otherwise step S12 is executed directly.
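Step S11 can be illustrated with a short Python sketch. It assumes the set value is read as a ratio (more than 90% of the meters missing at a given moment), that missing readings are stored as `None`, and that the matrix is a list of rows with the master meter first; the function and variable names are illustrative only.

```python
def drop_sparse_timestamps(z, max_missing_ratio=0.9):
    """Drop the moments (columns) whose share of missing readings exceeds the limit.

    z is a list of rows: row 0 is the master meter X_0, rows 1..n are user meters.
    Missing readings are represented by None.
    """
    n_rows = len(z)
    n_cols = len(z[0])
    keep = []
    for t in range(n_cols):
        missing = sum(1 for row in z if row[t] is None)
        keep.append(missing / n_rows <= max_missing_ratio)
    cleaned = [[v for v, k in zip(row, keep) if k] for row in z]
    return cleaned, keep

z_star = [
    [10.0, None, 12.0, 11.0],   # master meter X_0
    [4.0,  None, 5.0,  None],   # user meter X_1
    [5.0,  None, 6.0,  5.5],    # user meter X_2
]
cleaned, kept = drop_sparse_timestamps(z_star)
```

In this toy matrix the second moment is missing for every meter and is removed; the single gap that remains in the last column is left for the interpolation step.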
S12: missing values in the matrix are filled by polynomial interpolation along the horizontal axis, (X_i)_1, (X_i)_2, …, (X_i)_m, as follows:
the energy data moments and energy values are treated as point coordinates on a two-dimensional plane, with the moment as the abscissa and the energy value as the ordinate. The energy data at k moments are written as (x_1, y_1), (x_2, y_2), …, (x_k, y_k). It is a known mathematical result that, for distinct x values, a polynomial of degree at most k-1 passes through all k points. This polynomial is constructed by substituting the k known coordinates:
$$L(x) = \sum_{j=1}^{k} y_j L_j(x)$$
$$L_j(x) = \prod_{i=1,\, i \neq j}^{k} \frac{x - x_i}{x_j - x_i}$$
where L(x) is the interpolation polynomial, L_j(x) is the interpolation basis function, x is the data moment, y is the energy value, j indexes the summed terms (one per known coordinate), and i indexes the coordinates. If k is too large, the fit no longer reflects the underlying pattern, so k = 24 is chosen based on operating conditions.
Within the range [1, k], the (k+1)-th coordinate (x_{k+1}, y_{k+1}) is introduced, x_{k+1} is substituted into the polynomial of degree k-1, and the corresponding energy value y_{k+1} is obtained, which completes one polynomial interpolation fill;
S13: all missing values are filled in turn by the method of step S12, which completes the data update of the area to be analyzed Z*.
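The interpolation in step S12 matches the standard Lagrange form, which can be sketched as follows. The example uses four known points instead of k = 24 to keep it readable; the function name and sample values are illustrative.

```python
def lagrange_fill(xs, ys, x_new):
    """Evaluate the Lagrange polynomial through (xs[j], ys[j]) at x_new."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        basis = 1.0                      # interpolation basis function L_j(x_new)
        for i, xi in enumerate(xs):
            if i != j:
                basis *= (x_new - xi) / (xj - xi)
        total += yj * basis
    return total

# Hours 1, 2, 4 and 5 are known; hour 3 is missing.
known_hours = [1, 2, 4, 5]
known_energy = [1.0, 4.0, 16.0, 25.0]    # sampled from y = x**2 for checking
filled = lagrange_fill(known_hours, known_energy, 3)
```

Because the sample values lie on a parabola, the cubic through the four points reproduces it and the filled value is 9 up to rounding. With k = 24 equally spaced points the polynomial can oscillate near the ends of the window, which is worth checking on real data.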
After the update, the meter readings, voltage/current, power, and SOE event data of the area to be analyzed Z* are extracted for the same period as the hourly energy data, and the data features of the users in the area are calculated. As shown in FIG. 2, these include: voltage (voltage loss, phase loss, limit violation), current (current loss, negative current), power (excessive fluctuation, continuous decrease), meter reading (sudden increase, sudden decrease), and events (power outage, cover opening). The resulting user data features of area Z* are shown in FIG. 3.
S2: identify the areas Z_i similar to the area to be analyzed Z*, construct a line loss identification model, and decompose the line loss features of the area to be analyzed.
Preferably, the areas Z_i similar to the area to be analyzed Z* are identified as follows:
several distribution areas with normal line loss are extracted in batches from the business system, and the hourly energy data of all meters in those areas are taken as similar-area data. A similar area shares certain attributes with the area to be analyzed, mainly geographical location, load factor, resident composition, and share of commercial electricity use. As shown in FIG. 4, these attributes are encoded, and cosine similarity is then used to calculate the distance between each area and the area to be analyzed Z*:
$$\cos(\alpha, \beta) = \frac{\sum_i \alpha_i \beta_i}{\sqrt{\sum_i \alpha_i^2}\,\sqrt{\sum_i \beta_i^2}}$$
where α is the vector of area Z*, for example α = (0.2, 1, 0, 4), β is a distribution area vector extracted in batches from the business system, and α_i and β_i are the components of α and β respectively. The similarity between α and each area vector β in the business system is calculated in turn, and the areas represented by the L (L = 5) vectors closest to α are selected as similar areas, denoted Z_i. The similarity lies in [-1, 1]: 1 means the two vectors point in the same direction, -1 means opposite directions, and 0 means the two vectors are independent.
Further, data filling is performed on the similar areas Z_i using the method applied to the area to be analyzed Z* in step S1.
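The similar-area selection can be sketched with the standard cosine similarity. The candidate area ids and vectors below are made up for illustration, and `top_l` is set to 2 rather than the L = 5 of the text so that the toy data stays small.

```python
import math

def cosine_similarity(alpha, beta):
    """Standard cosine similarity between two encoded area vectors."""
    dot = sum(a * b for a, b in zip(alpha, beta))
    norm_a = math.sqrt(sum(a * a for a in alpha))
    norm_b = math.sqrt(sum(b * b for b in beta))
    return dot / (norm_a * norm_b)

def most_similar_areas(alpha, candidates, top_l=5):
    """Return the ids of the top_l candidates ranked by similarity to alpha."""
    ranked = sorted(candidates.items(),
                    key=lambda item: cosine_similarity(alpha, item[1]),
                    reverse=True)
    return [area_id for area_id, _ in ranked[:top_l]]

alpha = [0.2, 1, 0, 4]                      # example vector from the text
candidates = {                              # hypothetical encoded areas
    "area_a": [0.4, 2, 0, 8],               # same direction as alpha
    "area_b": [4, 0, 1, 0.2],
    "area_c": [0.2, 1, 1, 3],
}
similar = most_similar_areas(alpha, candidates, top_l=2)
```

Note that cosine similarity ignores vector length, so `area_a`, a scaled copy of α, scores as identical; whether that is desirable depends on how the attributes are encoded.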
The line loss identification model is constructed as follows:
S21: as shown in FIG. 5, the line loss features of all types of electricity theft (short-circuit type, parameter modification type, structure modification type, unmetered consumption type, fault type) are abstracted into the following three types:
I: the energy consumption of the electricity-stealing user is correlated with the line loss energy. When the parameters or structure of the meter are modified, the meter runs slow or loses accuracy, and the line loss energy of the area then shows an obvious correlation with the energy consumption of the electricity-stealing user;
J: the energy consumption of the electricity-stealing user is not correlated with the line loss energy. When the user bypasses the meter, for example by tapping a line ahead of it, the line loss energy is produced entirely by an independent group of electrical equipment; it shows only consumption characteristics of its own and has no correlation with the user's metered energy;
K: the line loss energy is close to a constant value.
The line loss sequence for each single state is expressed as follows:
I = {a_1(X_1)_1 + … + a_n(X_n)_1, …, a_1(X_1)_m + … + a_n(X_n)_m}
J = {b_1, b_2, …, b_m}
Figure BDA0003798538540000111
where (X_n)_m is the meter energy of the area to be analyzed Z*, I is the sequence formed by summing the meter energies multiplied by their coefficients, J is the independent energy consumption sequence, and K is the constant energy sequence, in which a small number of zeros are allowed to account for intermittent consumption. a_n(X_n)_m is an element of line loss sequence I and represents its line loss value, b_m represents a line loss value of sequence J, and {0, c} represents the line loss values of sequence K.
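To make the three sequences concrete, the sketch below builds I, J, and K from given coefficients. It assumes the K sequence takes the value c at active moments and 0 elsewhere, consistent with the {0, c} description; the `active` mask and all sample numbers are illustrative, not values from the patent.

```python
def build_loss_sequences(x_users, a, b, c, active=None):
    """Build the I, J and K line loss sequences.

    x_users: n rows of user-meter energy, each with m moments.
    a: n coefficients, b: m values, c: constant loss level.
    active: optional list of m flags (1 or 0) marking when the constant loss occurs.
    """
    m = len(x_users[0])
    # I_t = a_1 (X_1)_t + ... + a_n (X_n)_t
    seq_i = [sum(a_j * row[t] for a_j, row in zip(a, x_users)) for t in range(m)]
    seq_j = list(b)
    mask = [1] * m if active is None else active
    seq_k = [c * flag for flag in mask]      # each element is 0 or c
    return seq_i, seq_j, seq_k

x_users = [[1.0, 2.0, 3.0],
           [4.0, 5.0, 6.0]]
seq_i, seq_j, seq_k = build_loss_sequences(
    x_users, a=[0.1, 0.5], b=[0.0, 1.0, 0.0], c=2.0, active=[1, 0, 1])
```

With these inputs the I sequence is approximately [2.1, 2.7, 3.3] and the K sequence is [2.0, 0.0, 2.0].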
S22: construct the line loss model of the distribution area, which splits the area's line loss value into the three components I, J, and K and thereby decomposes the area's line loss features, as follows:
Figure BDA0003798538540000121
Figure BDA0003798538540000122
a_j ≥ 0, b_j ≥ 0, c ≥ 0
where η_1, η_2, and η_3 are hyperparameters that constrain the solution search space, m is the energy data moment, and D is the line loss energy of the area to be analyzed Z*, obtained by subtracting the energy of all user meters in the area from the energy of the master meter:
$$D = \left\{ (X_0)_1 - \sum_{j=1}^{n} (X_j)_1,\ \ldots,\ (X_0)_m - \sum_{j=1}^{n} (X_j)_m \right\}$$
d_LIP is the moving multi-line position distance.
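The line loss energy D described above (master meter minus the sum of all user meters at each moment) can be computed directly. The sketch assumes the same row layout as the data matrix, with the master meter in row 0; the numbers are illustrative.

```python
def line_loss(z):
    """Line loss per moment: master meter energy minus the sum of user meters.

    z is a list of rows: row 0 is the master meter X_0, rows 1..n are user meters.
    """
    master, users = z[0], z[1:]
    return [master[t] - sum(row[t] for row in users) for t in range(len(master))]

z_star = [
    [10.0, 12.0, 11.0],   # master meter X_0
    [4.0,  5.0,  4.5],    # user meter X_1
    [5.0,  6.0,  5.5],    # user meter X_2
]
d = line_loss(z_star)
```

This assumes missing values have already been filled, since `None` entries would raise an error in the subtraction.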
J = {b_1, b_2, …, b_m} is split into subsequences J_sub, and the moving multi-line position distance d_LIP is calculated between each J_sub and the energy curve sequences of the similar areas Z_i, using the following formula:
Figure BDA0003798538540000124
Figure BDA0003798538540000125
where, as shown in FIG. 6, t_p is an intersection point of J_sub with the energy curve sequence of area Z_i, and Area_p is the area of the polygon enclosed between J_sub and the energy curve of area Z_i. When the area is 0, the two trajectories coincide and there is no gap between them.
The calculation proceeds as follows: the sequence values themselves are ignored and only the trend of the sequence is considered. Taking the point corresponding to each energy value on the energy curve sequence of Z_i as a reference in turn, a given J_sub sequence is slid vertically and horizontally, the distance between J_sub and Z_i is calculated, and the minimum distance matched during sliding is recorded. The minimum matched distances of all J_sub are summed and used as a constraint term of the line loss model.
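The sliding match can be sketched as below, with one important substitution: the d_LIP formula itself survives only as a figure reference, so this sketch uses a plain, unweighted area between the two curves (trapezoid rule on the absolute difference) as a stand-in distance. Vertical sliding is approximated by subtracting each window's mean so that only the trend is compared. All names are illustrative.

```python
def centered(seq):
    """Remove the mean level so that only the trend is compared."""
    mean = sum(seq) / len(seq)
    return [v - mean for v in seq]

def area_between(p, q):
    """Stand-in distance: approximate area between two equally sampled curves."""
    diff = [abs(a - b) for a, b in zip(centered(p), centered(q))]
    return sum((diff[t] + diff[t + 1]) / 2.0 for t in range(len(diff) - 1))

def min_sliding_distance(j_sub, curve):
    """Slide j_sub along the curve and keep the smallest distance found."""
    w = len(j_sub)
    return min(area_between(j_sub, curve[s:s + w])
               for s in range(len(curve) - w + 1))

def constraint_term(j_subs, curve):
    """Sum of the minimum matched distances over all subsequences."""
    return sum(min_sliding_distance(js, curve) for js in j_subs)

curve = [1.0, 2.0, 4.0, 2.0, 1.0, 1.0]      # energy curve of a similar area
best = min_sliding_distance([12.0, 14.0, 12.0], curve)
flat = min_sliding_distance([0.0, 1.0], [3.0, 3.0, 3.0])
total = constraint_term([[12.0, 14.0, 12.0], [5.0, 5.0]], curve)
```

The first subsequence has the same shape as the peak in `curve` at a different level, so its best match has distance close to zero; a rising segment against a flat curve gives 0.5.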
The line loss features of the area to be analyzed are decomposed as follows: minLoss(a, b, c) is an optimization problem without an explicit expression, so a meta-heuristic algorithm is used to search the solution space. High solution accuracy is not required here, and the specific value of each line loss feature need not be calculated; only a qualitative result is needed. When the algorithm converges, the values of the I, J, and K line loss features are evaluated: a_j tending to 0 indicates no I feature, b_j tending to 0 indicates no J feature, and c tending to 0 indicates no K feature; otherwise the corresponding line loss feature is present. This yields the line loss features of the distribution area.
The users in a distribution area inherit the line loss features of the area's line loss energy, which gives the user line loss features of area Z*, as shown in FIG. 7.
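The qualitative reading of the converged solution can be sketched as a simple threshold check. The tolerance `tol` that decides when a coefficient "tends to 0" is an assumption; the text does not specify one, and the optimizer that produces a, b, and c is not shown here.

```python
def present_features(a, b, c, tol=1e-3):
    """Return the set of line loss features judged present after convergence.

    a, b: lists of converged coefficients; c: converged constant level.
    tol: assumed tolerance below which a value counts as tending to 0.
    """
    features = set()
    if max(a, default=0.0) > tol:
        features.add("I")
    if max(b, default=0.0) > tol:
        features.add("J")
    if c > tol:
        features.add("K")
    return features

found = present_features(a=[0.0, 0.2], b=[1e-6, 0.0], c=0.5)
```

Here one of the a coefficients and c are clearly non-zero while all b values are negligible, so the area is judged to carry the I and K features.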
S3: and constructing a knowledge base and defining logic deduction rules of the knowledge base.
As shown in fig. 8, the method for constructing the knowledge base is as follows: analyzing and aggregating data under the existing system, including: and (3) counting the data characteristics of known electricity stealing users and suspected electricity stealing users under the system, carrying out knowledge creation by combining the power business knowledge, and establishing the relationship from the data characteristics and the line loss characteristics to the electricity stealing events to form a knowledge base. Considering the electricity consumption condition when the electricity stealing event occurs, the associated line loss characteristics in the knowledge base are carried out according to the following principles:
A. when the line loss electricity and the user's electricity are generated by the same group of electrical equipment in the same period, the line loss feature is I;
B. when the line loss electricity is generated by completely independent electrical equipment, the line loss feature is J;
C. when the line loss electricity can be controlled or intervened in manually, the line loss feature is K;
if an electricity theft event satisfies more than one of A, B and C, a corresponding number of knowledge entries is formed for that event.
In the knowledge base, for the data features, 0 indicates that the feature was not counted, 1 indicates a count of 1 to 3 times, and 2 indicates a count of more than 3 times; I, J and K are the values of the line loss feature. The knowledge base is essentially a multi-class decision set with uncertainty: the same electricity theft event can have several knowledge entries. This matches the nature of electricity theft, in which the same behavior can exhibit different features.
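The 0/1/2 encoding of the data features can be written as a small helper. This is only a sketch; the function name `encode_count` is hypothetical and the rejection of negative counts is an added safeguard not mentioned in the text.

```python
def encode_count(times: int) -> int:
    """Map a raw occurrence count of a data feature to its knowledge base code.

    0 -> feature not counted, 1 -> counted 1 to 3 times, 2 -> more than 3 times.
    """
    if times < 0:
        raise ValueError("a count cannot be negative")
    if times == 0:
        return 0
    if times <= 3:
        return 1
    return 2
```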
The logical deduction rules of the knowledge base yield one of the following three results when a user's electricity theft sample is judged:
first, the sample definitely belongs to the knowledge base set, that is, the line loss features of the sample can be formed from one knowledge entry or from a combination of several knowledge entries in the knowledge base, and the user is judged to be a suspected electricity-stealing user;
second, whether the sample belongs to the knowledge base set is described through approximate knowledge: the matching degree is calculated with the columns of the knowledge base as attributes, the knowledge combinations whose matching degree exceeds a preset value are selected, and the user is judged to be a suspected electricity-stealing user according to the electricity theft event that is dominant among those combinations;
third, the sample definitely does not belong to the knowledge base set, and electricity theft is ruled out for the user.
Further, the logical deduction rules of the knowledge base also define superposition operators for the I, J and K line loss features:
$I \cap J = J^*$
$I \cap K = I^*$
$J \cap K = J$
where $\cap$ denotes that the two data sequences occur simultaneously. The equations above describe the relationships obtained when I, J and K are combined:
$I \cap J = J^*$: when I and J are superposed (that is, an uncorrelated relationship is superposed on a correlated one), a new relationship $J^*$ arises in which the electricity consumption of the electricity-stealing user is uncorrelated with the line loss electricity.
$I \cap K = I^*$: when I and K are superposed, a Y-axis component is added; the correlation itself is unchanged but its coefficient is raised, producing a new correlation $I^*$ between the electricity consumption of the electricity-stealing user and the line loss electricity.
$J \cap K = J$: when J and K are superposed, only trend features are analyzed; a Y-axis component is added to the trend, the trend feature is unchanged, and the superposition result is still J.
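The three superposition rules for I, J and K can be held in a lookup table. The sketch below rests on two assumptions that the text does not state: superposing a feature with itself returns it unchanged, and any pair not covered by the three rules (for example a derived feature such as J* combined with K) is rejected. The function name `superpose` is hypothetical.

```python
# Superposition results for unordered pairs of distinct line loss features.
_SUPERPOSITION = {
    frozenset({"I", "J"}): "J*",  # uncorrelated superposed on correlated
    frozenset({"I", "K"}): "I*",  # correlation kept, coefficient raised
    frozenset({"J", "K"}): "J",   # trend feature unchanged
}


def superpose(x: str, y: str) -> str:
    """Superpose two line loss features from {I, J, K}."""
    if x not in ("I", "J", "K") or y not in ("I", "J", "K"):
        raise ValueError("superposition is only defined here for I, J and K")
    if x == y:
        return x  # assumption: superposition is idempotent
    return _SUPERPOSITION[frozenset({x, y})]
```

Keying the table on a `frozenset` makes the operator symmetric, so `superpose("K", "I")` and `superpose("I", "K")` give the same result.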
When a user's electricity theft sample is judged, if its line loss features can be formed by combining several knowledge entries in the knowledge base, those entries are superposed as follows: the line loss features are superposed according to the superposition operators defined above, the data features are combined according to the larger-value-dominant principle, and the corresponding electricity theft events are superposed to form an array set. The user is then judged to be a suspected electricity-stealing user exhibiting several kinds of theft behavior.
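Superposing two knowledge entries can be sketched as below. The `Knowledge` record layout, the function name `superpose_knowledge` and the event labels in the usage example are hypothetical, and treating identical line loss features as unchanged under superposition is an assumption; only the three operator rules, the larger-value-dominant rule and the merging of theft events come from the text.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# Operator rules for distinct line loss features, as defined in the text.
_LOSS_TABLE = {
    frozenset({"I", "J"}): "J*",
    frozenset({"I", "K"}): "I*",
    frozenset({"J", "K"}): "J",
}


@dataclass(frozen=True)
class Knowledge:
    data_features: Tuple[int, ...]  # encoded counts (0, 1 or 2) per column
    loss_feature: str               # "I", "J" or "K"
    events: FrozenSet[str]          # associated electricity theft events


def superpose_knowledge(p: Knowledge, q: Knowledge) -> Knowledge:
    """Combine two knowledge entries into one superposed entry."""
    if len(p.data_features) != len(q.data_features):
        raise ValueError("entries must have the same number of data columns")
    # Larger value dominates, column by column.
    data = tuple(max(x, y) for x, y in zip(p.data_features, q.data_features))
    if p.loss_feature == q.loss_feature:
        loss = p.loss_feature  # assumption: same feature stays unchanged
    else:
        loss = _LOSS_TABLE[frozenset({p.loss_feature, q.loss_feature})]
    # Theft events are merged into one set.
    return Knowledge(data, loss, p.events | q.events)
```

For instance, superposing an entry `((0, 1, 2), "I", {"event_1"})` with `((2, 1, 0), "K", {"event_2"})` gives data features `(2, 1, 2)`, line loss feature `I*` and both events.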
S4: Combine the line loss features and the data features of the users under the zone $Z^*$ to be analyzed into a sample vector group, and perform electricity theft analysis using the knowledge base.
Taking the zone $Z^*$ to be analyzed as an example, as shown in Fig. 9, several line loss features may be identified for user A, and each associated line loss feature is combined with the user's data features to form the sample vector group. Here the data features of user A are combined with the I and K line loss features obtained from line loss identification and decomposition, forming two sample vectors.
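Forming the sample vector group amounts to pairing the user's data features with each identified line loss feature. The sketch below assumes a sample vector is simply the data features followed by one line loss label; the function name `build_sample_vectors` and the feature values in the example are hypothetical.

```python
from typing import List, Sequence, Tuple


def build_sample_vectors(data_features: Sequence[int],
                         loss_features: Sequence[str]) -> List[Tuple]:
    """Return one sample vector per identified line loss feature."""
    return [tuple(data_features) + (feature,) for feature in loss_features]


# User A with hypothetical data features and the I and K line loss features.
vectors = build_sample_vectors([0, 2, 1], ["I", "K"])
```

With two identified line loss features, `vectors` holds two sample vectors, as in the user A example.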
Continuing with user A as an example, the analysis proceeds as follows:
(1) If the sample vector of user A is completely matched by one knowledge entry in the knowledge base, user A is matched as a suspected electricity-stealing user.
If the sample vector of user A cannot be matched by a single knowledge entry in the knowledge base but can be represented by the superposition ($\cap$) of several knowledge entries:
Figure BDA0003798538540000151
the data features follow the larger-value-dominant principle, for example $0 \cap 2 = \max(0, 2) = 2$, and the electricity theft events are superposed to form an array set, as shown in Fig. 10. In this case user A is a suspected electricity-stealing user exhibiting 2 kinds of theft behavior, which is consistent with the actual situation, so user A is matched as a suspected electricity-stealing user.
(2) If the sample vector of user A cannot be represented exactly by one knowledge entry or by the superposition ($\cap$) of several knowledge entries in the knowledge base, several groups of approximate representations are formed by superposing knowledge entries. The matching degree is then calculated with the columns as attributes (matching degree = number of matching columns / total number of columns), the knowledge combinations with a matching degree above 70% are selected, the set of electricity theft events of those combinations is examined, and the dominant theft event (the one occurring far more often than the others) is selected; user A is matched as a suspected electricity-stealing user.
(3) If user A satisfies none of the above matching mechanisms, user A is judged to be a normal electricity user.
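The approximate matching in case (2) can be sketched as follows. Several simplifications are assumptions rather than part of the source: each candidate is a single row of columns with one associated event, "dominant" is reduced to "most frequent", and the function names and event labels are hypothetical. The 70% threshold and the matching degree formula are taken from the text.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def matching_degree(sample: Sequence, knowledge: Sequence) -> float:
    """Matching degree = number of matching columns / total number of columns."""
    if len(sample) != len(knowledge) or len(sample) == 0:
        raise ValueError("sample and knowledge need the same non-zero length")
    hits = sum(1 for s, k in zip(sample, knowledge) if s == k)
    return hits / len(sample)


def dominant_event(sample: Sequence,
                   candidates: List[Tuple[Sequence, str]],
                   threshold: float = 0.7) -> Optional[str]:
    """Return the most frequent event among candidates above the threshold.

    Returns None when no candidate matches well enough, which corresponds
    to case (3): the user is treated as a normal electricity user.
    """
    events = [event for columns, event in candidates
              if matching_degree(sample, columns) > threshold]
    if not events:
        return None
    return Counter(events).most_common(1)[0][0]
```

A stricter notion of dominance, such as requiring the top event to exceed the runner-up by a fixed ratio, could replace the `most_common` call if the data call for it.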
The invention provides a methodology for observing and analyzing electricity theft in a distribution zone from a data perspective. By iteratively updating the knowledge base, the theft analysis capability can be continuously optimized so that the method adapts to future environments.

Claims (7)

1. A data-driven method for analyzing electricity theft in a low-voltage distribution zone, characterized by comprising the following steps:
S1: calculating the data features of the zone $Z^*$ to be analyzed;
S2: collecting the similar zones $Z_i$ of the zone $Z^*$ to be analyzed, constructing a line loss identification model, and decomposing the line loss features of the zone to be analyzed;
the method for constructing the line loss identification model comprises the following steps:
S21: the line loss features of all types of electricity theft behavior are abstracted into the following three types:
I: the electricity consumption of the electricity-stealing user is correlated with the line loss electricity;
J: the electricity consumption of the electricity-stealing user is not correlated with the line loss electricity;
K: the line loss electricity is close to a constant value;
the line loss sequences are expressed as follows:
$I = \{a_1 (X_1)_1 + \dots + a_n (X_n)_1, \dots, a_1 (X_1)_m + \dots + a_n (X_n)_m\}$
$J = \{b_1, b_2, \dots, b_m\}$
Figure FDA0004260166170000011
wherein $(X_n)_m$ is the electricity data of the zone $Z^*$ to be analyzed, I is the sequence formed by summing the meter electricity values multiplied by coefficients, J is an independent sequence of electricity values, and K is a constant electricity sequence;
S22: constructing the line loss model of the zone:
Figure FDA0004260166170000012
Figure FDA0004260166170000013
$a_j \ge 0,\ b_j \ge 0,\ c \ge 0$
wherein $\eta_1$, $\eta_2$ and $\eta_3$ are hyperparameters used to constrain the solution search space, m is the time index of the electricity data, D represents the line loss electricity of the zone $Z^*$ to be analyzed, and $d_{LIP}$ is the moving multi-line position distance; $J = \{b_1, b_2, \dots, b_m\}$ is split into subsequences $J_{sub}$, and the moving multi-line position distance $d_{LIP}$ between $J_{sub}$ and the electricity curve sequences in the similar zones $Z_i$ is calculated by the following formula:
Figure FDA0004260166170000021
Figure FDA0004260166170000022
wherein $t_p$ is an intersection point of $J_{sub}$ with the electricity curve sequence in zone $Z_i$, and $area_p$ is the area of the polygonal region enclosed by $J_{sub}$ and the electricity curve in zone $Z_i$;
taking the point corresponding to each electricity value on the electricity curve sequence in $Z_i$ as a reference in turn, a given $J_{sub}$ sequence is slid in the vertical and horizontal directions, the distance between $J_{sub}$ and $Z_i$ is calculated, and the minimum distance matched during sliding is recorded;
the matched minimum distances of all $J_{sub}$ are summed and used as a constraint term of the line loss model;
S3: constructing a knowledge base and defining the logical deduction rules of the knowledge base;
the method for constructing the knowledge base comprises: counting the data features of known and suspected electricity-stealing users in the system, creating knowledge in combination with power business knowledge, and establishing the relationship from data features and line loss features to electricity theft events to form the knowledge base, wherein the line loss features are associated in the knowledge base according to the following principles:
A. when the line loss electricity and the user's electricity are generated by the same group of electrical equipment in the same period, the line loss feature is I;
B. when the line loss electricity is generated by completely independent electrical equipment, the line loss feature is J;
C. when the line loss electricity can be controlled or intervened in manually, the line loss feature is K;
if an electricity theft event satisfies more than one of A, B and C, a corresponding number of knowledge entries is formed for that event;
the logical deduction rules of the knowledge base yield one of the following three results when a user's electricity theft sample is judged:
first, the sample definitely belongs to the knowledge base set, that is, the line loss features of the sample can be formed from one knowledge entry or from a combination of several knowledge entries in the knowledge base, and the user is judged to be a suspected electricity-stealing user; second, whether the sample belongs to the knowledge base set is described through approximate knowledge: the matching degree is calculated with the columns of the knowledge base as attributes, the knowledge combinations whose matching degree exceeds a preset value are selected, and the user is judged to be a suspected electricity-stealing user according to the electricity theft event that is dominant among those combinations;
third, the sample definitely does not belong to the knowledge base set, and electricity theft is ruled out for the user;
S4: combining the line loss features and the data features of the users under the zone $Z^*$ to be analyzed into a sample vector group, and performing electricity theft analysis using the knowledge base.
2. The data-driven method for analyzing electricity theft in a low-voltage distribution zone according to claim 1, wherein the method for decomposing the line loss features of the zone to be analyzed in step S2 comprises: searching the solution space of min Loss(a, b, c) using a meta-heuristic algorithm, and, when the algorithm converges, examining the values of the line loss features I, J and K to obtain the line loss features of the zone.
3. The data-driven method for analyzing electricity theft in a low-voltage distribution zone according to claim 1, wherein the logical deduction rules of the knowledge base in step S3 further define superposition operators for the I, J and K line loss features:
$I \cap J = J^*$
$I \cap K = I^*$
$J \cap K = J$
where $\cap$ denotes that the two data sequences occur simultaneously;
$I \cap J = J^*$ indicates that when I and J are superposed, a new relationship $J^*$ arises in which the electricity consumption of the electricity-stealing user is not correlated with the line loss electricity;
$I \cap K = I^*$ indicates that when I and K are superposed, a new correlation $I^*$ arises between the electricity consumption of the electricity-stealing user and the line loss electricity;
$J \cap K = J$ indicates that when J and K are superposed, the trend feature is unchanged and the superposition result is J;
when a user's electricity theft sample is judged, if its line loss features can be formed by combining several knowledge entries in the knowledge base, then in superposing those entries the line loss features are superposed according to the definition of the superposition operators, the data features are combined according to the larger-value-dominant principle, the corresponding electricity theft events are superposed to form an array set, and the user is judged to be a suspected electricity-stealing user exhibiting several kinds of theft behavior.
4. The data-driven method for analyzing electricity theft in a low-voltage distribution zone according to claim 1, wherein, before the data features of the zone $Z^*$ to be analyzed are calculated in step S1, data filling processing is first performed on the zone $Z^*$ to be analyzed, as follows:
S11: the data matrix of the zone $Z^*$ to be analyzed has the form:
Figure FDA0004260166170000042
wherein $(X_n)_m$ represents the electricity of a meter, n is the identifier of the meter under the zone, m is the time of the electricity data, and $X_0$ denotes the summary meter of the zone;
in the vertical-axis direction $(X_0)_i, (X_1)_i, \dots, (X_n)_i$, if the amount of missing data exceeds a set value, the electricity data at that time are deleted and step S12 is then executed; otherwise step S12 is executed directly;
S12: for the missing data in the matrix, the missing values are filled by polynomial interpolation in the horizontal-axis direction $(X_i)_1, (X_i)_2, \dots, (X_i)_m$, with the following specific steps:
the time of the electricity data and the electricity value are regarded as the coordinates of a point on a two-dimensional plane, with the time as the abscissa and the electricity value as the ordinate; the electricity data at k times are $(x_1, y_1), (x_2, y_2), \dots, (x_k, y_k)$; a polynomial of degree k-1 is constructed by substituting the k known coordinates, as follows:
$L(x) = \sum_{j=1}^{k} y_j \, l_j(x)$
$l_j(x) = \prod_{i=1,\, i \ne j}^{k} \frac{x - x_i}{x_j - x_i}$
where $L(x)$ is the interpolation expression, $l_j(x)$ is the interpolation basis function, x is the value of the data time, y is the electricity value, and j is the summation index;
a (k+1)-th coordinate $(x_{k+1}, y_{k+1})$ with $x_{k+1}$ in the range $[1, k]$ is introduced; substituting $x_{k+1}$ into the polynomial of degree k-1 gives the corresponding electricity value $y_{k+1}$, which completes one polynomial interpolation fill;
S13: all missing data are filled in turn according to the method of step S12, completing the data update of the zone $Z^*$ to be analyzed.
5. The data-driven method for analyzing electricity theft in a low-voltage distribution zone according to claim 4, wherein, when the data filling processing is performed on the zone $Z^*$ to be analyzed in step S1, k = 24.
6. The data-driven method for analyzing electricity theft in a low-voltage distribution zone according to claim 1, wherein the method for collecting the similar zones $Z_i$ of the zone $Z^*$ to be analyzed in step S2 comprises: extracting zone vectors in batches from the business system, and calculating the distance between each zone and the zone $Z^*$ to be analyzed using cosine similarity:
$\cos(\alpha, \beta) = \dfrac{\sum_{i} \alpha_i \beta_i}{\sqrt{\sum_{i} \alpha_i^2}\,\sqrt{\sum_{i} \beta_i^2}}$
wherein $\alpha$ represents the vector of zone $Z^*$, $\beta$ represents a zone vector extracted in batches from the business system, and $\alpha_i$, $\beta_i$ are the components of the vectors $\alpha$ and $\beta$ respectively;
the similarity between $\alpha$ and each zone vector $\beta$ in the business system is calculated in turn, and the L zones closest to the zone $Z^*$ to be analyzed are selected as the similar zones $Z_i$.
7. The data-driven method for analyzing electricity theft in a low-voltage distribution zone according to claim 4, wherein step S2 further comprises performing data filling processing on the similar zones $Z_i$ according to the method used in step S1 for the data filling processing of the zone $Z^*$ to be analyzed.
CN202210976257.5A 2022-08-15 2022-08-15 Low-voltage distribution area electricity larceny analysis method based on data driving Active CN115330202B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210976257.5A CN115330202B (en) 2022-08-15 2022-08-15 Low-voltage distribution area electricity larceny analysis method based on data driving

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210976257.5A CN115330202B (en) 2022-08-15 2022-08-15 Low-voltage distribution area electricity larceny analysis method based on data driving

Publications (2)

Publication Number Publication Date
CN115330202A CN115330202A (en) 2022-11-11
CN115330202B true CN115330202B (en) 2023-07-11

Family

ID=83922984

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210976257.5A Active CN115330202B (en) 2022-08-15 2022-08-15 Low-voltage distribution area electricity larceny analysis method based on data driving

Country Status (1)

Country Link
CN (1) CN115330202B (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103023149A (en) * 2012-12-12 2013-04-03 天津市电力公司 Intelligent power distribution terminal and intelligent power distribution system based on IEC61850
CN105373877A (en) * 2015-09-14 2016-03-02 江苏南瑞通驰自动化系统有限公司 Electricity utilization trend anomaly suspicion analysis and anti-electric-larceny monitoring system
KR102588688B1 (en) * 2016-05-12 2023-10-12 한국전자통신연구원 Method and system for analyzing data
CN111930802A (en) * 2020-08-01 2020-11-13 青岛鼎信通讯股份有限公司 Anti-electricity-stealing analysis method based on Lasso analysis
CN114019205A (en) * 2021-07-16 2022-02-08 国家电网有限公司技术学院分公司 Electricity stealing identification method and system
CN114862139B (en) * 2022-04-19 2023-12-22 国网江苏省电力有限公司南通供电分公司 Data-driven-based abnormal diagnosis method for line loss rate of transformer area

Also Published As

Publication number Publication date
CN115330202A (en) 2022-11-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant