CN112330136A - Relevance mining method and device for abnormal electricity utilization analysis data set of large user - Google Patents

Info

Publication number
CN112330136A
Authority
CN
China
Prior art keywords
item
analysis data
user
frequent
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011203430.5A
Other languages
Chinese (zh)
Inventor
陆婋泉
周玉
邵雪松
蔡奇新
季欣荣
段梅梅
易永仙
崔高颖
祝宇楠
宋瑞鹏
高雨翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Jiangsu Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Jiangsu Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, Electric Power Research Institute of State Grid Jiangsu Electric Power Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN202011203430.5A priority Critical patent/CN112330136A/en
Publication of CN112330136A publication Critical patent/CN112330136A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/80 Management or planning
    • Y02P90/82 Energy audits or management systems therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Databases & Information Systems (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Fuzzy Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Educational Administration (AREA)
  • Water Supply & Treatment (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for mining the relevance of a large-user abnormal electricity utilization analysis data set. The method comprises the following steps: performing support degree judgment on the large-user abnormal electricity utilization analysis data set to obtain a plurality of screened frequent item sets; and performing confidence judgment on the abnormal electricity utilization analysis data set through the screened frequent item sets to obtain a large-user abnormal electricity utilization analysis data set with relevance. By judging the support degree and the confidence degree, the invention mines the relevance among the item sets of the large-user abnormal electricity utilization analysis data set, reduces the volume of data that the abnormal electricity utilization analysis algorithm has to process, effectively improves system analysis efficiency, and has good application prospects.

Description

Relevance mining method and device for abnormal electricity utilization analysis data set of large user
Technical Field
The invention relates to a method and a device for mining relevance of an abnormal electricity utilization analysis data set of a large user, and belongs to the technical field of power systems.
Background
As efforts to investigate and punish abnormal electricity use intensify, electricity theft is becoming more concealed and its technical means more sophisticated; abnormal electricity use is now specialized, intelligent and hidden. Conventional statistical analysis is no longer sufficient, and new-generation artificial intelligence technology is becoming the mainstream approach. A standardized data set is the basis of big-data machine learning analysis, but the large-user abnormal electricity utilization analysis data set contains many kinds of data items and a large volume of data, so analyzing it requires an extremely large amount of computation. The data set therefore needs to be mined to determine the association relations among its data items and to prune data that carries no useful association. This reduces the subsequent amount of computation, allows artificial intelligence analysis techniques to be applied to abnormal electricity use analysis, and supports further intelligent analysis and deeper application of the data.
Disclosure of Invention
To overcome the above deficiencies, the invention discloses a method and a device for mining the relevance of a large-user abnormal electricity utilization analysis data set. It addresses the problem that the large data volume of existing large-user abnormal electricity utilization analysis data sets leads to a large amount of subsequent computation, and it reduces the volume of data processed by the abnormal electricity utilization analysis algorithm by mining the relevance among multi-source data.
In order to achieve the purpose, the invention adopts the technical scheme that: a relevance mining method for a large-user abnormal electricity utilization analysis data set comprises the following steps:
carrying out support degree judgment on the abnormal electricity utilization analysis data set of the large user to obtain a plurality of screened frequent item sets;
and performing confidence judgment on the abnormal electricity utilization analysis data set through the screened frequent item sets to obtain a large-user abnormal electricity utilization analysis data set with relevance.
Further, the large-user abnormal electricity utilization analysis data set S is composed of n user information records:
S = {S_1, S_2, ..., S_x, ..., S_n} (29)
where x = 1, 2, ..., n, and the x-th user information S_x is:
S_x = {L_id, L_cl, L_cal, C_tv(t), T_ag} (30)
where L_id is the positioning information module, L_cl is the category information module, L_cal is the calculation information module, C_tv(t) is the time-varying information module, t is time, and T_ag is the calibration information module.
Further, the positioning information module L_id is used for marking the various types of positioning information of the user, and comprises the items:
L_id = {n_an, n_cn, n_id, n_ea, n_lo, n_oi} (31)
where n_an is the electric meter bureau number element, n_cn is the user number element, n_id is the user name element, n_ea is the electricity-use address element, n_lo is the line number element, and n_oi is the acquisition object identity information element;
the category information module L_cl comprises the items:
L_cl = {n_csc, n_wm, n_sc, n_tc, n_on, n_etc, n_ml, n_mm} (32)
where n_csc is the user type element, n_wm is the wiring mode element, n_sc is the electric meter category element, n_tc is the industry classification element, n_on is the organization code element, n_etc is the electrical property element, n_ml is the metering point level element, and n_mm is the metering mode element;
the calculation information module L_cal comprises the items:
L_cal = {n_cc, n_mc, n_vc, n_mvc, n_mrc, n_tf, n_ctf, n_vtf} (33)
where n_cc is the contract capacity element, n_mc is the metering point capacity element, n_vc is the user voltage class element, n_mvc is the electric meter rated voltage element, n_mrc is the electric meter rated current element, n_tf is the comprehensive multiplying power element, n_ctf is the current transformer multiplying power element, and n_vtf is the voltage transformer multiplying power element;
the time-varying information module C_tv(t) comprises the items:
C_tv(t) = {I(t), U(t), P(t), E(t), L(t)} (34)
where I(t) is the current curve element, U(t) is the voltage curve element, P(t) is the power curve element, E(t) is the daily frozen electric quantity curve element, and L(t) is the line loss curve element;
the calibration information module T_ag comprises the exception category element T_type:
T_ag = {T_type} (35)
The exception category elements include: no abnormality, voltage loss, current loss, series connection, and three-phase imbalance.
Further, performing support degree judgment on the large-user abnormal electricity utilization analysis data set to obtain a plurality of screened frequent item sets comprises the following steps:
extracting the candidate 1-item sets from the large-user abnormal electricity utilization analysis data set S and calculating their support degree:
[Equation image: support degree of an element value in a candidate 1-item set]
where a candidate 1-item set is a set of items containing only one element, formed from the m-th element of the x-th user information in S; the support degree of an element value is the number of times that value occurs in the candidate 1-item set relative to the total number of occurrences of all values of the m-th information element in S;
retaining the element values in the candidate 1-item set whose support degree is greater than the minimum frequency threshold S_min, to obtain the frequent 1-item set:
[Equation image: frequent 1-item set]
extracting the candidate 2-item sets and calculating their support degree:
[Equation image: support degree of a candidate 2-item set]
where a candidate 2-item set is the set of pairs formed from the m-th element and the l-th element of the x-th user information in S, and the count in the formula is the number of times the two element values occur simultaneously in the candidate 2-item set;
screening the frequent 2-item set, which must satisfy:
[Equation image: support degree greater than S_min]
and simultaneously satisfy:
[Equation image: condition on the frequent 1-item set elements]
that is, the frequent 2-item set consists of the candidate 2-item sets whose support degree is greater than the minimum frequency threshold S_min, and the 2-item sets that do not contain frequent 1-item set elements are excluded through formula (12);
calculating in sequence until the frequent k-item set is obtained and the (k+1)-item set no longer satisfies the support threshold condition.
Further, performing confidence judgment on the abnormal electricity utilization analysis data set through the screened frequent item sets to obtain the large-user abnormal electricity utilization analysis data set with relevance comprises the following steps:
starting the confidence judgment from the frequent 2-item set; for a 2-element set in the frequent 2-item set, the probability that the second element of the item set occurs given that the first element occurs is the confidence of the 2-item set, and the confidence function of the 2-item set is expressed as:
[Equation image: confidence of a 2-element set]
where the two quantities in the formula are the support degree of the 2-element set in the frequent 2-item set and the support degree of its first element;
judging whether the minimum confidence threshold C_min is met, to determine the strongly correlated frequent 2-item set:
[Equation image: strongly correlated frequent 2-item set]
judging in sequence to obtain the strongly correlated frequent k-item set, and thereby obtaining the large-user abnormal electricity utilization analysis data set with relevance.
An association mining device for a large-user abnormal electricity utilization analysis data set comprises:
the frequent item set screening module is used for judging the support degree of the abnormal electricity utilization analysis data set of the large user to obtain a plurality of screened frequent item sets;
and the relevance mining module is used for carrying out confidence judgment on the abnormal electricity utilization analysis data set through the screened multiple frequent item sets to obtain a large-user abnormal electricity utilization analysis data set with relevance.
Further, the large-user abnormal electricity utilization analysis data set S is composed of n user information records:
S = {S_1, S_2, ..., S_x, ..., S_n} (43)
where x = 1, 2, ..., n, and the x-th user information S_x is:
S_x = {L_id, L_cl, L_cal, C_tv(t), T_ag} (44)
where L_id is the positioning information module, L_cl is the category information module, L_cal is the calculation information module, C_tv(t) is the time-varying information module, t is time, and T_ag is the calibration information module.
Further, the positioning information module L_id is used for marking the various types of positioning information of the user, and comprises the items:
L_id = {n_an, n_cn, n_id, n_ea, n_lo, n_oi} (45)
where n_an is the electric meter bureau number element, n_cn is the user number element, n_id is the user name element, n_ea is the electricity-use address element, n_lo is the line number element, and n_oi is the acquisition object identity information element;
the category information module L_cl comprises the items:
L_cl = {n_csc, n_wm, n_sc, n_tc, n_on, n_etc, n_ml, n_mm} (46)
where n_csc is the user type element, n_wm is the wiring mode element, n_sc is the electric meter category element, n_tc is the industry classification element, n_on is the organization code element, n_etc is the electrical property element, n_ml is the metering point level element, and n_mm is the metering mode element;
the calculation information module L_cal comprises the items:
L_cal = {n_cc, n_mc, n_vc, n_mvc, n_mrc, n_tf, n_ctf, n_vtf} (47)
where n_cc is the contract capacity element, n_mc is the metering point capacity element, n_vc is the user voltage class element, n_mvc is the electric meter rated voltage element, n_mrc is the electric meter rated current element, n_tf is the comprehensive multiplying power element, n_ctf is the current transformer multiplying power element, and n_vtf is the voltage transformer multiplying power element;
the time-varying information module C_tv(t) comprises the items:
C_tv(t) = {I(t), U(t), P(t), E(t), L(t)} (48)
where I(t) is the current curve element, U(t) is the voltage curve element, P(t) is the power curve element, E(t) is the daily frozen electric quantity curve element, and L(t) is the line loss curve element;
the calibration information module T_ag comprises the exception category element T_type:
T_ag = {T_type} (49)
The exception category elements include: no abnormality, voltage loss, current loss, series connection, and three-phase imbalance.
Further, performing support degree judgment on the large-user abnormal electricity utilization analysis data set to obtain a plurality of screened frequent item sets comprises the following steps:
extracting the candidate 1-item sets from the large-user abnormal electricity utilization analysis data set S and calculating their support degree:
[Equation image: support degree of an element value in a candidate 1-item set]
where a candidate 1-item set is a set of items containing only one element, formed from the m-th element of the x-th user information in S; the support degree of an element value is the number of times that value occurs in the candidate 1-item set relative to the total number of occurrences of all values of the m-th information element in S;
retaining the element values in the candidate 1-item set whose support degree is greater than the minimum frequency threshold S_min, to obtain the frequent 1-item set:
[Equation image: frequent 1-item set]
extracting the candidate 2-item sets and calculating their support degree:
[Equation image: support degree of a candidate 2-item set]
where a candidate 2-item set is the set of pairs formed from the m-th element and the l-th element of the x-th user information in S, and the count in the formula is the number of times the two element values occur simultaneously in the candidate 2-item set;
screening the frequent 2-item set, which must satisfy:
[Equation image: support degree greater than S_min]
and simultaneously satisfy:
[Equation image: condition on the frequent 1-item set elements]
that is, the frequent 2-item set consists of the candidate 2-item sets whose support degree is greater than the minimum frequency threshold S_min, and the 2-item sets that do not contain frequent 1-item set elements are excluded through formula (12);
calculating in sequence until the frequent k-item set is obtained and the (k+1)-item set no longer satisfies the support threshold condition.
Further, performing confidence judgment on the abnormal electricity utilization analysis data set through the screened frequent item sets to obtain the large-user abnormal electricity utilization analysis data set with relevance comprises the following steps:
starting the confidence judgment from the frequent 2-item set; for a 2-element set in the frequent 2-item set, the probability that the second element of the item set occurs given that the first element occurs is the confidence of the 2-item set, and the confidence function of the 2-item set is expressed as:
[Equation image: confidence of a 2-element set]
where the two quantities in the formula are the support degree of the 2-element set in the frequent 2-item set and the support degree of its first element;
judging whether the minimum confidence threshold C_min is met, to determine the strongly correlated frequent 2-item set:
[Equation image: strongly correlated frequent 2-item set]
judging in sequence to obtain the strongly correlated frequent k-item set, and thereby obtaining the large-user abnormal electricity utilization analysis data set with relevance.
The beneficial effects of the invention are as follows: by judging the support degree and the confidence degree, the invention mines the relevance among the item sets of the large-user abnormal electricity utilization analysis data set, reduces the volume of data that the abnormal electricity utilization analysis algorithm has to process, effectively improves system analysis efficiency, and has good application prospects.
Detailed Description
The present invention is further described below.
At present, the data used to analyze abnormal users is mainly distributed across the power utilization information acquisition system, the marketing system and similar systems. To promote data sharing and improve data interaction efficiency in a unified manner, a marketing basic data platform needs to be constructed. The platform holds the power utilization acquisition data, marketing data, load control data, event data and other data related to abnormal electricity utilization analysis and electricity-use behavior diagnosis, so as to meet the sample-data requirements of a deep neural network and to comprehensively reflect characteristics of abnormal electricity-use behavior such as electric quantity and power consumption.
Example 1:
a relevance mining method for a large user abnormal electricity utilization analysis data set is used for carrying out relevance analysis and pruning processing on the large user abnormal electricity utilization analysis data set, and comprises the following steps:
Step one: scan the large-user abnormal electricity utilization analysis data set S, which is composed of n user information records:
S = {S_1, S_2, ..., S_x, ..., S_n} (57)
where x = 1, 2, ..., n, and the x-th user information S_x includes the information of several modules:
S_x = {L_id, L_cl, L_cal, C_tv(t), T_ag} (58)
where L_id is the positioning information module, L_cl is the category information module, L_cal is the calculation information module, C_tv(t) is the time-varying information module, t is time, and T_ag is the calibration information module.
The positioning information module L_id consists of several encrypted, desensitized information elements and is used for marking the various types of positioning information of the user. It comprises the items:
L_id = {n_an, n_cn, n_id, n_ea, n_lo, n_oi} (59)
where n_an is the electric meter bureau number element, n_cn is the user number element, n_id is the user name element, n_ea is the electricity-use address element, n_lo is the line number element, and n_oi is the acquisition object identity information element.
The category information module L_cl comprises the items:
L_cl = {n_csc, n_wm, n_sc, n_tc, n_on, n_etc, n_ml, n_mm} (60)
where n_csc is the user type element, n_wm is the wiring mode element, n_sc is the electric meter category element, n_tc is the industry classification element, n_on is the organization code element, n_etc is the electrical property element, n_ml is the metering point level element, and n_mm is the metering mode element.
The calculation information module L_cal comprises the items:
L_cal = {n_cc, n_mc, n_vc, n_mvc, n_mrc, n_tf, n_ctf, n_vtf} (61)
where n_cc is the contract capacity element, n_mc is the metering point capacity element, n_vc is the user voltage class element, n_mvc is the electric meter rated voltage element, n_mrc is the electric meter rated current element, n_tf is the comprehensive multiplying power element, n_ctf is the current transformer multiplying power element, and n_vtf is the voltage transformer multiplying power element.
The time-varying information module C_tv(t) comprises the items:
C_tv(t) = {I(t), U(t), P(t), E(t), L(t)} (62)
where I(t) is the current curve element, U(t) is the voltage curve element, P(t) is the power curve element, E(t) is the daily frozen electric quantity curve element, and L(t) is the line loss curve element.
The calibration information module T_ag comprises the exception category element T_type:
T_ag = {T_type} (63)
The exception category elements include: no abnormality, voltage loss, current loss, series connection, three-phase imbalance, and so on.
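As a rough illustration of the record structure in step one, the sketch below models one user information record S_x in Python. The class name `UserRecord`, the method name `discrete_items` and all sample values are assumptions made for this example and do not come from the patent. The time-varying curves are stored but left out of the flattened item list, on the assumption that they would need to be discretized before item-set mining.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UserRecord:
    """One user information record S_x with its five modules (illustrative only)."""
    positioning: Dict[str, str]   # L_id: n_an, n_cn, n_id, n_ea, n_lo, n_oi
    category: Dict[str, str]      # L_cl: n_csc, n_wm, n_sc, n_tc, n_on, n_etc, n_ml, n_mm
    calculation: Dict[str, str]   # L_cal: n_cc, n_mc, n_vc, n_mvc, n_mrc, n_tf, n_ctf, n_vtf
    time_varying: Dict[str, List[float]] = field(default_factory=dict)  # C_tv(t): I, U, P, E, L
    tag: str = "no abnormality"   # T_ag = {T_type}

    def discrete_items(self) -> Dict[str, str]:
        """Flatten the discrete elements into one element-name -> value mapping."""
        flat: Dict[str, str] = {}
        flat.update(self.positioning)
        flat.update(self.category)
        flat.update(self.calculation)
        flat["T_type"] = self.tag
        return flat


# Hypothetical sample record; every value here is made up for the example.
record = UserRecord(
    positioning={"n_cn": "U001"},
    category={"n_wm": "three-phase four-wire"},
    calculation={"n_ctf": "50/5"},
    time_varying={"I": [1.2, 1.1, 0.0]},
    tag="current loss",
)
print(record.discrete_items())
```

The flattened mapping is the form the later sketches assume when they count item occurrences per user.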
Step two: perform support degree judgment on the abnormal electricity utilization analysis data set to obtain a plurality of screened frequent item sets.
From all the items contained in the positioning information module L_id, the category information module L_cl, the calculation information module L_cal, the time-varying information module C_tv(t) and the calibration information module T_ag of every user in the abnormal electricity utilization analysis data set S, extract the candidate 1-item sets and calculate their support degree:
[Equation image: support degree of an element value in a candidate 1-item set]
where the item is the m-th element of the x-th user information in S, and the maximum value of m is the total number of element types across the five information modules. A candidate 1-item set is a set of items containing only one element, for example the current transformer multiplying power element set {10/5, 50/5, 100/5, 50/5, 100/5}. The denominator is the total number of occurrences of all values of the m-th information element in S (5 for the current transformer multiplying power element in this example), and the numerator is the number of times a given value occurs in the candidate 1-item set; for a current transformer multiplying power of 50/5 the support degree is therefore 2/5.
The element values in the candidate 1-item set whose support degree is greater than the minimum frequency threshold S_min are retained as the frequent 1-item set, which filters out meaningless item-set rules:
[Equation image: frequent 1-item set]
where S_min is the minimum frequency threshold.
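The 1-item support calculation can be sketched as follows, using the current transformer example from the text. The function name `frequent_1_itemset` and the threshold value 1/5 are assumptions for illustration; the patent does not give a value for S_min.

```python
from collections import Counter
from fractions import Fraction


def frequent_1_itemset(values, s_min):
    """Return {value: support} for the values whose support exceeds s_min."""
    counts = Counter(values)   # occurrences of each element value
    total = len(values)        # total occurrences of all values of this element
    support = {v: Fraction(c, total) for v, c in counts.items()}
    return {v: s for v, s in support.items() if s > s_min}


# Current transformer multiplying-power values taken from the example in the text.
ct_values = ["10/5", "50/5", "100/5", "50/5", "100/5"]
S_MIN = Fraction(1, 5)         # assumed threshold, not specified by the patent
frequent_1 = frequent_1_itemset(ct_values, S_MIN)
print(frequent_1)
```

With this assumed threshold, 50/5 and 100/5 (support 2/5 each) are kept and 10/5 (support 1/5) is dropped, because the comparison is strictly "greater than".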
Scan the large-user abnormal electricity utilization analysis data set S, extract the candidate 2-item sets and calculate their support degree:
[Equation image: support degree of a candidate 2-item set]
A candidate 2-item set is the set of pairs formed from the m-th element and the l-th element of the x-th user information in S, and the count in the formula is the number of times the two element values occur simultaneously in the candidate 2-item set. For example, if S_m is the current transformer multiplying power element set {10/5, 50/5, 100/5, 50/5, 100/5}, S_l is the exception category element set {no abnormality, current loss, voltage loss}, and {50/5, current loss} occurs 2 times in the candidate 2-item set, then its support degree is 2/5.
To screen the frequent 2-item set, the following must first be satisfied:
[Equation image: support degree greater than S_min]
and at the same time:
[Equation image: condition on the frequent 1-item set elements]
The 2-item sets that do not contain frequent 1-item set elements are excluded through formula (12). For example, the frequent 2-item set after screening is {{50/5, current loss}, {50/5, current loss}}.
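The two screening conditions for 2-item sets (support above S_min, and every element already in a frequent 1-item set) can be sketched as below. The per-user exception tags, the dictionary keys and the threshold are assumptions chosen so that the pair {50/5, current loss} occurs twice, as in the text; they are not the patent's data.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def support_counts(records, size):
    """Count how often each item set of the given size occurs across records."""
    counts = Counter()
    for rec in records:
        items = sorted(rec.items())          # (element name, value) pairs
        for combo in combinations(items, size):
            counts[frozenset(combo)] += 1
    return counts


def frequent_2_itemsets(records, s_min):
    """Keep pairs whose support exceeds s_min and whose items are all frequent."""
    n = len(records)
    frequent_1 = {s for s, c in support_counts(records, 1).items()
                  if Fraction(c, n) > s_min}
    result = {}
    for pair, c in support_counts(records, 2).items():
        sup = Fraction(c, n)
        # Condition 1: support threshold. Condition 2: pruning by frequent 1-item sets.
        if sup > s_min and all(frozenset([item]) in frequent_1 for item in pair):
            result[pair] = sup
    return result


# Hypothetical five-user data set; the tag assignment is assumed for illustration.
ratios = ["10/5", "50/5", "100/5", "50/5", "100/5"]
tags = ["no abnormality", "current loss", "voltage loss", "current loss", "voltage loss"]
records = [{"n_ctf": r, "T_type": t} for r, t in zip(ratios, tags)]
frequent_2 = frequent_2_itemsets(records, Fraction(1, 5))
for pair, sup in frequent_2.items():
    print(sorted(pair), sup)
```

Items are stored as (element name, value) tuples so that identical values belonging to different elements are not confused.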
Scan the large-user abnormal electricity utilization analysis data set S, extract the candidate 3-item sets and calculate their support degree:
[Equation image: support degree of a candidate 3-item set]
To screen the frequent 3-item set, the following must first be satisfied:
[Equation image: support degree greater than S_min]
and at the same time:
[Equation image: condition on the frequent lower-order item set elements]
Following the same steps, calculate the frequent 4-item set, the frequent 5-item set, and so on in sequence up to the frequent k-item set, at which point the (k+1)-item set no longer satisfies the support threshold condition.
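The level-by-level procedure above matches the shape of the well-known Apriori algorithm, and a minimal sketch in that style is given below. It is not claimed to be the patent's exact procedure: the candidate join step, the function name `levelwise_frequent_itemsets`, the wiring-mode values and the threshold are all assumptions for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def levelwise_frequent_itemsets(records, s_min):
    """Return {k: {itemset: support}} for each level that has frequent item sets."""
    n = len(records)
    transactions = [frozenset(rec.items()) for rec in records]
    # Level-1 candidates: every single (element name, value) item.
    candidates = {frozenset([item]) for t in transactions for item in t}
    levels = {}
    k = 1
    while candidates:
        counts = Counter()
        for t in transactions:
            for c in candidates:
                if c <= t:                      # candidate occurs in this record
                    counts[c] += 1
        frequent = {c: Fraction(cnt, n) for c, cnt in counts.items()
                    if Fraction(cnt, n) > s_min}
        if not frequent:                        # level k fails the threshold: stop
            break
        levels[k] = frequent
        k += 1
        # Build size-k candidates whose (k-1)-subsets are all frequent.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in frequent for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return levels


# Hypothetical data: tags and wiring modes are made up for the example.
rows = [
    ("10/5", "no abnormality", "three-wire"),
    ("50/5", "current loss", "four-wire"),
    ("100/5", "voltage loss", "three-wire"),
    ("50/5", "current loss", "four-wire"),
    ("100/5", "voltage loss", "four-wire"),
]
records = [{"n_ctf": r, "T_type": t, "n_wm": w} for r, t, w in rows]
levels = levelwise_frequent_itemsets(records, Fraction(1, 5))
for size, itemsets in levels.items():
    print(size, len(itemsets))
```

The loop ends either when a level has no item set above the threshold or when no larger candidate can be formed, which corresponds to the stopping condition that the (k+1)-item set fails the support threshold.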
Step three: and performing confidence judgment on the abnormal electricity utilization analysis data set through the screened frequent item sets to obtain an abnormal electricity utilization analysis data set with relevance.
And judging the probability of the simultaneous occurrence of the elements of the screened frequent item sets according to the support degree of the abnormal electricity utilization analysis data set, and measuring the accuracy of the mining information of the item sets.
Confidence judgment starts from the frequent 2-item set
Figure BDA00027562229600000914
For the frequent 2-item set
Figure BDA00027562229600000915
take one of its 2-item sets
Figure BDA00027562229600000916
When the first element is
Figure BDA00027562229600000917
the probability that the second element is
Figure BDA00027562229600000918
is the confidence of this 2-item set, and the confidence function of the 2-item set can be expressed as:
Figure BDA00027562229600000919
where
Figure BDA00027562229600000920
denotes, within the frequent 2-item set
Figure BDA00027562229600000921
for the 2-item set
Figure BDA0002756222960000101
its support;
Figure BDA0002756222960000102
denotes, for the first element
Figure BDA0002756222960000103
its support. For example, for
Figure BDA0002756222960000104
Figure BDA0002756222960000105
Figure BDA0002756222960000106
Figure BDA0002756222960000107
Judge whether the minimum confidence threshold $C_{min}$ is met, and determine the strongly correlated frequent 2-item set
Figure BDA0002756222960000108
Figure BDA0002756222960000109
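The 2-item confidence judgment can be sketched as follows, reading the text as: confidence equals the support of the 2-item set divided by the support of its first element. The support values, the threshold `c_min = 0.8`, the use of `>=` for "meets the threshold", and the function name are all assumptions made for illustration.

```python
def strong_2_itemsets(frequent_2, support_1, c_min):
    """Return {(first, second): confidence} for pairs whose confidence meets c_min."""
    strong = {}
    for (first, second), pair_support in frequent_2.items():
        # Confidence: probability of `second` given that the first element is `first`.
        conf = pair_support / support_1[first]
        if conf >= c_min:
            strong[(first, second)] = conf
    return strong


# Hypothetical supports (assumption, not values from the source).
frequent_2 = {("50/5", "current loss"): 0.4, ("100/5", "no abnormality"): 0.2}
support_1 = {"50/5": 0.4, "100/5": 0.4}

strong_2 = strong_2_itemsets(frequent_2, support_1, c_min=0.8)
print(strong_2)
```

Here ("50/5", "current loss") has confidence 0.4 / 0.4 = 1.0 and is kept, while ("100/5", "no abnormality") has confidence 0.5 and is dropped.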
For the frequent 3-item set
Figure BDA00027562229600001010
take one of its 3-item sets
Figure BDA00027562229600001011
When the first element is
Figure BDA00027562229600001012
the probability that the second and third elements are simultaneously
Figure BDA00027562229600001013
is the confidence of this 3-item set, and the confidence function of the frequent 3-item set can be expressed as:
Figure BDA00027562229600001014
where
Figure BDA00027562229600001015
denotes, within the frequent 3-item set
Figure BDA00027562229600001016
for the 3-item set
Figure BDA00027562229600001017
its support;
Figure BDA00027562229600001018
denotes, for the element
Figure BDA00027562229600001019
its support. Judge whether the minimum confidence threshold $C_{min}$ is met, and determine the strongly correlated frequent 3-item set
Figure BDA00027562229600001020
Figure BDA00027562229600001021
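A minimal sketch of the same confidence idea for 3-item (or larger) sets, assuming the confidence is the conditional probability of the remaining elements given the first element. Because both supports share the same denominator n, the ratio of raw counts is used directly. The records, field keys, and function name are hypothetical.

```python
def confidence(records, itemset):
    """itemset: ordered sequence of (field, value) pairs; the first pair is the condition."""
    def contains(record, item):
        name, value = item
        return record.get(name) == value

    n_first = sum(contains(r, itemset[0]) for r in records)
    n_all = sum(all(contains(r, i) for i in itemset) for r in records)
    # count(all elements) / count(first element) equals support ratio, since n cancels.
    return n_all / n_first if n_first else 0.0


# Hypothetical user records (assumption, not data from the source).
records = [
    {"n_ctf": "10/5", "T_type": "no abnormality", "n_wm": "three-phase three-wire"},
    {"n_ctf": "50/5", "T_type": "current loss", "n_wm": "three-phase four-wire"},
    {"n_ctf": "100/5", "T_type": "no abnormality", "n_wm": "three-phase three-wire"},
    {"n_ctf": "50/5", "T_type": "current loss", "n_wm": "three-phase four-wire"},
    {"n_ctf": "100/5", "T_type": "voltage loss", "n_wm": "three-phase four-wire"},
]

conf_ratio_first = confidence(
    records,
    [("n_ctf", "50/5"), ("T_type", "current loss"), ("n_wm", "three-phase four-wire")],
)
conf_wiring_first = confidence(
    records,
    [("n_wm", "three-phase four-wire"), ("n_ctf", "50/5"), ("T_type", "current loss")],
)
print(conf_ratio_first, conf_wiring_first)
```

The two calls use the same three elements in a different order, which shows that under this reading the confidence depends on which element is treated as the first one.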
Following the same steps, determine the strongly correlated frequent 4-item set
Figure BDA00027562229600001022
the strongly correlated frequent 5-item set
Figure BDA00027562229600001023
and so on in sequence up to the strongly correlated frequent k-item set
Figure BDA00027562229600001024
In the strongly correlated frequent 2-item set obtained by this method
Figure BDA00027562229600001025
the strongly correlated frequent 3-item set
Figure BDA00027562229600001026
and the strongly correlated frequent k-item set
Figure BDA00027562229600001027
the elements of each item set are strongly correlated, and data items that are meaningless for abnormal electricity utilization analysis have been removed, so these item sets can serve as basic data for abnormal electricity utilization analysis.
In conclusion, the invention constructs a relevance mining method for the large-user abnormal electricity utilization analysis data set. The method mines the relevance among the internal elements of the data set through support and confidence judgment and removes meaningless association rules, which effectively reduces the computation required by subsequent data analysis algorithms and gives the method good application prospects.
Example 2:
A relevance mining device for a large-user abnormal electricity utilization analysis data set, comprising:
a frequent item set screening module, configured to perform support judgment on the large-user abnormal electricity utilization analysis data set to obtain a plurality of screened frequent item sets;
and a relevance mining module, configured to perform confidence judgment on the abnormal electricity utilization analysis data set using the screened frequent item sets to obtain a large-user abnormal electricity utilization analysis data set with relevance.
Further, the large-user abnormal electricity utilization analysis data set S consists of n pieces of user information:
$S = \{S_1, S_2, \ldots, S_x, \ldots, S_n\}$ (76)
where x = 1, 2, …, n, and the x-th piece of user information $S_x$ is:
$S_x = \{L_{id}, L_{cl}, L_{cal}, C_{tv}(t), T_{ag}\}$ (77)
where $L_{id}$ is the positioning information module, $L_{cl}$ is the category information module, $L_{cal}$ is the calculation information module, $C_{tv}(t)$ is the time-varying information module, t is time, and $T_{ag}$ is the calibration information module.
Further, the positioning information module $L_{id}$ marks the various kinds of positioning information of the user and contains the items:
$L_{id} = \{n_{an}, n_{cn}, n_{id}, n_{ea}, n_{lo}, n_{oi}\}$ (78)
where $n_{an}$ is the meter bureau number element, $n_{cn}$ is the user number element, $n_{id}$ is the user name element, $n_{ea}$ is the electricity address element, $n_{lo}$ is the line number element, and $n_{oi}$ is the identity information element of the collected object;
the category information module $L_{cl}$ contains the items:
$L_{cl} = \{n_{csc}, n_{wm}, n_{sc}, n_{tc}, n_{on}, n_{etc}, n_{ml}, n_{mm}\}$ (79)
where $n_{csc}$ is the user type element, $n_{wm}$ is the wiring mode element, $n_{sc}$ is the meter category element, $n_{tc}$ is the industry classification element, $n_{on}$ is the organization code element, $n_{etc}$ is the electricity property element, $n_{ml}$ is the metering point level element, and $n_{mm}$ is the metering mode element;
the calculation information module $L_{cal}$ contains the items:
$L_{cal} = \{n_{cc}, n_{mc}, n_{vc}, n_{mvc}, n_{mrc}, n_{tf}, n_{ctf}, n_{vtf}\}$ (80)
where $n_{cc}$ is the contract capacity element, $n_{mc}$ is the metering point capacity element, $n_{vc}$ is the user voltage class element, $n_{mvc}$ is the meter rated voltage element, $n_{mrc}$ is the meter rated current element, $n_{tf}$ is the comprehensive ratio element, $n_{ctf}$ is the current transformer ratio element, and $n_{vtf}$ is the voltage transformer ratio element;
the time-varying information module $C_{tv}(t)$ contains the items:
$C_{tv}(t) = \{I(t), U(t), P(t), E(t), L(t)\}$ (81)
where I(t) is the current curve element, U(t) is the voltage curve element, P(t) is the power curve element, E(t) is the daily frozen electric quantity curve element, and L(t) is the line loss curve element;
the calibration information module $T_{ag}$ contains the abnormal category element $T_{type}$:
$T_{ag} = \{T_{type}\}$ (82)
The abnormal category elements include: no abnormality, voltage loss, current loss, series connection, and three-phase unbalance.
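One possible in-memory layout for a single piece of user information $S_x$ is sketched below. The class and attribute names are hypothetical, and the choice to flatten only the category, calculation, and calibration modules into discrete items for item-set mining is an assumption of this sketch, not something stated in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TimeVaryingInfo:
    """C_tv(t): sampled curves, one value per sampling instant."""
    current: List[float] = field(default_factory=list)              # I(t)
    voltage: List[float] = field(default_factory=list)              # U(t)
    power: List[float] = field(default_factory=list)                # P(t)
    daily_frozen_energy: List[float] = field(default_factory=list)  # E(t)
    line_loss: List[float] = field(default_factory=list)            # L(t)


@dataclass
class UserRecord:
    """S_x = {L_id, L_cl, L_cal, C_tv(t), T_ag}."""
    positioning: Dict[str, str]    # L_id, e.g. {"n_cn": ...}
    category: Dict[str, str]       # L_cl, e.g. {"n_wm": ...}
    calculation: Dict[str, str]    # L_cal, e.g. {"n_ctf": ...}
    time_varying: TimeVaryingInfo  # C_tv(t)
    anomaly_type: str              # T_type, the only element of T_ag

    def discrete_items(self) -> List[Tuple[str, str]]:
        """Flatten the discrete modules into (element, value) items (sketch assumption)."""
        items = list(self.category.items()) + list(self.calculation.items())
        items.append(("T_type", self.anomaly_type))
        return items


# Hypothetical example record (all values invented for illustration).
record = UserRecord(
    positioning={"n_cn": "U0001"},
    category={"n_wm": "three-phase four-wire"},
    calculation={"n_ctf": "50/5"},
    time_varying=TimeVaryingInfo(current=[1.2, 1.1, 0.0]),
    anomaly_type="current loss",
)
print(record.discrete_items())
```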
Further, performing support judgment on the large-user abnormal electricity utilization analysis data set to obtain a plurality of screened frequent item sets comprises:
extracting the candidate 1-item set from the large-user abnormal electricity utilization analysis data set S
Figure BDA0002756222960000121
and calculating its support:
Figure BDA0002756222960000122
where the candidate 1-item set
Figure BDA0002756222960000123
denotes an item set containing only 1 element
Figure BDA0002756222960000124
Figure BDA0002756222960000125
is the m-th element of the x-th piece of user information in S;
Figure BDA0002756222960000126
is the total number of occurrences of all values of the m-th information element in the abnormal electricity utilization analysis data set S;
Figure BDA0002756222960000127
denotes, for
Figure BDA0002756222960000128
in the candidate 1-item set
Figure BDA0002756222960000129
its number of occurrences;
in the candidate 1-item set
Figure BDA00027562229600001210
the elements whose support is greater than the minimum frequency threshold $S_{min}$, namely the corresponding
Figure BDA00027562229600001211
are retained to obtain the frequent 1-item set
Figure BDA00027562229600001212
Figure BDA00027562229600001213
extracting the candidate 2-item set
Figure BDA00027562229600001214
and calculating its support:
Figure BDA00027562229600001215
where the candidate 2-item set
Figure BDA00027562229600001216
denotes the set
Figure BDA00027562229600001217
Figure BDA00027562229600001218
is the l-th element of the x-th piece of user information in S;
Figure BDA00027562229600001219
denotes, for
Figure BDA00027562229600001220
and
Figure BDA00027562229600001221
in the candidate 2-item set
Figure BDA00027562229600001222
the number of times they occur together; the screened frequent 2-item set
Figure BDA00027562229600001223
must satisfy:
Figure BDA00027562229600001224
and at the same time satisfy:
Figure BDA00027562229600001225
Figure BDA00027562229600001226
denotes the candidate 2-item sets in the 2-item set
Figure BDA00027562229600001227
whose support is greater than the minimum frequency threshold $S_{min}$, and formula (12) excludes 2-item sets containing elements that do not belong to the frequent 1-item set;
calculating in sequence to obtain the frequent k-item set
Figure BDA00027562229600001228
where no (k+1)-item set satisfies the support threshold condition.
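The first stage of the support judgment described above (candidate 1-item set to frequent 1-item set) can be sketched over whole user records as follows. The records, the element keys `n_ctf` and `T_type` used as dictionary keys, the threshold value, and the function name are assumptions for illustration only.

```python
from collections import Counter


def frequent_1_itemsets(records, s_min):
    """Support of every (element, value) item; keep those strictly above s_min."""
    n = len(records)
    counts = Counter((name, value) for r in records for name, value in r.items())
    return {item: c / n for item, c in counts.items() if c / n > s_min}


# Hypothetical user records (assumption, not data from the source).
records = [
    {"n_ctf": "10/5", "T_type": "no abnormality"},
    {"n_ctf": "50/5", "T_type": "current loss"},
    {"n_ctf": "100/5", "T_type": "no abnormality"},
    {"n_ctf": "50/5", "T_type": "current loss"},
    {"n_ctf": "100/5", "T_type": "voltage loss"},
]
frequent_1 = frequent_1_itemsets(records, s_min=0.3)
print(sorted(frequent_1))
```

Values that occur only once in five records (support 1/5) fall below the assumed threshold and are dropped, so they cannot appear in any later 2-item set.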
Further, performing confidence judgment on the abnormal electricity utilization analysis data set using the screened frequent item sets to obtain a large-user abnormal electricity utilization analysis data set with relevance comprises:
starting the confidence judgment from the frequent 2-item set
Figure BDA00027562229600001229
For the frequent 2-item set
Figure BDA00027562229600001230
take one of its 2-item sets
Figure BDA0002756222960000131
When the first element is
Figure BDA0002756222960000132
the probability that the second element is
Figure BDA0002756222960000133
is the confidence of this 2-item set, and the confidence function of the 2-item set is expressed as:
Figure BDA0002756222960000134
where
Figure BDA0002756222960000135
denotes, within the frequent 2-item set
Figure BDA0002756222960000136
for the 2-item set
Figure BDA0002756222960000137
its support;
Figure BDA0002756222960000138
denotes, for the first element
Figure BDA0002756222960000139
its support;
judging whether the minimum confidence threshold $C_{min}$ is met, and determining the strongly correlated frequent 2-item set
Figure BDA00027562229600001310
Figure BDA00027562229600001311
judging in sequence to obtain the strongly correlated frequent k-item set
Figure BDA00027562229600001312
and thereby obtaining a large-user abnormal electricity utilization analysis data set with relevance.
The foregoing illustrates and describes the principles, general features, and characteristics of the present invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the specification and illustrated only to illustrate the principle of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the present invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (10)

1. A relevance mining method for a large-user abnormal electricity utilization analysis data set is characterized by comprising the following steps:
carrying out support degree judgment on the abnormal electricity utilization analysis data set of the large user to obtain a plurality of screened frequent item sets;
and performing confidence judgment on the abnormal electricity utilization analysis data set through the screened frequent item sets to obtain a large-user abnormal electricity utilization analysis data set with relevance.
2. The relevance mining method for the large-user abnormal electricity utilization analysis data set according to claim 1, characterized in that: the large-user abnormal electricity utilization analysis data set S consists of n pieces of user information:
$S = \{S_1, S_2, \ldots, S_x, \ldots, S_n\}$ (1)
where x = 1, 2, …, n, and the x-th piece of user information $S_x$ is:
$S_x = \{L_{id}, L_{cl}, L_{cal}, C_{tv}(t), T_{ag}\}$ (2)
where $L_{id}$ is the positioning information module, $L_{cl}$ is the category information module, $L_{cal}$ is the calculation information module, $C_{tv}(t)$ is the time-varying information module, t is time, and $T_{ag}$ is the calibration information module.
3. The relevance mining method for the large-user abnormal electricity utilization analysis data set according to claim 2, characterized in that: the positioning information module $L_{id}$ marks the various kinds of positioning information of the user and contains the items:
$L_{id} = \{n_{an}, n_{cn}, n_{id}, n_{ea}, n_{lo}, n_{oi}\}$ (3)
where $n_{an}$ is the meter bureau number element, $n_{cn}$ is the user number element, $n_{id}$ is the user name element, $n_{ea}$ is the electricity address element, $n_{lo}$ is the line number element, and $n_{oi}$ is the identity information element of the collected object;
the category information module $L_{cl}$ contains the items:
$L_{cl} = \{n_{csc}, n_{wm}, n_{sc}, n_{tc}, n_{on}, n_{etc}, n_{ml}, n_{mm}\}$ (4)
where $n_{csc}$ is the user type element, $n_{wm}$ is the wiring mode element, $n_{sc}$ is the meter category element, $n_{tc}$ is the industry classification element, $n_{on}$ is the organization code element, $n_{etc}$ is the electricity property element, $n_{ml}$ is the metering point level element, and $n_{mm}$ is the metering mode element;
the calculation information module $L_{cal}$ contains the items:
$L_{cal} = \{n_{cc}, n_{mc}, n_{vc}, n_{mvc}, n_{mrc}, n_{tf}, n_{ctf}, n_{vtf}\}$ (5)
where $n_{cc}$ is the contract capacity element, $n_{mc}$ is the metering point capacity element, $n_{vc}$ is the user voltage class element, $n_{mvc}$ is the meter rated voltage element, $n_{mrc}$ is the meter rated current element, $n_{tf}$ is the comprehensive ratio element, $n_{ctf}$ is the current transformer ratio element, and $n_{vtf}$ is the voltage transformer ratio element;
the time-varying information module $C_{tv}(t)$ contains the items:
$C_{tv}(t) = \{I(t), U(t), P(t), E(t), L(t)\}$ (6)
where I(t) is the current curve element, U(t) is the voltage curve element, P(t) is the power curve element, E(t) is the daily frozen electric quantity curve element, and L(t) is the line loss curve element;
the calibration information module $T_{ag}$ contains the abnormal category element $T_{type}$:
$T_{ag} = \{T_{type}\}$ (7)
The abnormal category elements include: no abnormality, voltage loss, current loss, series connection, and three-phase unbalance.
4. The relevance mining method for the large-user abnormal electricity utilization analysis data set according to claim 2, characterized in that: performing support judgment on the large-user abnormal electricity utilization analysis data set to obtain a plurality of screened frequent item sets comprises the following steps:
extracting the candidate 1-item set from the large-user abnormal electricity utilization analysis data set S
Figure FDA0002756222950000021
and calculating its support:
Figure FDA0002756222950000022
where the candidate 1-item set
Figure FDA0002756222950000023
denotes an item set containing only 1 element
Figure FDA0002756222950000024
Figure FDA0002756222950000025
is the m-th element of the x-th piece of user information in S;
Figure FDA0002756222950000026
is the total number of occurrences of all values of the m-th information element in the abnormal electricity utilization analysis data set S;
Figure FDA0002756222950000027
denotes, for
Figure FDA0002756222950000028
in the candidate 1-item set
Figure FDA0002756222950000029
its number of occurrences;
in the candidate 1-item set
Figure FDA00027562229500000210
the elements whose support is greater than the minimum frequency threshold $S_{min}$, namely the corresponding
Figure FDA00027562229500000211
are retained to obtain the frequent 1-item set
Figure FDA00027562229500000212
Figure FDA00027562229500000213
extracting the candidate 2-item set
Figure FDA00027562229500000214
and calculating its support:
Figure FDA00027562229500000215
where the candidate 2-item set
Figure FDA00027562229500000216
denotes the set
Figure FDA00027562229500000217
Figure FDA00027562229500000218
is the l-th element of the x-th piece of user information in S;
Figure FDA00027562229500000219
denotes, for
Figure FDA00027562229500000220
and
Figure FDA00027562229500000221
in the candidate 2-item set
Figure FDA00027562229500000222
the number of times they occur together; the screened frequent 2-item set
Figure FDA00027562229500000223
must satisfy:
Figure FDA00027562229500000224
and at the same time satisfy:
Figure FDA0002756222950000031
Figure FDA0002756222950000032
denotes the candidate 2-item sets in the 2-item set
Figure FDA0002756222950000033
whose support is greater than the minimum frequency threshold $S_{min}$, and formula (12) excludes 2-item sets containing elements that do not belong to the frequent 1-item set;
calculating in sequence to obtain the frequent k-item set
Figure FDA0002756222950000034
where no (k+1)-item set satisfies the support threshold condition.
5. The relevance mining method for the large-user abnormal electricity utilization analysis data set according to claim 1, characterized in that: performing confidence judgment on the abnormal electricity utilization analysis data set using the screened frequent item sets to obtain a large-user abnormal electricity utilization analysis data set with relevance comprises the following steps:
starting the confidence judgment from the frequent 2-item set
Figure FDA0002756222950000035
For the frequent 2-item set
Figure FDA0002756222950000036
take one of its 2-item sets
Figure FDA0002756222950000037
When the first element is
Figure FDA0002756222950000038
the probability that the second element is
Figure FDA0002756222950000039
is the confidence of this 2-item set, and the confidence function of the 2-item set is expressed as:
Figure FDA00027562229500000310
where
Figure FDA00027562229500000311
denotes, within the frequent 2-item set
Figure FDA00027562229500000312
for the 2-item set
Figure FDA00027562229500000313
its support;
Figure FDA00027562229500000314
denotes, for the first element
Figure FDA00027562229500000315
its support;
judging whether the minimum confidence threshold $C_{min}$ is met, and determining the strongly correlated frequent 2-item set
Figure FDA00027562229500000316
Figure FDA00027562229500000317
judging in sequence to obtain the strongly correlated frequent k-item set
Figure FDA00027562229500000318
and thereby obtaining a large-user abnormal electricity utilization analysis data set with relevance.
6. A relevance mining device for a large-user abnormal electricity utilization analysis data set, characterized by comprising:
a frequent item set screening module, configured to perform support judgment on the large-user abnormal electricity utilization analysis data set to obtain a plurality of screened frequent item sets;
and a relevance mining module, configured to perform confidence judgment on the abnormal electricity utilization analysis data set using the screened frequent item sets to obtain a large-user abnormal electricity utilization analysis data set with relevance.
7. The relevance mining device for the large-user abnormal electricity utilization analysis data set according to claim 6, characterized in that: the large-user abnormal electricity utilization analysis data set S consists of n pieces of user information:
$S = \{S_1, S_2, \ldots, S_x, \ldots, S_n\}$ (15)
where x = 1, 2, …, n, and the x-th piece of user information $S_x$ is:
$S_x = \{L_{id}, L_{cl}, L_{cal}, C_{tv}(t), T_{ag}\}$ (16)
where $L_{id}$ is the positioning information module, $L_{cl}$ is the category information module, $L_{cal}$ is the calculation information module, $C_{tv}(t)$ is the time-varying information module, t is time, and $T_{ag}$ is the calibration information module.
8. The relevance mining device for the large-user abnormal electricity utilization analysis data set according to claim 7, characterized in that: the positioning information module $L_{id}$ marks the various kinds of positioning information of the user and contains the items:
$L_{id} = \{n_{an}, n_{cn}, n_{id}, n_{ea}, n_{lo}, n_{oi}\}$ (17)
where $n_{an}$ is the meter bureau number element, $n_{cn}$ is the user number element, $n_{id}$ is the user name element, $n_{ea}$ is the electricity address element, $n_{lo}$ is the line number element, and $n_{oi}$ is the identity information element of the collected object;
the category information module $L_{cl}$ contains the items:
$L_{cl} = \{n_{csc}, n_{wm}, n_{sc}, n_{tc}, n_{on}, n_{etc}, n_{ml}, n_{mm}\}$ (18)
where $n_{csc}$ is the user type element, $n_{wm}$ is the wiring mode element, $n_{sc}$ is the meter category element, $n_{tc}$ is the industry classification element, $n_{on}$ is the organization code element, $n_{etc}$ is the electricity property element, $n_{ml}$ is the metering point level element, and $n_{mm}$ is the metering mode element;
the calculation information module $L_{cal}$ contains the items:
$L_{cal} = \{n_{cc}, n_{mc}, n_{vc}, n_{mvc}, n_{mrc}, n_{tf}, n_{ctf}, n_{vtf}\}$ (19)
where $n_{cc}$ is the contract capacity element, $n_{mc}$ is the metering point capacity element, $n_{vc}$ is the user voltage class element, $n_{mvc}$ is the meter rated voltage element, $n_{mrc}$ is the meter rated current element, $n_{tf}$ is the comprehensive ratio element, $n_{ctf}$ is the current transformer ratio element, and $n_{vtf}$ is the voltage transformer ratio element;
the time-varying information module $C_{tv}(t)$ contains the items:
$C_{tv}(t) = \{I(t), U(t), P(t), E(t), L(t)\}$ (20)
where I(t) is the current curve element, U(t) is the voltage curve element, P(t) is the power curve element, E(t) is the daily frozen electric quantity curve element, and L(t) is the line loss curve element;
the calibration information module $T_{ag}$ contains the abnormal category element $T_{type}$:
$T_{ag} = \{T_{type}\}$ (21)
The abnormal category elements include: no abnormality, voltage loss, current loss, series connection, and three-phase unbalance.
9. The relevance mining device for the large-user abnormal electricity utilization analysis data set according to claim 7, characterized in that: performing support judgment on the large-user abnormal electricity utilization analysis data set to obtain a plurality of screened frequent item sets comprises:
extracting the candidate 1-item set from the large-user abnormal electricity utilization analysis data set S
Figure FDA0002756222950000051
and calculating its support:
Figure FDA0002756222950000052
where the candidate 1-item set
Figure FDA0002756222950000053
denotes an item set containing only 1 element
Figure FDA0002756222950000054
Figure FDA0002756222950000055
is the m-th element of the x-th piece of user information in S;
Figure FDA0002756222950000056
is the total number of occurrences of all values of the m-th information element in the abnormal electricity utilization analysis data set S;
Figure FDA0002756222950000057
denotes, for
Figure FDA0002756222950000058
in the candidate 1-item set
Figure FDA0002756222950000059
its number of occurrences;
in the candidate 1-item set
Figure FDA00027562229500000510
the elements whose support is greater than the minimum frequency threshold $S_{min}$, namely the corresponding
Figure FDA00027562229500000511
are retained to obtain the frequent 1-item set
Figure FDA00027562229500000512
Figure FDA00027562229500000513
extracting the candidate 2-item set
Figure FDA00027562229500000514
and calculating its support:
Figure FDA00027562229500000515
where the candidate 2-item set
Figure FDA00027562229500000516
denotes the set
Figure FDA00027562229500000517
Figure FDA00027562229500000518
is the l-th element of the x-th piece of user information in S;
Figure FDA00027562229500000519
denotes, for
Figure FDA00027562229500000520
and
Figure FDA00027562229500000521
in the candidate 2-item set
Figure FDA00027562229500000522
the number of times they occur together; the screened frequent 2-item set
Figure FDA00027562229500000523
must satisfy:
Figure FDA00027562229500000524
and at the same time satisfy:
Figure FDA00027562229500000525
Figure FDA00027562229500000526
denotes the candidate 2-item sets in the 2-item set
Figure FDA00027562229500000527
whose support is greater than the minimum frequency threshold $S_{min}$, and formula (12) excludes 2-item sets containing elements that do not belong to the frequent 1-item set;
calculating in sequence to obtain the frequent k-item set
Figure FDA00027562229500000528
where no (k+1)-item set satisfies the support threshold condition.
10. The relevance mining device for the large-user abnormal electricity utilization analysis data set according to claim 6, characterized in that: performing confidence judgment on the abnormal electricity utilization analysis data set using the screened frequent item sets to obtain a large-user abnormal electricity utilization analysis data set with relevance comprises:
starting the confidence judgment from the frequent 2-item set
Figure FDA00027562229500000529
For the frequent 2-item set
Figure FDA00027562229500000530
take one of its 2-item sets
Figure FDA00027562229500000531
When the first element is
Figure FDA00027562229500000532
the probability that the second element is
Figure FDA00027562229500000533
is the confidence of this 2-item set, and the confidence function of the 2-item set is expressed as:
Figure FDA0002756222950000061
where
Figure FDA0002756222950000062
denotes, within the frequent 2-item set
Figure FDA0002756222950000063
for the 2-item set
Figure FDA0002756222950000064
its support;
Figure FDA0002756222950000065
denotes, for the first element
Figure FDA0002756222950000066
its support;
judging whether the minimum confidence threshold $C_{min}$ is met, and determining the strongly correlated frequent 2-item set
Figure FDA0002756222950000067
Figure FDA0002756222950000068
judging in sequence to obtain the strongly correlated frequent k-item set
Figure FDA0002756222950000069
and thereby obtaining a large-user abnormal electricity utilization analysis data set with relevance.
CN202011203430.5A 2020-11-02 2020-11-02 Relevance mining method and device for abnormal electricity utilization analysis data set of large user Pending CN112330136A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011203430.5A CN112330136A (en) 2020-11-02 2020-11-02 Relevance mining method and device for abnormal electricity utilization analysis data set of large user

Publications (1)

Publication Number Publication Date
CN112330136A true CN112330136A (en) 2021-02-05

Family

ID=74324244

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011203430.5A Pending CN112330136A (en) 2020-11-02 2020-11-02 Relevance mining method and device for abnormal electricity utilization analysis data set of large user

Country Status (1)

Country Link
CN (1) CN112330136A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115878908A (en) * 2023-01-09 2023-03-31 华南理工大学 Social network influence maximization method and system based on graph attention mechanism

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
李智雄等: ""高速公路用电行为异常检测算法及预警模型研究"", 《交通世界》 *
段晓萌等: ""基于FP-growth算法的用电异常数据挖掘方法"", 《电子技术应用》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115878908A (en) * 2023-01-09 2023-03-31 华南理工大学 Social network influence maximization method and system based on graph attention mechanism
CN115878908B (en) * 2023-01-09 2023-06-02 华南理工大学 Social network influence maximization method and system based on graph attention mechanism

Similar Documents

Publication Publication Date Title
CN110223196B (en) Anti-electricity-stealing analysis method based on typical industry feature library and anti-electricity-stealing sample library
CN110097297B (en) Multi-dimensional electricity stealing situation intelligent sensing method, system, equipment and medium
CN110781332A (en) Electric power resident user daily load curve clustering method based on composite clustering algorithm
CN111738462B (en) Fault first-aid repair active service early warning method for electric power metering device
CN107230013B (en) Method for identifying abnormal power consumption and time positioning of distribution network users under unsupervised learning
CN110930198A (en) Electric energy substitution potential prediction method and system based on random forest, storage medium and computer equipment
CN109409444B (en) Multivariate power grid fault type discrimination method based on prior probability
CN114519514B (en) Low-voltage transformer area reasonable line loss value measuring and calculating method, system and computer equipment
CN109947815B (en) Power theft identification method based on outlier algorithm
CN114611738A (en) Load prediction method based on user electricity consumption behavior analysis
CN112184489A (en) Power consumer grouping management system and method
CN111191909A (en) Electricity stealing identification system based on data analysis of typical electricity stealing industry and historical electricity stealing sample library
CN111914942A (en) Multi-table-combined one-use energy anomaly analysis method
CN106846170B (en) Generator set trip monitoring method and monitoring device thereof
CN108596227A (en) A kind of leading influence factor method for digging of user power utilization behavior
CN112330136A (en) Relevance mining method and device for abnormal electricity utilization analysis data set of large user
Grigoras et al. Processing of smart meters data for peak load estimation of consumers
CN117277566B (en) Power grid data analysis power dispatching system and method based on big data
CN111861587A (en) System and method for analyzing residential electricity consumption behavior based on hidden Markov model and forward algorithm
CN111639792A (en) Method for intelligently adding bank ATM (automatic teller machine) money based on artificial intelligence
CN114372835B (en) Comprehensive energy service potential customer identification method, system and computer equipment
CN112924743B (en) Instrument state detection method based on current data
CN112487991B (en) High-precision load identification method and system based on characteristic self-learning
CN115147242A (en) Power grid data management system based on data mining
CN114066219A (en) Electricity stealing analysis method for intelligently identifying electricity utilization abnormal points under incidence matrix

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210205