CN108022179B - Suspected electricity larceny subject factor determination method based on chi-square test - Google Patents

Suspected electricity larceny subject factor determination method based on chi-square test

Info

Publication number
CN108022179B
Authority
CN
China
Prior art keywords
factors
correlation
chi
factor
electricity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711200339.6A
Other languages
Chinese (zh)
Other versions
CN108022179A (en)
Inventor
李思韬
阿辽沙·叶
张蕊
窦健
邵强
张海龙
王玮
卢继哲
卞晖
王帆
郑国权
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Center Of Metrology Co ltd
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
State Grid Fujian Electric Power Co Ltd
Fuzhou Power Supply Co of State Grid Fujian Electric Power Co Ltd
Original Assignee
State Grid Center Of Metrology Co ltd
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
State Grid Fujian Electric Power Co Ltd
Fuzhou Power Supply Co of State Grid Fujian Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Center Of Metrology Co ltd, State Grid Corp of China SGCC, China Electric Power Research Institute Co Ltd CEPRI, State Grid Fujian Electric Power Co Ltd, Fuzhou Power Supply Co of State Grid Fujian Electric Power Co Ltd
Priority to CN201711200339.6A
Publication of CN108022179A
Application granted
Publication of CN108022179B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Electricity, gas or water supply

Abstract

The invention discloses a method for determining suspected electricity theft subject factors based on the chi-square test. The method treats both the candidate correlation factors and the outcome to be determined as categorical variables, uses the chi-square test to verify how strongly each factor is associated with the result, and eliminates factors whose association with the result is too low. Among several highly intercorrelated factors it retains the one most strongly associated with the result, and it also considers the association of combined factors with the result. A suspected electricity theft analysis model is then built on the highly associated factors, which reduces the workload of inspectors checking electricity theft factors.

Description

Suspected electricity larceny subject factor determination method based on chi-square test
Technical Field
The invention relates to a method for constructing a suspected electricity larceny theme model, in particular to a method for determining suspected electricity larceny theme factors based on chi-square test.
Background
In recent years, electricity theft has been characterized by high-tech means, concealed processes, and frequent occurrence, which makes it difficult for power grid enterprises to identify users who steal electricity. Effective means and methods are therefore needed to help power grid enterprises identify such users accurately. Traditional methods for determining suspected electricity theft users have poor timeliness and low accuracy, and they do not match the rapid development of the current power grid.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art by providing a method that identifies the characteristic factors of electricity theft users through analysis of user behavior data, thereby improving the accuracy of electricity inspection and reducing the workload of inspectors who must check a large number of factors related to suspected electricity theft.
A method for determining suspected electricity theft subject factors based on the chi-square test comprises: (1) using the chi-square test to calculate the chi-square value between each initial factor and the result, where the calculation formula is as follows:

$$\chi^2 = \sum \frac{(A - T)^2}{T}$$

where A denotes the four values in the four-fold table of actual values and T denotes the four values in the four-fold table of theoretical values; the correlation probability is then obtained by looking up the chi-square critical value table, and factors whose correlation probability is greater than a threshold are taken as relevant factors;
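(2) Removing relevant factors that are highly correlated with one another: the correlation between relevant factors is calculated using the Pearson coefficient, where the calculation formula is as follows:

$$r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2}\,\sqrt{\sum_{i}(y_i - \bar{y})^2}}$$

where $x_i$ and $y_i$ are the values of the two factors for the i-th sample and $\bar{x}$, $\bar{y}$ are their means; if the correlation between two factors is greater than a set threshold, the one of the two with the smaller correlation probability is removed; this threshold can generally be set between 0.8 and 0.9, depending on the actual number of factors;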
(3) Determining combined relevant factors: the factors judged irrelevant are combined, up to a maximum combination threshold, the chi-square test is used to verify the degree of association between each combination of factors and the result, and a combination is added to the relevant factor queue if its correlation probability is greater than a given threshold;
(4) The relevant factors obtained in steps (2) and (3) are the suspected electricity theft subject factors.
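Step (1) can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the function names `chi_square_2x2` and `relevance_probability`, the row and column layout of the four-fold table, and the reading of the "correlation probability" as one minus the p-value at one degree of freedom (computed directly here instead of by table lookup) are all assumptions.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for the four-fold table [[a, b], [c, d]].

    Assumed layout: rows = factor present / absent,
    columns = theft / no theft. Every row and column total must be
    non-zero, otherwise a theoretical value T would be zero.
    """
    n = a + b + c + d
    actual = [a, b, c, d]
    row_totals = [a + b, a + b, c + d, c + d]
    col_totals = [a + c, b + d, a + c, b + d]
    # Theoretical value T of each cell = row total * column total / n
    theoretical = [r * k / n for r, k in zip(row_totals, col_totals)]
    return sum((A - T) ** 2 / T for A, T in zip(actual, theoretical))


def relevance_probability(chi2):
    """CDF of the chi-square distribution with 1 degree of freedom.

    Assumption: this stands in for the value read from the chi-square
    critical value table (1 - p-value).
    """
    return math.erf(math.sqrt(chi2 / 2.0))


def is_relevant(a, b, c, d, threshold=0.8):
    """A factor is kept when its probability exceeds the threshold."""
    return relevance_probability(chi_square_2x2(a, b, c, d)) > threshold
```

A factor whose four-fold table is perfectly balanced gives a chi-square value of zero and is not kept, while a factor that separates the two user groups strongly gives a large value and a probability close to 1.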
The method applies the chi-square test to the selection of factors that influence users' electricity theft behavior. The chi-square test is well suited to this task because it is designed for comparing the association between two or more samples and categorical variables. To keep the factors independent of one another, the correlation between factors is calculated using the Pearson coefficient, and factors that are highly correlated with others are removed. Because a combination of factors may be related to the result, the association between combined factors and the result is also calculated with the chi-square test, and combinations with a high correlation probability are likewise treated as relevant factors.
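The redundancy-removal idea can be sketched as below. The names `pearson` and `drop_redundant`, the dictionary-based inputs, the default threshold of 0.85, and the use of the absolute value of the coefficient are assumptions made for illustration; the result can also depend on the order in which pairs are visited, which the text does not specify.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Both sequences must have non-zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def drop_redundant(columns, probability, threshold=0.85):
    """Remove the weaker factor of every highly correlated pair.

    columns:     dict mapping factor name -> list of values
    probability: dict mapping factor name -> relevance probability
    """
    kept = list(columns)
    for a, b in combinations(list(columns), 2):
        if a not in kept or b not in kept:
            continue  # one of the two was already removed
        if abs(pearson(columns[a], columns[b])) > threshold:
            # Drop the factor with the smaller relevance probability
            kept.remove(a if probability[a] < probability[b] else b)
    return kept
```

For example, with two nearly proportional factors and one unrelated factor, only the more relevant of the proportional pair survives alongside the unrelated one.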
The threshold is preferably a correlation probability of 0.8. When the number of factors exceeding this threshold is more than 90% or less than 30% of the total number of factors, the threshold should be raised or lowered, respectively, in steps of 0.01 until the number of factors exceeding the threshold is less than 90% and more than 30% of the total.
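The threshold adjustment can be sketched as a simple loop. The function name `adjust_threshold` and its parameters are hypothetical, and the sketch assumes that the share of factors is re-evaluated against the current threshold after every step, which is one reading of the rule above.

```python
def adjust_threshold(probabilities, start=0.80, step=0.01, low=0.30, high=0.90):
    """Move the threshold in fixed steps until the share of factors
    above it lies between `low` and `high` (or the threshold would
    leave the open interval (0, 1))."""
    total = len(probabilities)

    def share(t):
        # Fraction of factors whose probability exceeds threshold t
        return sum(p > t for p in probabilities) / total

    threshold = start
    if share(threshold) > high:
        # Too many factors pass: raise the threshold
        while share(threshold) > high and threshold + step < 1.0:
            threshold = round(threshold + step, 2)
    elif share(threshold) < low:
        # Too few factors pass: lower the threshold
        while share(threshold) < low and threshold - step > 0.0:
            threshold = round(threshold - step, 2)
    return threshold
```

The direction is fixed by the first check, so the loop cannot oscillate between raising and lowering when a single step jumps across the whole 30% to 90% band.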
In summary, compared with the prior art, the invention has the following advantages:
The invention treats both the candidate correlation factors and the outcome to be determined as categorical variables, uses the chi-square test to verify the degree of association of each factor, removes factors whose association with the result is too low, retains the factor most strongly associated with the result among several highly intercorrelated factors, and also considers the association of combined factors. A suspected electricity theft analysis model based on the highly associated factors is thereby established, and the workload of inspectors checking electricity theft factors is reduced.
Drawings
FIG. 1 is a flow chart of a suspected electricity theft subject factor determination method based on chi-square test of the present invention.
Detailed Description
The present invention will be described in more detail with reference to examples.
Example 1
A method for determining suspected electricity theft subject factors based on the chi-square test comprises: (1) using the chi-square test to calculate the chi-square value between each initial factor and the result, where the calculation formula is as follows:

$$\chi^2 = \sum \frac{(A - T)^2}{T}$$

where A denotes the four values in the four-fold table of actual values and T denotes the four values in the four-fold table of theoretical values; the correlation probability is then obtained by looking up the chi-square critical value table, and factors whose correlation probability is greater than a threshold are taken as relevant factors;
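(2) Removing relevant factors that are highly correlated with one another: the correlation between relevant factors is calculated using the Pearson coefficient, where the calculation formula is as follows:

$$r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2}\,\sqrt{\sum_{i}(y_i - \bar{y})^2}}$$

where $x_i$ and $y_i$ are the values of the two factors for the i-th sample and $\bar{x}$, $\bar{y}$ are their means; if the correlation between two factors is greater than a set threshold, the one of the two with the smaller correlation probability is removed;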
(3) Determining combined relevant factors: the factors judged irrelevant are combined, up to a maximum combination threshold, the chi-square test is used to verify the degree of association between each combination of factors and the result, and a combination is added to the relevant factor queue if its correlation probability is greater than a given threshold;
(4) The relevant factors obtained in steps (2) and (3) are the suspected electricity theft subject factors.
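Step (3) can be sketched as follows for binary (0/1) factors. Several details are assumptions, since the text does not fix them: a combination is formed with a logical AND of its members, the "maximum combination threshold" is read as a maximum combination size (`max_size`), and the helper names `relevance` and `combined_factors` are hypothetical.

```python
import math
from itertools import combinations


def relevance(factor, result):
    """1-degree-of-freedom chi-square CDF value for two 0/1 columns."""
    n = len(factor)
    a = sum(1 for f, r in zip(factor, result) if f and r)
    b = sum(1 for f, r in zip(factor, result) if f and not r)
    c = sum(1 for f, r in zip(factor, result) if not f and r)
    d = n - a - b - c
    # (actual value, row total, column total) for each of the four cells
    cells = [(a, a + b, a + c), (b, a + b, b + d),
             (c, c + d, a + c), (d, c + d, b + d)]
    if any(rt == 0 or ct == 0 for _, rt, ct in cells):
        return 0.0  # degenerate table: no evidence of association
    chi2 = sum((obs - rt * ct / n) ** 2 / (rt * ct / n)
               for obs, rt, ct in cells)
    return math.erf(math.sqrt(chi2 / 2.0))


def combined_factors(unrelated, result, threshold=0.8, max_size=2):
    """Return the combinations of unrelated factors that pass the test.

    unrelated: dict mapping factor name -> list of 0/1 values
    """
    found = []
    for size in range(2, max_size + 1):
        for names in combinations(sorted(unrelated), size):
            # Assumed combination rule: all member factors are present
            joint = [int(all(vals))
                     for vals in zip(*(unrelated[name] for name in names))]
            if relevance(joint, result) > threshold:
                found.append(names)
    return found
```

Each tuple returned by `combined_factors` would then be appended to the relevant factor queue alongside the single factors that survived step (2).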
The threshold is preferably a correlation probability of 0.8. When the number of factors exceeding this threshold is more than 90% or less than 30% of the total number of factors, the threshold should be raised or lowered, respectively, in steps of 0.01 until the number of factors exceeding the threshold is less than 90% and more than 30% of the total.
The correlation threshold used in step (2) can generally be set between 0.8 and 0.9, depending on the actual number of factors.
As shown in FIG. 1, initial candidate factors are selected first. The chi-square test is then used to calculate the association between each factor and the result, and the relevant factors are added to the relevant factor queue. Next, the correlation between factors is calculated using the Pearson coefficient, and factors that are highly correlated with others are removed. Finally, the association between combined factors and the result is calculated, combinations with a strong association are added to the relevant factor queue, and the complete set of highly relevant factors is obtained.
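The overall flow of FIG. 1 can be outlined as a single function that manages the relevant factor queue. This is only a sketch under stated assumptions: `select_factors` is a hypothetical name, the scoring functions `relevance` (taking a tuple of factor names) and `correlation` (taking two factor names) are supplied by the caller, combinations are limited to pairs, and the default thresholds are illustrative.

```python
from itertools import combinations


def select_factors(names, relevance, correlation,
                   rel_threshold=0.8, corr_threshold=0.85):
    """Return the final queue of single factors and factor pairs."""
    # Chi-square stage: factors related to the result enter the queue
    queue = [n for n in names if relevance((n,)) > rel_threshold]
    unrelated = [n for n in names if n not in queue]

    # Pearson stage: of each highly correlated pair, drop the weaker one
    for a, b in combinations(list(queue), 2):
        if a in queue and b in queue and correlation(a, b) > corr_threshold:
            queue.remove(min(a, b, key=lambda n: relevance((n,))))

    # Combination stage: test pairs built from the unrelated factors
    for pair in combinations(unrelated, 2):
        if relevance(pair) > rel_threshold:
            queue.append(pair)
    return queue
```

Passing the scoring functions in as arguments keeps the queue logic separate from how the chi-square probabilities and Pearson coefficients are actually computed.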
The beneficial effects of the invention are illustrated by the following experiment. The experimental sample consists of 2000 electricity theft users selected by a power company of the State Grid, together with their electricity consumption data during the theft period, and 2000 non-theft users, together with their electricity consumption data over a period of time. 55 data items were selected as initial candidate factors, and the chi-square test was used to calculate the degree of association between each suspected factor and the result (whether electricity was stolen), yielding 22 relevant factors. The correlation between factors was then calculated using the Pearson coefficient; where it exceeded the threshold, the factor with the smaller correlation probability of the two was removed. Finally, the correlation probability between combined factors and the result was calculated among the 33 removed data items, yielding 1 pair of combined relevant factors. This greatly reduces the difficulty of investigation and completes the construction of the electricity theft subject model.
Aspects of this embodiment that are not described here are the same as in the prior art.

Claims (1)

1. The suspected electricity larceny subject factor determination method based on chi-square test is characterized by comprising the following steps:
(1) A sample is formed from 2000 selected electricity theft users together with their electricity consumption data during the theft period, and 2000 non-theft users together with their electricity consumption data over a period of time; 55 data items are selected from the sample as initial relevant factors, the initial relevant factors being the suspected factors;
the chi-square test is used to calculate the degree of association between each suspected factor and the result of whether electricity is stolen, that is, the chi-square value between each initial relevant factor and the result is calculated, where the calculation formula is as follows:

$$\chi^2 = \sum \frac{(A - T)^2}{T}$$

where A denotes the four values in the four-fold table of actual values and T denotes the four values in the four-fold table of theoretical values; the correlation probability is then obtained by looking up the chi-square critical value table, and factors whose correlation probability is greater than a threshold are taken as relevant factors;
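(2) Removing relevant factors that are highly correlated with one another: the correlation between relevant factors is calculated using the Pearson coefficient, where the calculation formula is as follows:

$$r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2}\,\sqrt{\sum_{i}(y_i - \bar{y})^2}}$$

where $x_i$ and $y_i$ are the values of the two factors for the i-th sample and $\bar{x}$, $\bar{y}$ are their means; if the correlation between two factors is greater than a set threshold, the one of the two with the smaller correlation probability is removed;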
(3) Determining combined relevant factors: the factors judged irrelevant are combined, up to a maximum combination threshold, the chi-square test is used to verify the degree of association between each combination of factors and the result, and a combination is added to the relevant factor queue if its correlation probability is greater than a given threshold;
(4) The relevant factors obtained in steps (2) and (3) are the suspected electricity theft subject factors.
CN201711200339.6A 2017-11-20 2017-11-20 Suspected electricity larceny subject factor determination method based on chi-square test Active CN108022179B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711200339.6A CN108022179B (en) 2017-11-20 2017-11-20 Suspected electricity larceny subject factor determination method based on chi-square test

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711200339.6A CN108022179B (en) 2017-11-20 2017-11-20 Suspected electricity larceny subject factor determination method based on chi-square test

Publications (2)

Publication Number Publication Date
CN108022179A CN108022179A (en) 2018-05-11
CN108022179B CN108022179B (en) 2024-03-26

Family

ID=62077125

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711200339.6A Active CN108022179B (en) 2017-11-20 2017-11-20 Suspected electricity larceny subject factor determination method based on chi-square test

Country Status (1)

Country Link
CN (1) CN108022179B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112214527A (en) * 2020-09-25 2021-01-12 桦蓥(上海)信息科技有限责任公司 Financial object abnormal factor screening and analyzing method based on failure response code

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103198208A (en) * 2013-03-04 2013-07-10 北京空间飞行器总体设计部 Weight determining method applicable to small subsample condition
CN103473438A (en) * 2013-08-15 2013-12-25 国家电网公司 Method for optimizing and correcting wind power prediction models
CN103559630A (en) * 2013-10-31 2014-02-05 华南师范大学 Customer segmentation method based on customer attribute and behavior characteristic analysis
CN106251049A (en) * 2016-07-25 2016-12-21 国网浙江省电力公司宁波供电公司 A kind of electricity charge risk model construction method of big data
CN107145966A (en) * 2017-04-12 2017-09-08 山大地纬软件股份有限公司 Logic-based returns the analysis and early warning method of opposing electricity-stealing of probability analysis Optimized model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7634360B2 (en) * 2003-09-23 2009-12-15 Prediction Sciences, LL Cellular fibronectin as a diagnostic marker in stroke and methods of use thereof

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
Minimum redundancy maximum relevance feature selection approach for temporal gene expression data;Radovic, M.等;BMC bioinformatics;20170203;第18卷(第1期);第1-14页 *
基于云平台的文本特征选择算法研究;王军锋;CNKI优秀硕士学位论文全文库;20170415;第2017卷(第4期);正文全文 *
基于互信息的组合特征选择算法;李叶紫等;计算机系统应用;20170831;第26卷(第8期);第173-179页 *
基于混合特征选择的轻度认知功能障碍的诊断分类;郭宏伟等;信息技术与信息化;20151031(第10期);第165-168页 *
基于采集系统的反窃电技术分析及防范措施;王全兴,李思韬;电测与仪表;20161231;第53卷(第07期);第78-83页 *
基于阿尔茨海默病早期预测脑皮层厚度特征选择方法的研究;曹元磊;CNKI优秀硕士学位论文全文库;20170315;第2017卷(第3期);正文全文 *
应用大数据技术的反窃电分析;陈文瑛等;电子测量与仪器学报;20161231;第30卷(第10期);第1558-1567页 *
王晶舒主编.社会调查研究方法.吉林大学出版社,2014,第144-155页. *

Also Published As

Publication number Publication date
CN108022179A (en) 2018-05-11

Similar Documents

Publication Publication Date Title
CN105512799B (en) Power system transient stability evaluation method based on mass online historical data
CN112201260B (en) Transformer running state online detection method based on voiceprint recognition
CN109507554B (en) Electrical equipment insulation state evaluation method
CN103218534B (en) Right tail-truncated type lifetime data distribution selection method
CN111695620A (en) Method and system for detecting and correcting abnormal data of time sequence of power system
CN107808100B (en) Steganalysis method for specific test sample
CN111797887A (en) Anti-electricity-stealing early warning method and system based on density screening and K-means clustering
CN111191671A (en) Electrical appliance waveform detection method and system, electronic equipment and storage medium
CN115166563A (en) Power battery aging state evaluation and decommissioning screening method and system
CN112381351A (en) Power utilization behavior change detection method and system based on singular spectrum analysis
CN108022179B (en) Suspected electricity larceny subject factor determination method based on chi-square test
CN112688324B (en) Power system low-frequency oscillation mode identification method based on FastICA and TLS-ESPRIT
CN110020637A (en) A kind of analog circuit intermittent fault diagnostic method based on more granularities cascade forest
CN108766465B (en) Digital audio tampering blind detection method based on ENF general background model
CN109598245B (en) Edible oil transverse relaxation attenuation curve signal feature extraction method based on 1D-CNN
CN107817784B (en) A kind of procedure failure testing method based on concurrent offset minimum binary
CN106501643B (en) A kind of detection method of the resistance to pulling force of cable connector
CN113014361B (en) BPSK signal confidence test method based on graph
CN110658463B (en) Method for predicting cycle life of lithium ion battery
CN105989095B (en) Take the correlation rule significance test method and device of data uncertainty into account
CN111505445B (en) Credibility detection method and device for mutual-user relationship of transformer area and computer equipment
CN115308644A (en) Transformer winding fault detection method and system based on current offset ratio difference analysis
CN104794290B (en) A kind of modal parameter automatic identifying method for mechanized equipment structure
CN113011261B (en) Sinusoidal signal detection method and device based on graph
CN115578359B (en) Method, system, device and storage medium for detecting defect of few samples based on generation of countermeasure network and defect-free image metric

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant