CN108022179B - Suspected electricity larceny subject factor determination method based on chi-square test - Google Patents
- Publication number
- CN108022179B (application CN201711200339.6A)
- Authority
- CN
- China
- Prior art keywords
- factors
- correlation
- chi
- factor
- electricity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/06—Electricity, gas or water supply
Abstract
The invention discloses a method for determining suspected electricity-theft subject factors based on the chi-square test. The candidate association factors and the outcome to be determined are both treated as categorical variables, and the chi-square test is used to verify how strongly each factor is associated with the outcome. Factors whose association with the outcome is too low are eliminated, and among several highly correlated factors only the one most strongly associated with the outcome is retained. The association of combined factors with the outcome is also considered. A suspected electricity-theft analysis model is then built on the highly associated factors, which reduces the workload of inspectors who check electricity-theft factors.
Description
Technical Field
The invention relates to a method for constructing a suspected electricity-theft subject model, and in particular to a method for determining suspected electricity-theft subject factors based on the chi-square test.
Background
In recent years, electricity theft has become technically sophisticated, concealed, and frequent, which makes it difficult for power grid enterprises to identify users who steal electricity. Effective means and methods are therefore needed to help power grid enterprises identify such users accurately. Traditional methods for determining suspected electricity-theft users are slow and inaccurate, and they do not keep pace with the rapid development of today's power grid.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art by providing a method that identifies the characteristic factors of electricity-theft users through analysis of user behavior data. This improves the accuracy of electricity inspection and reduces the pressure on inspectors who would otherwise have to check a large number of factors related to suspected electricity theft.
A method for determining suspected electricity-theft subject factors based on the chi-square test comprises the following steps:
(1) Using the chi-square test, calculate the chi-square value between each initial factor and the outcome. The calculation formula is:
$$\chi^2 = \sum \frac{(A - T)^2}{T}$$
where A denotes the four values in the fourfold (2×2) table of actual values and T denotes the four values in the fourfold table of theoretical values. The association probability is then obtained by looking up the chi-square critical value table, and every factor whose association probability exceeds the threshold is taken as a relevant factor;
(2) Remove relevant factors that are strongly correlated with one another. The correlation between relevant factors is calculated with the Pearson coefficient:
$$r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2}\,\sqrt{\sum_{i}(y_i - \bar{y})^2}}$$
where $x_i$ and $y_i$ are the values of the two factors for the i-th sample and $\bar{x}$, $\bar{y}$ are their means. If the correlation between two factors exceeds a set threshold, the one with the smaller association probability is removed. The correlation threshold can generally be set between 0.8 and 0.9, depending on the actual number of factors;
(3) Determine the combined association factors. Starting from the irrelevant factors, form combinations up to a maximum combination threshold, use the chi-square test to check the degree of association between each combined factor and the outcome, and add the combined factor to the relevant-factor queue if its association probability exceeds the given threshold;
(4) The relevant factors obtained in steps (2) and (3) are the suspected electricity-theft subject factors.
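Step (1) can be sketched in Python as follows. This is a minimal illustration, not the patented implementation. The function names `chi_square_2x2` and `association_probability` are hypothetical. The theoretical values T are assumed to be the usual expected counts (row total × column total / grand total), and the association probability is assumed to be the cumulative probability of the chi-square distribution with one degree of freedom, computed in closed form instead of being looked up in a table.

```python
import math


def chi_square_2x2(table):
    """Chi-square value of a fourfold table [[a, b], [c, d]] of actual counts A.

    Assumes every row total and column total is non-zero.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, actual in enumerate(row):
            # Theoretical value T for this cell (assumed: expected count).
            theoretical = row_totals[i] * col_totals[j] / n
            chi2 += (actual - theoretical) ** 2 / theoretical
    return chi2


def association_probability(chi2):
    """Chi-square CDF with 1 degree of freedom (assumed meaning of the
    'association probability' read from the critical value table)."""
    return math.erf(math.sqrt(chi2 / 2.0))
```

With made-up counts, a factor that splits 40 theft and 40 non-theft users as `[[30, 10], [10, 30]]` gives a chi-square value of 20 and an association probability above 0.99, so it would pass a 0.8 threshold.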
The method applies the chi-square test to the selection of factors that influence users' electricity-theft behavior. The chi-square test is well suited to this task, because it performs well when comparing the association between two or more samples and categorical variables. To keep the factors independent of one another, the correlation between factors is calculated with the Pearson coefficient, and factors that are strongly correlated with others are removed. Because a combination of factors can be associated with the outcome, the association between combined factors and the outcome is also calculated with the chi-square test, and combined factors with a high association probability are likewise treated as relevant factors.
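The redundancy-removal idea can be sketched as below. This is a hedged illustration only: `pearson` and `drop_redundant` are hypothetical names, the default threshold of 0.85 is simply a value inside the 0.8 to 0.9 range mentioned in the text, and comparing the absolute value of the coefficient against the threshold is an assumption. The greedy order (visit factors from highest to lowest association probability and keep a factor only if it is not strongly correlated with one already kept) is one simple way to ensure that, of any strongly correlated pair, the factor with the smaller association probability is the one removed.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        return 0.0  # a constant column is treated as uncorrelated
    return cov / (dev_x * dev_y)


def drop_redundant(columns, assoc_prob, corr_threshold=0.85):
    """Return the names of the factors kept after redundancy removal.

    columns:    dict mapping factor name -> list of values
    assoc_prob: dict mapping factor name -> association probability
    """
    kept = []
    ranked = sorted(columns, key=lambda name: assoc_prob[name], reverse=True)
    for name in ranked:
        # Keep the factor only if no already-kept factor is strongly correlated.
        if all(abs(pearson(columns[name], columns[other])) <= corr_threshold
               for other in kept):
            kept.append(name)
    return kept
```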
The association probability threshold is preferably set to 0.8. If the number of factors whose association probability exceeds the threshold is more than 90% or less than 30% of the total number of factors, the threshold should be raised (in the first case) or lowered (in the second case) in steps of 0.01 until the number of factors exceeding it lies between 30% and 90% of the total.
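The threshold adjustment can be sketched as a small loop. This is an illustration under stated assumptions, not the patent's own procedure: `adjust_threshold` is a hypothetical name, the rule is read as "raise the threshold when more than 90% of factors pass, lower it when fewer than 30% pass", and the guard that stops when the direction would reverse is an addition to avoid endless oscillation.

```python
def adjust_threshold(assoc_probs, start=0.80, step=0.01, low=0.30, high=0.90):
    """Move the association probability threshold in fixed steps until the
    share of factors above it lies between `low` and `high`."""
    total = len(assoc_probs)
    if total == 0:
        return start
    threshold = start
    direction = 0  # +1 while raising, -1 while lowering, 0 before the first move
    for _ in range(int(round(1.0 / step))):
        share = sum(p > threshold for p in assoc_probs) / total
        if share > high:
            move = 1
        elif share < low:
            move = -1
        else:
            break  # share is within the accepted band
        if direction and move != direction:
            break  # assumed guard: stop instead of oscillating
        direction = move
        candidate = round(threshold + move * step, 10)
        if not 0.0 <= candidate <= 1.0:
            break
        threshold = candidate
    return threshold
```

For instance, if five factors have an association probability of 0.815 and five have 0.95 (made-up values), every factor passes at 0.80 and 0.81, and the loop stops at 0.82, where half of them pass.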
In summary, compared with the prior art, the invention has the following advantages:
The association factors and the outcome to be determined are both treated as categorical variables, and the chi-square test is used to verify how strongly each factor is associated with the outcome. Factors whose association with the outcome is too low are removed, and among several highly correlated factors only the one most strongly associated with the outcome is retained. The association of combined factors is also considered. A suspected electricity-theft analysis model based on the highly associated factors is established, and the workload of inspectors who check electricity-theft factors is reduced.
Drawings
FIG. 1 is a flow chart of a suspected electricity theft subject factor determination method based on chi-square test of the present invention.
Detailed Description
The present invention will be described in more detail with reference to examples.
Example 1
A method for determining suspected electricity-theft subject factors based on the chi-square test comprises the following steps:
(1) Using the chi-square test, calculate the chi-square value between each initial factor and the outcome. The calculation formula is:
$$\chi^2 = \sum \frac{(A - T)^2}{T}$$
where A denotes the four values in the fourfold (2×2) table of actual values and T denotes the four values in the fourfold table of theoretical values. The association probability is then obtained by looking up the chi-square critical value table, and every factor whose association probability exceeds the threshold is taken as a relevant factor;
(2) Remove relevant factors that are strongly correlated with one another. The correlation between relevant factors is calculated with the Pearson coefficient:
$$r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2}\,\sqrt{\sum_{i}(y_i - \bar{y})^2}}$$
where $x_i$ and $y_i$ are the values of the two factors for the i-th sample and $\bar{x}$, $\bar{y}$ are their means. If the correlation between two factors exceeds a set threshold, the one with the smaller association probability is removed;
(3) Determine the combined association factors. Starting from the irrelevant factors, form combinations up to a maximum combination threshold, use the chi-square test to check the degree of association between each combined factor and the outcome, and add the combined factor to the relevant-factor queue if its association probability exceeds the given threshold;
(4) The relevant factors obtained in steps (2) and (3) are the suspected electricity-theft subject factors.
The association probability threshold is preferably set to 0.8. If the number of factors whose association probability exceeds the threshold is more than 90% or less than 30% of the total number of factors, the threshold should be raised (in the first case) or lowered (in the second case) in steps of 0.01 until the number of factors exceeding it lies between 30% and 90% of the total.
The correlation threshold can generally be set between 0.8 and 0.9, depending on the actual number of factors.
As shown in FIG. 1, the initial candidate factors are selected first. The chi-square test is then used to calculate the association between each factor and the outcome, and the associated factors are added to the relevant-factor queue. Next, the Pearson coefficient is used to calculate the correlation between factors, and factors that are highly correlated with others are removed. Finally, the association between combined factors and the outcome is calculated, combined factors with a strong association are added to the relevant-factor queue, and all highly associated factors are thereby obtained.
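The combined-factor step of this flow can be sketched as follows. The text does not say how factors are combined, so this sketch makes several labeled assumptions: each factor is a 0/1 sequence, a combination is the logical AND of its members, the maximum combination size defaults to 2, and the association probability is the chi-square cumulative probability with one degree of freedom. The names `chi_square_binary` and `combined_factors` are hypothetical.

```python
import math
from itertools import combinations


def chi_square_binary(x, y):
    """Chi-square value of the fourfold table built from two 0/1 sequences."""
    table = [[0, 0], [0, 0]]
    for xi, yi in zip(x, y):
        table[xi][yi] += 1
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        return 0.0  # a constant variable shows no association
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            theoretical = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - theoretical) ** 2 / theoretical
    return chi2


def combined_factors(irrelevant, outcome, threshold=0.8, max_size=2):
    """Return the tuples of factor names whose AND-combination is associated
    with the outcome more strongly than `threshold`.

    irrelevant: dict mapping factor name -> list of 0/1 values
    outcome:    list of 0/1 values (1 = electricity theft)
    """
    selected = []
    for size in range(2, max_size + 1):
        for names in combinations(irrelevant, size):
            # Assumed combination rule: all member factors must be 1.
            combo = [int(all(values))
                     for values in zip(*(irrelevant[name] for name in names))]
            chi2 = chi_square_binary(combo, outcome)
            probability = math.erf(math.sqrt(chi2 / 2.0))
            if probability > threshold:
                selected.append(names)
    return selected
```

Other combination rules (for example a joint category per value pair) would fit the same loop by changing only the line that builds `combo`.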
The beneficial effects of the invention are illustrated by the following experiment. The sample consists of 2000 electricity-theft users selected by a power company of the State Grid, together with their electricity consumption data during the theft period, and 2000 non-theft users together with their electricity consumption data over a period of time. 55 data items are selected as initial candidate factors. The chi-square test is used to calculate the degree of association between each suspected factor and the outcome (whether electricity was stolen), which yields 22 relevant factors. The Pearson coefficient is then used to calculate the correlation between factors; if the correlation of two factors exceeds the threshold, the one with the smaller association probability is removed. Finally, the association probability between combined factors and the outcome is calculated among the 33 eliminated data items, which yields 1 pair of combined relevant factors. This greatly reduces the difficulty of investigation and completes the construction of the electricity-theft subject model.
Parts of this embodiment that are not described are the same as in the prior art.
Claims (1)
1. A method for determining suspected electricity-theft subject factors based on the chi-square test, characterized by comprising the following steps:
(1) 2000 selected electricity-theft users and their electricity consumption data during the theft period, together with 2000 non-theft users and their electricity consumption data over a period of time, form a sample; 55 data items are selected from the sample as initial candidate factors, the initial candidate factors being the suspected factors;
the chi-square test is used to calculate the degree of association between each suspected factor and the outcome of whether electricity was stolen, that is, the chi-square value between each initial candidate factor and the outcome, the calculation formula being:
$$\chi^2 = \sum \frac{(A - T)^2}{T}$$
where A denotes the four values in the fourfold (2×2) table of actual values and T denotes the four values in the fourfold table of theoretical values; the association probability is then obtained by looking up the chi-square critical value table, and every factor whose association probability exceeds the threshold is taken as a relevant factor;
(2) relevant factors that are strongly correlated with one another are removed, the correlation between relevant factors being calculated with the Pearson coefficient:
$$r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2}\,\sqrt{\sum_{i}(y_i - \bar{y})^2}}$$
where $x_i$ and $y_i$ are the values of the two factors for the i-th sample and $\bar{x}$, $\bar{y}$ are their means; if the correlation between two factors exceeds a set threshold, the one with the smaller association probability is removed;
(3) the combined association factors are determined: starting from the irrelevant factors, combinations are formed up to a maximum combination threshold, the chi-square test is used to check the degree of association between each combined factor and the outcome, and the combined factor is added to the relevant-factor queue if its association probability exceeds the given threshold;
(4) the relevant factors obtained in steps (2) and (3) are the suspected electricity-theft subject factors.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711200339.6A CN108022179B (en) | 2017-11-20 | 2017-11-20 | Suspected electricity larceny subject factor determination method based on chi-square test |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711200339.6A CN108022179B (en) | 2017-11-20 | 2017-11-20 | Suspected electricity larceny subject factor determination method based on chi-square test |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108022179A CN108022179A (en) | 2018-05-11 |
CN108022179B true CN108022179B (en) | 2024-03-26 |
Family
ID=62077125
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711200339.6A Active CN108022179B (en) | 2017-11-20 | 2017-11-20 | Suspected electricity larceny subject factor determination method based on chi-square test |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108022179B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112214527A (en) * | 2020-09-25 | 2021-01-12 | 桦蓥(上海)信息科技有限责任公司 | Financial object abnormal factor screening and analyzing method based on failure response code |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103198208A (en) * | 2013-03-04 | 2013-07-10 | 北京空间飞行器总体设计部 | Weight determining method applicable to small subsample condition |
CN103473438A (en) * | 2013-08-15 | 2013-12-25 | 国家电网公司 | Method for optimizing and correcting wind power prediction models |
CN103559630A (en) * | 2013-10-31 | 2014-02-05 | 华南师范大学 | Customer segmentation method based on customer attribute and behavior characteristic analysis |
CN106251049A (en) * | 2016-07-25 | 2016-12-21 | 国网浙江省电力公司宁波供电公司 | A kind of electricity charge risk model construction method of big data |
CN107145966A (en) * | 2017-04-12 | 2017-09-08 | 山大地纬软件股份有限公司 | Logic-based returns the analysis and early warning method of opposing electricity-stealing of probability analysis Optimized model |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7634360B2 (en) * | 2003-09-23 | 2009-12-15 | Prediction Sciences, LL | Cellular fibronectin as a diagnostic marker in stroke and methods of use thereof |
Non-Patent Citations (8)
Title |
---|
Minimum redundancy maximum relevance feature selection approach for temporal gene expression data;Radovic, M.等;BMC bioinformatics;20170203;第18卷(第1期);第1-14页 * |
基于云平台的文本特征选择算法研究;王军锋;CNKI优秀硕士学位论文全文库;20170415;第2017卷(第4期);正文全文 * |
基于互信息的组合特征选择算法;李叶紫等;计算机系统应用;20170831;第26卷(第8期);第173-179页 * |
基于混合特征选择的轻度认知功能障碍的诊断分类;郭宏伟等;信息技术与信息化;20151031(第10期);第165-168页 * |
基于采集系统的反窃电技术分析及防范措施;王全兴,李思韬;电测与仪表;20161231;第53卷(第07期);第78-83页 * |
基于阿尔茨海默病早期预测脑皮层厚度特征选择方法的研究;曹元磊;CNKI优秀硕士学位论文全文库;20170315;第2017卷(第3期);正文全文 * |
应用大数据技术的反窃电分析;陈文瑛等;电子测量与仪器学报;20161231;第30卷(第10期);第1558-1567页 * |
王晶舒主编.社会调查研究方法.吉林大学出版社,2014,第144-155页. * |
Also Published As
Publication number | Publication date |
---|---|
CN108022179A (en) | 2018-05-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105512799B (en) | Power system transient stability evaluation method based on mass online historical data | |
CN112201260B (en) | Transformer running state online detection method based on voiceprint recognition | |
CN109507554B (en) | Electrical equipment insulation state evaluation method | |
CN103218534B (en) | Right tail-truncated type lifetime data distribution selection method | |
CN111695620A (en) | Method and system for detecting and correcting abnormal data of time sequence of power system | |
CN107808100B (en) | Steganalysis method for specific test sample | |
CN111797887A (en) | Anti-electricity-stealing early warning method and system based on density screening and K-means clustering | |
CN111191671A (en) | Electrical appliance waveform detection method and system, electronic equipment and storage medium | |
CN115166563A (en) | Power battery aging state evaluation and decommissioning screening method and system | |
CN112381351A (en) | Power utilization behavior change detection method and system based on singular spectrum analysis | |
CN108022179B (en) | Suspected electricity larceny subject factor determination method based on chi-square test | |
CN112688324B (en) | Power system low-frequency oscillation mode identification method based on FastICA and TLS-ESPRIT | |
CN110020637A (en) | A kind of analog circuit intermittent fault diagnostic method based on more granularities cascade forest | |
CN108766465B (en) | Digital audio tampering blind detection method based on ENF general background model | |
CN109598245B (en) | Edible oil transverse relaxation attenuation curve signal feature extraction method based on 1D-CNN | |
CN107817784B (en) | A kind of procedure failure testing method based on concurrent offset minimum binary | |
CN106501643B (en) | A kind of detection method of the resistance to pulling force of cable connector | |
CN113014361B (en) | BPSK signal confidence test method based on graph | |
CN110658463B (en) | Method for predicting cycle life of lithium ion battery | |
CN105989095B (en) | Take the correlation rule significance test method and device of data uncertainty into account | |
CN111505445B (en) | Credibility detection method and device for mutual-user relationship of transformer area and computer equipment | |
CN115308644A (en) | Transformer winding fault detection method and system based on current offset ratio difference analysis | |
CN104794290B (en) | A kind of modal parameter automatic identifying method for mechanized equipment structure | |
CN113011261B (en) | Sinusoidal signal detection method and device based on graph | |
CN115578359B (en) | Method, system, device and storage medium for detecting defect of few samples based on generation of countermeasure network and defect-free image metric |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||