CN115033591A - Intelligent detection method and system for electricity charge data abnormity, storage medium and computer equipment - Google Patents

Publication number
CN115033591A
Authority
CN
China
Prior art keywords
data
electricity charge
abnormal
charge data
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210617862.3A
Other languages
Chinese (zh)
Inventor
卢旭
张炯华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Polytechnic Normal University
Original Assignee
Guangdong Polytechnic Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Guangdong Polytechnic Normal University
Priority to CN202210617862.3A
Publication of CN115033591A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/126 Character encoding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Quality & Reliability (AREA)
  • Medical Informatics (AREA)
  • Tourism & Hospitality (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Water Supply & Treatment (AREA)
  • Public Health (AREA)
  • Biomedical Technology (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Primary Health Care (AREA)
  • Operations Research (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Fuzzy Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to an intelligent detection method, system, storage medium and computer equipment for abnormal electricity charge data. The method comprises the following steps: S1, rule setting: dynamically setting the rules for detecting abnormal electricity charge data; S2, data acquisition: exporting the original electricity charge data and the abnormal electricity charge data from the database; S3, data processing: performing missing value processing, feature encoding and feature selection on the original electricity charge data; S4, data mining: mining hidden abnormal data and detecting the anomaly type; S5, model construction: training an algorithm model and dynamically adjusting the model parameters; and S6, model prediction: inputting data into the model for prediction to obtain the final detection result for abnormal electricity charge data. The method and the device can efficiently detect and identify abnormal electricity charge data, and effectively improve both the hit rate and the level of intelligent detection of abnormal electricity charge data.

Description

Intelligent detection method and system for electricity charge data abnormity, storage medium and computer equipment
Technical Field
The invention relates to the technical field of machine learning, in particular to an intelligent detection method, system, storage medium and computer equipment for electricity charge data abnormity.
Background
Traditional electricity charge data anomaly detection summarizes manual experience into screening rules. According to investigation, a power grid company's existing screening rules for abnormal electricity charge data number from dozens to hundreds. If every rule has to be traversed each time electricity charge errors are checked, the power marketing department carries a heavy workload and its operating efficiency falls. Among the existing detection and accounting rules, some have adjustable variables; classified by those variables, they fall into two types, 'reference electric quantity' and 'fluctuation rate'. Both can be adjusted manually during power marketing, and the applicability, effectiveness and reasonableness of these parameter settings directly affect the amount of erroneous or abnormal electricity charge data that the power marketing system generates, which in turn affects the efficiency of electricity charge accounting. In addition, the detection effect of some existing rules varies from month to month. As a result, existing electricity charge anomaly detection is time-consuming, has a low hit rate, and cannot meet the requirements of smart grid construction.
Disclosure of Invention
In order to solve the technical problems in the prior art, the invention provides an intelligent detection method, system, storage medium and computer device for abnormal electricity charge data, which can efficiently detect and identify abnormal electricity charge data, effectively improve the hit rate for abnormal electricity charge data, and thereby raise a power company's level of intelligent detection of abnormal electricity charge data.
The method is realized by adopting the following technical scheme: an intelligent detection method for abnormal electricity charge data comprises the following steps:
S1, rule setting: dynamically setting the rules for detecting abnormal electricity charge data;
S2, data acquisition: exporting the original electricity charge data and the abnormal electricity charge data of all electricity users from the database;
S3, data processing: performing missing value processing, feature encoding and feature selection on the original electricity charge data;
S4, data mining: mining hidden abnormal data from the processed original electricity charge data and detecting the anomaly type of the electricity charge data;
S5, model construction: building a machine learning algorithm model according to the data analysis results of data mining, training the algorithm model, and dynamically adjusting the model parameters during training;
and S6, model prediction: after original electricity charge data are acquired, inputting the data into the intelligent detection model obtained by model construction for prediction, to obtain the final detection result for abnormal electricity charge data.
The system of the invention is realized by adopting the following technical scheme: an intelligent detection system for abnormal electricity charge data comprises:
the rule setting module is used for dynamically setting the rules for detecting abnormal electricity charge data;
the data acquisition module is used for exporting the original electricity charge data and the abnormal electricity charge data of all electricity users from the database;
the data processing module is used for performing missing value processing, feature encoding and feature selection on the original electricity charge data;
the data mining module is used for mining hidden abnormal data from the processed original electricity charge data and detecting the anomaly type of the electricity charge data;
the model building module is used for building a machine learning algorithm model according to the data analysis results of data mining, training the algorithm model, and dynamically adjusting the model parameters during training;
and the model prediction module is used for inputting original electricity charge data, once acquired, into the intelligent detection model obtained by model construction for prediction, and obtaining the final detection result for abnormal electricity charge data.
The present invention also proposes a storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the intelligent detection method of electricity charge data abnormality of the present invention.
The invention also provides computer equipment which comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein when the processor executes the computer program, the intelligent detection method for the abnormal electricity charge data is realized.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. By applying a series of data processing steps to the electricity charge data, such as missing value processing, feature encoding and feature selection, and combining them with related data mining methods, the invention uncovers much of the hidden information in abnormal electricity charge data and thereby improves the ability to detect such data.
2. The method addresses the low hit rate of existing methods for detecting abnormal electricity charge data. The amount of suspected abnormal data finally reported is greatly reduced, which greatly reduces the workload of front-line staff who recheck abnormal electricity charge data, lowers the operating cost of the electricity charge checking department of a power grid company, and saves substantial manpower and material resources.
3. The method constructs a weighted residual deep forest model. The weighting reduces the differences among the deep forest subtrees trained on the electricity charge data set: subtrees that predict abnormal electricity charge data with high accuracy receive larger weights, which increases their role in the decision. This effectively improves the hit accuracy for abnormal electricity charge data, reduces the number of cascade forest layers and shortens training time. Meanwhile, the model compensates for the gradient vanishing or gradient explosion that may occur in the deep forest algorithm, so that as the number of cascade layers increases, the ability to learn abnormal features of electricity charge data keeps growing while the effect of the previous layers is retained.
4. The invention establishes a machine-learning-based intelligent detection model for abnormal electricity charge data. The model builds on machine learning algorithms such as the weighted residual deep forest and XGBoost; the algorithms are optimized and fused so as to detect as much abnormal electricity charge data as possible. Compared with existing rule-based detection methods, the model offers fast response, short detection time and a low miss rate; the environment required for on-site configuration and operation is simple, security is high, and the model has high practical value.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a data processing flow diagram of the present invention;
FIG. 3 is a flow chart of the present invention for electric utility data mining;
FIG. 4 is a flow chart of the model construction of the present invention;
FIG. 5 is a residual forest flow diagram of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
Examples
As shown in fig. 1, the intelligent detection method for electricity charge data abnormality in the embodiment includes the following steps:
S1, rule setting: dynamically setting the rules for detecting abnormal electricity charge data;
S2, data acquisition: exporting the original electricity charge data and the abnormal electricity charge data of all electricity users for the previous month from the database;
S3, data processing: performing missing value processing, feature encoding, feature selection and other processing on the original electricity charge data;
S4, data mining: mining hidden abnormal data from the processed original electricity charge data and detecting the approximate anomaly type of the electricity charge data;
S5, model construction: building a machine learning algorithm model according to the data analysis results of data mining, training the algorithm model, and dynamically adjusting the model parameters during training;
and S6, model prediction: after original electricity charge data of the same version for the current month are acquired, inputting the data into the intelligent detection model obtained by model construction for prediction, to obtain the final detection result for abnormal electricity charge data.
Specifically, as shown in fig. 2, the specific procedure of the data processing in step S3 is as follows:
S301, labeling the collected original electricity charge data and abnormal electricity charge data: a feature is added to mark whether each electricity charge record is abnormal, and all original electricity charge data are labeled, with abnormal electricity charge data marked as 1 and non-abnormal electricity charge data marked as 0;
S302, performing missing value processing on all labeled original electricity charge data: if more than 10 features of a row of electricity charge data have missing values, the row is deleted directly; all other missing values are filled with the value -1;
S303, performing feature encoding on the text characters in the original electricity charge data, where text encoding and one-hot encoding are the available encoding methods;
S304, ranking the features of the original electricity charge data by importance; the machine learning algorithm used for the ranking may be random forest, XGBoost, SVM or the like, and the result is an importance ranking of all electricity charge data features;
S305, selecting among all features of the electricity charge data: with reference to the feature importance ranking, the several lowest-ranked features are deleted, which improves the hit rate and efficiency of intelligent detection of abnormal electricity charge data.
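Steps S301 to S303 and S305 can be illustrated with a minimal Python sketch. It is not the patented implementation: the record layout (a list of dictionaries), the field names `record_id`, `tariff` and `kwh`, and all function names are hypothetical, and missing values are assumed to be represented as `None`. The importance scores passed to `drop_least_important` are assumed to come from a separately trained model as in S304.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[object]]

MAX_MISSING = 10  # S302: rows with more than this many missing features are dropped
FILL_VALUE = -1   # S302: every remaining missing value is filled with -1


def label_rows(rows: List[Row], abnormal_ids: set, id_key: str = "record_id") -> List[Row]:
    """S301: add a 0/1 feature marking whether each record is abnormal."""
    return [{**row, "is_abnormal": 1 if row[id_key] in abnormal_ids else 0} for row in rows]


def handle_missing(rows: List[Row]) -> List[Row]:
    """S302: drop rows with too many missing features, fill the rest with -1."""
    cleaned = []
    for row in rows:
        missing = sum(1 for value in row.values() if value is None)
        if missing > MAX_MISSING:
            continue
        cleaned.append({k: FILL_VALUE if v is None else v for k, v in row.items()})
    return cleaned


def one_hot(rows: List[Row], column: str) -> List[Row]:
    """S303: replace one text column with one 0/1 column per distinct value."""
    categories = sorted({str(row[column]) for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for category in categories:
            new_row[f"{column}={category}"] = 1 if str(row[column]) == category else 0
        encoded.append(new_row)
    return encoded


def drop_least_important(importances: Dict[str, float], n_drop: int) -> List[str]:
    """S305: keep every feature except the n_drop lowest-ranked ones."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return ranked[: max(len(ranked) - n_drop, 0)]
```

Filling with -1 keeps missing entries distinguishable from real readings as long as legitimate feature values are non-negative, which is an assumption about the data rather than something the text states.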
Specifically, as shown in fig. 3, the specific process of data mining in step S4 is as follows:
S401, scoring the anomaly degree of the electricity charge data samples with the isolation forest algorithm and ranking the samples by score; the samples ranked in the top 70% are taken as high-confidence normal electricity charge data samples and retained, which removes the influence of unreliable electricity charge data on model construction;
S402, determining the number of clusters for the majority-class samples of the original electricity charge data and clustering them: the original electricity charge data are clustered with the K-means algorithm, and the optimal number of clusters k for the majority-class samples is computed, so that the majority-class samples are grouped into k clusters;
S403, after the majority-class samples are grouped into k clusters, sampling each cluster; the majority-class samples may be undersampled, for example with a random undersampling algorithm;
S404, mining the finally sampled electricity charge data by combining the existing rules for detecting abnormal electricity charge data with machine learning algorithms, thereby uncovering hidden abnormal data and detecting the approximate anomaly type of the electricity charge data.
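The filtering in S401 and the per-cluster undersampling in S403 can be sketched as follows, assuming the anomaly scores and the cluster assignments have already been computed elsewhere (for example by an isolation forest and K-means). The function names are hypothetical, and the score convention (higher means more normal) is an assumption, since the text does not fix the direction of the ranking.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def keep_reliable_normals(scores: Sequence[float], keep_fraction: float = 0.7) -> List[int]:
    """S401: return indices of the keep_fraction of samples that look most normal.

    Assumption: a higher score means "more normal"; negate the scores first
    if the scorer uses the opposite convention.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_keep = round(len(scores) * keep_fraction)
    return sorted(order[:n_keep])


def undersample_by_cluster(cluster_of: Sequence[int], per_cluster: int, seed: int = 0) -> List[int]:
    """S403: randomly keep at most per_cluster majority-class samples per cluster."""
    rng = random.Random(seed)
    members: Dict[int, List[int]] = defaultdict(list)
    for index, cluster in enumerate(cluster_of):
        members[cluster].append(index)
    chosen: List[int] = []
    for cluster in sorted(members):
        pool = members[cluster]
        chosen.extend(rng.sample(pool, min(per_cluster, len(pool))))
    return sorted(chosen)
```

Sampling within each cluster rather than over the whole majority class keeps every region of the normal data represented after undersampling, which appears to be the purpose of clustering first.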
Specifically, as shown in fig. 4, the specific process of model building in step S5 is as follows:
S501, dividing the electricity charge data into a training set and a test set by random sampling, in a ratio of 7:3 or 8:2;
S502, performing feature interaction on the electricity charge data: algorithms such as random forest and XGBoost are used to obtain training and prediction results for combined and derived features, and these are compared with the results obtained on the original data without feature interaction. The comparison metrics are the recall and precision for abnormal electricity charge data, where recall reflects missed detections and precision reflects successful hits. The feature interaction results are then combined to construct new features derived from multiple features;
S503, constructing the intelligent detection model for abnormal electricity charge data: a weighted residual deep forest model is built and, together with machine learning algorithms such as decision tree, random forest, XGBoost and CatBoost, is trained and used for prediction on the electricity charge data, yielding the trained intelligent detection model;
S504, tuning the parameters of the intelligent detection model for abnormal electricity charge data; the tuning method may be greedy tuning, grid search or Bayesian tuning, and the result is the optimal parameters of the intelligent detection model;
and S505, fusing multiple tuned reference algorithms, which may include weighted residual deep forest, decision tree, random forest, XGBoost, CatBoost and the like; the fused machine learning model is the final intelligent detection model for abnormal electricity charge data.
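The split in S501, the precision and recall metrics in S502 and a simple form of the fusion in S505 can be sketched in Python as below. The function names are hypothetical, and the fusion rule (averaging each model's predicted abnormal probability and thresholding at 0.5) is an assumption, since the text does not specify how the tuned algorithms are fused.

```python
import random
from typing import List, Sequence, Tuple


def train_test_split_indices(n: int, train_ratio: float = 0.7, seed: int = 0) -> Tuple[List[int], List[int]]:
    """S501: randomly split n sample indices into training and test sets."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = round(n * train_ratio)
    return indices[:cut], indices[cut:]


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """S502: precision and recall for the abnormal class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def fuse_by_average(model_probs: Sequence[Sequence[float]], threshold: float = 0.5) -> List[int]:
    """S505 (assumed rule): average the models' abnormal probabilities, then threshold."""
    n_models = len(model_probs)
    fused = [sum(column) / n_models for column in zip(*model_probs)]
    return [1 if p >= threshold else 0 for p in fused]
```

Here `model_probs` holds one list per model, each giving that model's predicted probability of "abnormal" for every sample in the same order.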
Specifically, in this embodiment, the specific process of constructing the weighted residual deep forest model in step S503 is as follows:
Let the electricity charge data set be $S = \{N_1, N_2, \ldots, N_m\}$ and the categories be $L = \{L_1, L_2\}$, where $L_1$ represents abnormal electricity charge data and $L_2$ represents non-abnormal electricity charge data. The prediction probability matrix of the weighted residual deep forest is expressed as follows:
$$T = \begin{bmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \\ \vdots & \vdots \\ T_{m1} & T_{m2} \end{bmatrix}$$
where $T_{ij}$ represents the predicted probability that the i-th electricity charge record belongs to the j-th class. The column index j of the maximum value in each row of the prediction probability matrix of the weighted residual deep forest is taken as the final predicted category of that record; the position of that value in the prediction probability matrix is marked as 1 and the remaining values are marked as 0, as follows:
$$T_{ij} = \begin{cases} 1, & j = \arg\max_{k} T_{ik} \\ 0, & \text{otherwise} \end{cases}$$
The accuracy of the weighted residual deep forest is then calculated according to the following formula:
$$P = \frac{\left| A[i][j] \cap T[i][j] \right|}{m}$$
where m represents the total number of electricity charge records, A[i][j] represents the distribution matrix of the actual categories of the electricity charge data, and T[i][j] represents the prediction probability matrix of the weighted residual deep forest. The number of correctly predicted electricity charge records is obtained by taking the intersection, and its ratio to the total number m gives the final prediction accuracy of each weighted residual deep forest;
Assuming that the forests of the weighted residual deep forest are indexed $\{1, 2, \ldots, F\}$, a weight can be calculated from the accuracy of each forest. The weight is denoted $\eta$ and is expressed as follows:
[Equation image BDA0003675225280000061 not reproduced: expression for the weight $\eta$]
where $P_i$ represents the prediction accuracy of the i-th forest. The weighted prediction probability matrix of the i-th forest is then:
$$T^{(i)} = T \times \eta$$
The probabilities in each forest's weighted prediction probability matrix are used as the input to the next cascade forest layer; iteration stops when the maximum number of cascade forest layers is reached or the accuracy of the forest prediction no longer improves.
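The weighting procedure described above can be sketched in Python. The marking of the row maximum, the accuracy ratio and the scaling T × η follow the text. The weight formula itself is not available in the text, so `forest_weights` uses a simple assumed normalization (each forest's accuracy divided by the sum of all accuracies) purely for illustration. All function names are hypothetical; column 0 stands for L1 (abnormal) and column 1 for L2 (non-abnormal).

```python
from typing import List, Sequence

Matrix = List[List[float]]


def to_indicator(probs: Sequence[Sequence[float]]) -> List[List[int]]:
    """Mark the largest probability in each row as 1 and all other entries as 0."""
    indicator = []
    for row in probs:
        best = max(range(len(row)), key=lambda j: row[j])
        indicator.append([1 if j == best else 0 for j in range(len(row))])
    return indicator


def accuracy(actual: Sequence[Sequence[int]], probs: Sequence[Sequence[float]]) -> float:
    """Fraction of records whose predicted class matches the actual class."""
    predicted = to_indicator(probs)
    correct = sum(1 for a, p in zip(actual, predicted) if list(a) == p)
    return correct / len(actual)


def forest_weights(accuracies: Sequence[float]) -> List[float]:
    """ASSUMPTION: weight each forest by its share of the total accuracy."""
    total = sum(accuracies)
    return [p / total for p in accuracies]


def weight_matrix(probs: Sequence[Sequence[float]], eta: float) -> Matrix:
    """T^(i) = T x eta: scale every entry of one forest's probability matrix."""
    return [[value * eta for value in row] for row in probs]
```

With this assumed normalization the weights sum to 1, so a more accurate forest contributes proportionally more to the input of the next cascade layer.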
In this embodiment, as shown in fig. 5, to avoid gradient explosion or vanishing as the number of forest layers in the deep forest increases, a structure similar to a residual network is adopted to form the weighted residual deep forest. The specific process is as follows:
The electricity charge data features are input, and the feature values obtained from the multi-granularity scanning of the weighted deep forest are fed into a completely random forest and an extremely random forest. Because electricity charge anomaly detection is a binary classification problem, each random forest produces a two-class result. These results are stored and, together with the multi-granularity scanning result of the next forest layer, are input into every subsequent forest layer. Iteration stops when the maximum number of cascade forest layers is reached or the accuracy of the forest prediction no longer improves.
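The cascade and its stopping rule can be sketched as a small control loop. This is one interpretation of the description, not the patented implementation: each layer is modeled as a hypothetical callable that maps feature rows to class probabilities `[p_abnormal, p_normal]`, the outputs of all earlier layers are appended to the original features before each later layer, multi-granularity scanning is omitted, and the maximum number of layers is simply the length of the `layers` list.

```python
from typing import Callable, List, Sequence

Features = List[List[float]]
# A layer maps augmented feature rows to class probabilities [p_abnormal, p_normal].
Layer = Callable[[Features], List[List[float]]]


def run_cascade(features: Features, labels: Sequence[int], layers: Sequence[Layer]) -> int:
    """Run layers in order and stop when accuracy stops improving.

    Returns the number of layers kept.
    """
    best_accuracy = -1.0
    carried: List[List[float]] = [[] for _ in features]  # outputs of all earlier layers
    used = 0
    for layer in layers:
        # Residual-style input: original features plus every earlier layer's output.
        augmented = [row + extra for row, extra in zip(features, carried)]
        probs = layer(augmented)
        predicted = [1 if p[0] >= p[1] else 0 for p in probs]
        acc = sum(int(a == b) for a, b in zip(predicted, labels)) / len(labels)
        if acc <= best_accuracy:
            break  # accuracy no longer improves, so stop iterating
        best_accuracy = acc
        used += 1
        carried = [extra + p for extra, p in zip(carried, probs)]
    return used
```

In practice each layer would wrap trained forests and accuracy would be measured on held-out data; here the layers are left abstract so that only the data flow and the stopping rule are shown.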
Based on the same inventive concept, the invention also provides an electricity charge data abnormal intelligent detection system, which comprises:
the rule setting module is used for dynamically setting the rules for detecting abnormal electricity charge data;
the data acquisition module is used for deriving the original electric charge data and the abnormal electric charge data of all the electricity users from the database;
the data processing module is used for carrying out missing value processing, feature coding and feature selection processing on the electric charge original data;
the data mining module is used for mining hidden abnormal data from the processed electric charge raw data and detecting and acquiring abnormal types of the electric charge data;
the model construction module is used for establishing a machine learning algorithm model according to a data analysis result of data mining, training the algorithm model and dynamically adjusting model parameters in the model training process;
and the model prediction module is used for inputting original electricity charge data, once acquired, into the intelligent detection model obtained by model construction for prediction, and obtaining the final detection result for abnormal electricity charge data.
In addition, the invention also provides a storage medium and computer equipment. Wherein the storage medium has stored thereon a computer program which, when executed by the processor, implements the steps S1-S6 of the electricity fee data abnormality intelligent detection method of the present invention. The computer device comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, and when the processor executes the computer program, the intelligent detection method for the electricity charge data abnormity of the invention is realized, namely the intelligent detection method comprises the processes of the steps S1-S6.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (8)

1. An intelligent detection method for abnormal electricity charge data is characterized by comprising the following steps:
S1, rule setting: dynamically setting the rules for detecting abnormal electricity charge data;
S2, data acquisition: exporting the original electricity charge data and the abnormal electricity charge data of all electricity users from the database;
S3, data processing: performing missing value processing, feature encoding and feature selection on the original electricity charge data;
S4, data mining: mining hidden abnormal data from the processed original electricity charge data and detecting the anomaly type of the electricity charge data;
S5, model construction: building a machine learning algorithm model according to the data analysis results of data mining, training the algorithm model, and dynamically adjusting the model parameters during training;
and S6, model prediction: after original electricity charge data are acquired, inputting the data into the intelligent detection model obtained by model construction for prediction, to obtain the final detection result for abnormal electricity charge data.
2. The intelligent detection method for electricity charge data abnormity according to claim 1, characterized in that the specific process of data processing in step S3 is as follows:
S301, labeling the collected original electricity charge data and abnormal electricity charge data: a feature indicating whether the electricity charge data are abnormal is added, and all original electricity charge data are labeled, with abnormal electricity charge data labeled 1 and non-abnormal electricity charge data labeled 0;
S302, performing missing value processing on all labeled original electricity charge data: if more than 10 features in a row of electricity charge data have missing values, the row is deleted directly; all other missing values are filled with the value -1;
S303, performing feature coding on the text fields in the original electricity charge data, the feature coding modes being text coding and one-hot coding;
S304, ranking the features of the original electricity charge data by importance using the random forest, XGBOOST, and SVM machine learning algorithms, to obtain the importance ranking of all electricity charge data features;
S305, performing feature selection on all features of the electricity charge data: with reference to the feature importance ranking result, the several lowest-ranked features are deleted.
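By way of illustration only, steps S301 to S303 can be sketched as follows. This is a minimal sketch and not the implementation of the invention: the record layout (a list of dictionaries), the field names `user_id`, `tariff`, and `kwh`, and the helper names are all hypothetical; only the thresholds (more than 10 missing features, fill value -1, labels 1 and 0) come from the claim.

```python
MISSING_LIMIT = 10  # S302: a row with more than 10 missing features is dropped
FILL_VALUE = -1     # S302: every other missing value is filled with -1


def label_rows(rows, abnormal_ids):
    """S301: add a label feature, 1 for abnormal records and 0 otherwise."""
    return [{**row, "label": 1 if row["user_id"] in abnormal_ids else 0} for row in rows]


def handle_missing(rows):
    """S302: drop rows with too many gaps, fill the remaining gaps with -1."""
    kept = []
    for row in rows:
        n_missing = sum(value is None for value in row.values())
        if n_missing > MISSING_LIMIT:
            continue
        kept.append({key: FILL_VALUE if value is None else value for key, value in row.items()})
    return kept


def one_hot(rows, column):
    """S303: replace one text column with 0/1 indicator columns."""
    categories = sorted({str(row[column]) for row in rows})
    encoded = []
    for row in rows:
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}={category}"] = 1 if str(row[column]) == category else 0
        encoded.append(new_row)
    return encoded


# Hypothetical records: u3 has 11 missing features and is therefore dropped.
rows = [
    {"user_id": "u1", "tariff": "residential", "kwh": 120.0},
    {"user_id": "u2", "tariff": "industrial", "kwh": None},
    {"user_id": "u3", "tariff": None, "kwh": 95.5, **{f"f{i}": None for i in range(10)}},
]
prepared = one_hot(handle_missing(label_rows(rows, abnormal_ids={"u2"})), "tariff")
```

The importance ranking and feature deletion of steps S304 and S305 are not shown, since they depend on the trained random forest, XGBOOST, and SVM models.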
3. The intelligent detection method for electricity charge data abnormity according to claim 1, characterized in that the specific process of data mining in step S4 is as follows:
S401, scoring the abnormity degree of the electricity charge data samples with the isolation forest algorithm, sorting the samples by abnormity degree score, and taking the electricity charge data samples whose abnormity degree scores rank in the top 70% as high-confidence normal electricity charge data samples;
S402, determining the number of clusters for the majority-class samples of the original electricity charge data and clustering them: the original electricity charge data are clustered with the K-means algorithm, the optimal cluster number k of the majority-class samples is calculated, and the majority-class samples are grouped into k clusters;
S403, after the majority-class samples are grouped into k clusters, sampling each cluster, wherein the majority-class samples are undersampled with a random undersampling algorithm;
and S404, performing data mining on the finally sampled electricity charge data by combining the electricity charge data abnormity detection rules with a machine learning algorithm, mining hidden abnormal data, and detecting the abnormity type of the electricity charge data.
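The selection in step S401 and the per-cluster sampling in step S403 can be illustrated with a short sketch. It assumes that the abnormity scores and the cluster assignments have already been produced elsewhere (for example by an isolation forest and by K-means) and are passed in as plain lists. The function names are hypothetical, and the ordering convention (a lower score means a more normal sample, so the lowest-scoring 70% are kept) is an assumption, since the claim does not state the direction of the ranking.

```python
import random
from collections import defaultdict


def keep_reliable_normals(samples, anomaly_scores, keep_fraction=0.7):
    """S401 (assumed convention): keep the 70% of samples with the lowest scores."""
    order = sorted(range(len(samples)), key=lambda i: anomaly_scores[i])
    n_keep = round(len(samples) * keep_fraction)
    return [samples[i] for i in order[:n_keep]]


def undersample_per_cluster(samples, cluster_ids, per_cluster, seed=0):
    """S403: random undersampling of the majority-class samples, cluster by cluster."""
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for sample, cluster_id in zip(samples, cluster_ids):
        by_cluster[cluster_id].append(sample)
    picked = []
    for cluster_id in sorted(by_cluster):
        members = by_cluster[cluster_id]
        # Draw without replacement; small clusters are kept whole.
        picked.extend(rng.sample(members, min(per_cluster, len(members))))
    return picked
```

Sampling each cluster separately, rather than the majority class as a whole, keeps every cluster represented in the reduced data set.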
4. The intelligent detection method for electricity charge data abnormity according to claim 1, characterized in that the specific process of model construction in step S5 is as follows:
S501, dividing the electricity charge data into a training set and a test set by random sampling, in a ratio of 7:3 or 8:2;
S502, performing feature interaction processing on the electricity charge data: the random forest and XGBOOST algorithms are used to obtain training and prediction results for combined and derived features, and these results are compared with the training and prediction results on the original data without feature interaction; the comparison indexes are the recall and the precision of electricity charge data abnormity, wherein the recall reflects the extent to which abnormal electricity charge data are missed and the precision reflects the extent to which detections correctly hit abnormal electricity charge data; the feature interaction results are then combined to construct new multi-feature combined and derived features;
S503, constructing the intelligent detection model for electricity charge data abnormity: the electricity charge data are trained on and predicted by a constructed weighted residual deep forest model and by the decision tree, random forest, XGBOOST, and CATBOOST machine learning algorithms respectively, to obtain the trained intelligent detection model for electricity charge data abnormity;
S504, tuning the parameters of the intelligent detection model for electricity charge data abnormity, the parameter tuning method being greedy tuning, grid search tuning, or Bayesian tuning, to obtain the optimal parameters of the model;
and S505, performing algorithm fusion on the several tuned baseline algorithms, the baseline algorithms being the weighted residual deep forest, decision tree, random forest, XGBOOST, and CATBOOST algorithms; the fused machine learning model is the final intelligent detection model for electricity charge data abnormity.
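The random split of step S501 and the two comparison indexes of step S502 can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the label 1 is taken to mean abnormal as in claim 2, and recall and precision follow their usual definitions, TP / (TP + FN) and TP / (TP + FP).

```python
import random


def split_train_test(rows, train_ratio=0.7, seed=0):
    """S501: random split, e.g. train_ratio=0.7 for 7:3 or 0.8 for 8:2."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def recall_precision(y_true, y_pred, positive=1):
    """S502: recall (missed abnormal records) and precision (correct hits)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision
```

Computing both indexes for a model trained with and without the interaction features gives the comparison that step S502 describes.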
5. The intelligent detection method for electricity charge data abnormity according to claim 4, characterized in that the specific process of constructing the weighted residual deep forest model in step S503 is as follows:
let the electricity charge data set be S = {N_1, N_2, …, N_m} and the set of categories be L = {L_1, L_2}, where L_1 denotes abnormal electricity charge data and L_2 denotes non-abnormal electricity charge data; the prediction probability matrix of the weighted residual deep forest is expressed as follows:
Figure FDA0003675225270000021
wherein T_ij denotes the predicted probability that the i-th electricity charge record is assigned to the j-th category; for each row of the prediction probability matrix of the weighted residual deep forest, the column index j of the maximum value is taken as the final predicted category of that electricity charge record, the position of that maximum value in the prediction probability matrix is marked 1, and the remaining values are marked 0, as follows:
Figure FDA0003675225270000031
and the accuracy of the weighted residual deep forest is calculated according to the following formula:
Figure FDA0003675225270000032
wherein m denotes the total number of electricity charge records, A[i][j] denotes the distribution matrix of the actual categories of the electricity charge data, and T[i][j] denotes the prediction probability matrix of the weighted residual deep forest; the number of correctly predicted electricity charge records is obtained by taking the intersection of the two, and its ratio to the total number m of electricity charge records gives the final prediction accuracy of each weighted residual deep forest;
letting the weighted residual deep forest be F = {1, 2, …, F}, a weight η is calculated from the accuracy of each forest and is expressed as follows:
Figure FDA0003675225270000033
wherein P_i denotes the prediction accuracy of the i-th forest, and the weighted prediction probability matrix of the i-th forest is obtained as follows:
T^(i) = T × η
the probability results of the weighted prediction probability matrix of each forest are taken as the input of the next cascade forest layer, and iteration stops when the maximum number of cascade forest layers is reached or the accuracy of the forest prediction results no longer improves;
the weighted residual deep forest is formed using the structure of a residual network, specifically: the features of the electricity charge data are input, and the feature values after multi-granularity scanning of the weighted deep forest are input into a completely random forest and an extremely random forest; the two classification results generated by each random forest are stored and input, together with the multi-granularity scanning result of the next forest layer, into every subsequent forest layer, and iteration stops when the maximum number of cascade forest layers is reached.
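The matrix operations described in this claim can be illustrated with a small sketch. It is only a sketch under stated assumptions: the function names are hypothetical, matrices are plain nested lists with one row per electricity charge record and one column per category, and the weight η is passed in as a given number because its formula is defined by the figure referenced in the claim and is not reproduced here.

```python
def mark_max_per_row(T):
    """Mark the position of each row's maximum probability as 1 and the rest as 0."""
    marked = []
    for row in T:
        j_max = max(range(len(row)), key=row.__getitem__)
        marked.append([1 if j == j_max else 0 for j in range(len(row))])
    return marked


def forest_accuracy(A, T):
    """Share of records whose marked prediction matches the actual category matrix A."""
    marked = mark_max_per_row(T)
    correct = sum(1 for a_row, t_row in zip(A, marked) if a_row == t_row)
    return correct / len(A)


def weighted_matrix(T, eta):
    """T^(i) = T x eta: scale one forest's probability matrix by its weight."""
    return [[p * eta for p in row] for row in T]


# Hypothetical example with m = 4 records and the two categories L_1, L_2.
T = [[0.9, 0.1], [0.4, 0.6], [0.7, 0.3], [0.2, 0.8]]
A = [[1, 0], [0, 1], [0, 1], [0, 1]]
P = forest_accuracy(A, T)  # three of four records are predicted correctly
```

In a cascade, the weighted matrix of each forest would then be concatenated with the scanned features and passed to the next layer, as the claim describes.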
6. An intelligent detection system for electricity charge data abnormity, comprising:
a rule setting module, used for dynamically setting the rules for electricity charge data abnormity detection;
a data acquisition module, used for exporting the original electricity charge data and the abnormal electricity charge data of all electricity users from a database;
a data processing module, used for performing missing value processing, feature coding, and feature selection on the original electricity charge data;
a data mining module, used for mining hidden abnormal data from the processed original electricity charge data and detecting the abnormity type of the electricity charge data;
a model construction module, used for building a machine learning algorithm model according to the data analysis result of the data mining, training the algorithm model, and dynamically adjusting the model parameters during training;
and a model prediction module, used for inputting original electricity charge data, once acquired, into the intelligent detection model for electricity charge data abnormity obtained by the model construction, to obtain the final electricity charge data abnormity detection result.
7. A storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the intelligent detection method for electricity charge data abnormity according to any one of claims 1 to 5.
8. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the intelligent detection method for electricity charge data abnormity according to any one of claims 1 to 5.
CN202210617862.3A 2022-06-01 2022-06-01 Intelligent detection method and system for electricity charge data abnormity, storage medium and computer equipment Pending CN115033591A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210617862.3A CN115033591A (en) 2022-06-01 2022-06-01 Intelligent detection method and system for electricity charge data abnormity, storage medium and computer equipment

Publications (1)

Publication Number Publication Date
CN115033591A true CN115033591A (en) 2022-09-09

Family

ID=83123415

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210617862.3A Pending CN115033591A (en) 2022-06-01 2022-06-01 Intelligent detection method and system for electricity charge data abnormity, storage medium and computer equipment

Country Status (1)

Country Link
CN (1) CN115033591A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115905319A (en) * 2022-11-16 2023-04-04 国网山东省电力公司营销服务中心(计量中心) Automatic identification method and system for abnormal electricity charges of massive users
CN115905319B (en) * 2022-11-16 2024-04-19 国网山东省电力公司营销服务中心(计量中心) Automatic identification method and system for abnormal electricity fees of massive users
CN116048912A (en) * 2022-12-20 2023-05-02 中科南京信息高铁研究院 Cloud server configuration anomaly identification method based on weak supervision learning

Similar Documents

Publication Publication Date Title
WO2022110557A1 (en) Method and device for diagnosing user-transformer relationship anomaly in transformer area
CA3088899C (en) Systems and methods for preparing data for use by machine learning algorithms
CN115033591A (en) Intelligent detection method and system for electricity charge data abnormity, storage medium and computer equipment
CN106095639A (en) A kind of cluster subhealth state method for early warning and system
CN107103332A (en) A kind of Method Using Relevance Vector Machine sorting technique towards large-scale dataset
CN110930198A (en) Electric energy substitution potential prediction method and system based on random forest, storage medium and computer equipment
CN103390154A (en) Face recognition method based on extraction of multiple evolution features
CN113792754A (en) Method for processing DGA (differential global alignment) online monitoring data of converter transformer by removing different elements and then repairing
CN109711707B (en) Comprehensive state evaluation method for ship power device
CN111507504A (en) Adaboost integrated learning power grid fault diagnosis system and method based on data resampling
CN112363896A (en) Log anomaly detection system
CN112464996A (en) Intelligent power grid intrusion detection method based on LSTM-XGboost
CN108830407B (en) Sensor distribution optimization method in structure health monitoring under multi-working condition
CN113112067A (en) Method for establishing TFRI weight calculation model
Dong et al. Research on academic early warning model based on improved SVM algorithm
CN115392582A (en) Crop yield prediction method based on incremental fuzzy rough set attribute reduction
CN112052952B (en) Monitoring parameter optimization selection method in diesel engine fault diagnosis based on genetic algorithm
CN115422821A (en) Data processing method and device for rock mass parameter prediction
CN111654853B (en) Data analysis method based on user information
CN115018161A (en) Intelligent rock burst prediction method based on African bald eagle optimization random forest model
CN112801367A (en) Fault prediction method based on ARMret model considering rare variables
CN113448840A (en) Software quality evaluation method based on predicted defect rate and fuzzy comprehensive evaluation model
CN116776134B (en) Photovoltaic output prediction method based on PCA-SFFS-BiGRU
CN116365519B (en) Power load prediction method, system, storage medium and equipment
CN112633622B (en) Smart power grid operation index screening method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination