CN117972595A - Method, system, device and medium for analyzing electric charge abnormality - Google Patents
- Publication number
- CN117972595A (application number CN202311556590.1A)
- Authority
- CN
- China
- Prior art keywords
- electric charge
- factor
- abnormal
- user
- rule
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/2433—Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
- G06N5/025—Extracting rules from data
Abstract
A method, system, device and medium for analyzing electricity charge abnormality, wherein the method comprises the following steps: step 1, importing electricity charge abnormality rules and training on them to obtain an electricity charge abnormality rule learning model; and step 2, inputting user data of a user to be analyzed into the learning model and outputting the electricity charge abnormality state of the user. The invention reduces manual feature selection, supports electricity charge abnormality detection that is more accurate, adaptive, and suited to complex electricity consumption environments, and makes the diagnosis of electricity charge abnormality deeper and more accurate.
Description
Technical Field
The invention relates to the field of power systems, and in particular to a method, system, device and medium for analyzing electricity charge abnormality.
Background
With the further liberalization of the electric power market, electricity charge settlement scenarios have become more complex. Accounting problem analysis that relies on traditional business experience can no longer meet the working requirements of intensive electricity charge accounting. When electricity charge policies are adjusted, business experts must be organized to sort out the problem scenarios that affect charge accounting and to analyze the problems one by one. Electricity charge settlement is not only an economic matter: abnormal electricity charges may also be caused by factors such as faults in metering equipment, changes in a user's electricity consumption pattern, and power line abnormalities. Extracting and analyzing electricity charge abnormalities can therefore help the electric energy marketing system promptly discover metering equipment faults, unreasonable use of metering equipment, and similar problems.
Existing electricity charge abnormality detection methods are often based on static rules and cannot accurately capture complex abnormality patterns. Detection is therefore inaccurate, producing excessive false positives and false negatives and making energy management difficult. To save labor cost and improve the efficiency of abnormality analysis, an intelligent abnormality analysis method is needed that mines the abnormality factors affecting accounting from data and assists business personnel in analyzing problems quickly, so that specific abnormal problem scenarios can be constructed.
As settlement scenarios grow more complex, a unified and standardized accounting rule base and rule factor base are lacking. Before the power marketing 2.0 systems of the different network provinces go online, each province adds a set of accounting rules based on its own existing accounting rules, so the redundancy of the accounting rule base keeps increasing, which hinders the automated screening of users with abnormal electricity charges.
In addition, the prior art lacks an accurate abnormality diagnosis method for complex accounting abnormality scenarios. In such scenarios, business personnel must carefully check the various parameters involved in charge calculation in order to locate and correct the problem. However, as intensive provincial-level accounting work deepens, the time limits for handling accounting abnormalities keep tightening. To reduce the difficulty of diagnosis and improve handling efficiency, accurate abnormality diagnosis is needed that quickly locates the charge calculation parameters causing the abnormality and helps business personnel resolve the problem in time.
The electric power marketing 2.0 system records various information about electricity users, such as user profile data, business change data, and price and charge data. These data contain many fields of different dimensions and can therefore reflect the causes and types of electricity charge abnormality from many aspects. However, no fully automated method yet exists for accurately and reasonably extracting from these data the abnormality factors used to determine electricity charge abnormality.
Existing methods for extracting electricity charge abnormality factors fall mainly into three types: rule-based methods, statistics-based methods, and machine-learning-based methods. Their main disadvantages are as follows:
(1) Rule-based methods extract terms according to part-of-speech and lexical rule templates written in advance by domain experts. The extraction quality therefore depends entirely on how the rules are formulated and on the quality of the templates, and strong knowledge of grammar and of the domain is required. The resulting model generalizes poorly, and conflicts and errors can occur when the rules are complex.
(2) Statistics-based methods extract terms using statistical features of words such as word frequency, TF-IDF, the chi-square test, the log-likelihood test, left and right entropy, and mutual information.
(3) Machine-learning-based methods convert term extraction into a text classification or sequence labeling problem, train a model on a large labeled corpus, and use the trained model to predict on unlabeled corpus. These methods avoid the need to write complex rules and are highly general, but they can only learn from a large amount of labeled corpus.
In addition, when multiple electricity charge abnormality factors are extracted, how to determine the cause, type and degree of the abnormality from factors that may be interrelated, independent or contradictory remains an open problem. Existing methods for calculating the weights of abnormality factors fall mainly into two types: weighting based on the analytic hierarchy process and weighting based on the entropy weight method. Their main disadvantages are as follows:
(1) Weighting based on the analytic hierarchy process establishes a hierarchical structure model, constructs pairwise comparison matrices, and determines the weights through expert judgment. Human subjectivity weighs heavily: the method depends predominantly on experts, and different experts' judgments can lead to different results. Because little quantitative data is used, the results are not very convincing; and the comparison, judgment and result calculation are coarse, so the method is unsuitable for problems that require higher precision.
In a complex electricity consumption environment with complex data relationships, the analytic hierarchy process is easily influenced by individual subjective opinion, knowledge structure, job experience, personal qualities and the like. The result is subjective and one-sided, lacks fairness and scientific rigor, and leads to unreasonable weight distribution.
(2) Weighting based on the entropy weight method requires a large amount of high-quality data. Particularly when there are many factors, enough data is needed to calculate the entropy values; otherwise the weights may be inaccurate. Although the entropy weight method generally does not require subjective weight settings, human intervention is still needed to determine the reference values of the data, which may introduce some subjectivity. Some non-quantifiable factors are difficult to weight with an objective method. Finally, the entropy weight method generally assumes that the factors are independent of each other, so it may not reflect the actual situation well when factors are correlated.
In view of the foregoing, there is a need for a method, system, apparatus, and medium for analyzing an electric charge abnormality.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a method, system, device and medium for analyzing electricity charge abnormality. The method extracts judgment fields based on a BERT-BiLSTM-CRF model and, using a random forest model, classifies factors on the basis of adjusted factor weights, thereby constructing an electricity charge abnormality rule factor library.
The invention adopts the following technical scheme.
The first aspect of the invention relates to a method for analyzing electricity charge abnormality, comprising the following steps: step 1, importing electricity charge abnormality rules and training on them to obtain an electricity charge abnormality rule learning model; and step 2, inputting user data of a user to be analyzed into the learning model and outputting the electricity charge abnormality state of the user.
Preferably, the electricity charge abnormality rules include mandatory abnormality rules and abnormality prompt rules; the mandatory abnormality rules include registration abnormality rules, fee-measuring abnormality rules and archive abnormality rules.
Preferably, obtaining the electricity charge abnormality rule learning model further includes: training on the electricity charge abnormality rule corpus with the BERT pre-trained language model to obtain vector representations of the characters in the rule text; and constructing the learning model with BiLSTM-CRF, training it on the vector representations, outputting a predicted tag sequence, and identifying the judgment fields in the electricity charge abnormality rules based on the predicted tag sequence.
Preferably, the judgment fields are divided into judgment types according to the types of the electricity charge abnormality rules; the judgment types include archive abnormality, registration abnormality and fee-measuring abnormality in the mandatory class, as well as the prompt class.
Preferably, user data is acquired from a data interface of the marketing system, and judgment factors are extracted from the user data using the judgment fields; a judgment factor set is constructed for all users from the judgment factors, the users are classified with a random forest model, and the factor types corresponding to the different abnormality rules are generated from the classification results of the random forest model.
Preferably, classifying the users with the random forest model further includes: randomly sampling from all users to construct a decision user set; randomly drawing one or more judgment factors from the judgment factor set of the decision user set and splitting the users in the decision user set using the drawn judgment factors as features until a decision tree is obtained; and repeating this process iteratively to obtain K decision trees, which form the random forest model within the electricity charge abnormality rule learning model.
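By way of non-limiting illustration, the tree-building procedure described above can be sketched in plain Python. The Gini impurity criterion, the depth limit, majority voting, and all function names, factor values and labels below are assumptions made for the example; the source does not specify them.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (assumed split criterion)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    """Most common label in the list."""
    return Counter(labels).most_common(1)[0][0]

def build_tree(rows, labels, n_factors, rng, depth=0, max_depth=5):
    """Grow one tree; every split looks only at a random subset of factors."""
    if len(set(labels)) == 1 or depth == max_depth:
        return majority(labels)  # leaf node
    best = None
    for f in rng.sample(range(len(rows[0])), n_factors):
        for t in sorted(set(r[f] for r in rows)):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return majority(labels)  # no usable split on the drawn factors
    _, f, t, left, right = best

    def grow(idx):
        return build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                          n_factors, rng, depth + 1, max_depth)

    return (f, t, grow(left), grow(right))

def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are 4-tuples."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def build_forest(rows, labels, k, n_factors, seed=0):
    """Build K trees, each on a random sample (with replacement) of the users."""
    rng = random.Random(seed)
    forest = []
    for _ in range(k):
        idx = [rng.randrange(len(rows)) for _ in rows]  # decision user set
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], n_factors, rng))
    return forest

def predict_forest(forest, row):
    """Classify one user by majority vote over all trees."""
    return majority([predict_tree(tree, row) for tree in forest])

# Hypothetical users with two numeric judgment factors and a factor-type label.
rows = [(i, i) for i in range(10)] + [(100 + i, 100 + i) for i in range(10)]
labels = ["archive"] * 10 + ["metering"] * 10
forest = build_forest(rows, labels, k=15, n_factors=1, seed=0)
print(predict_forest(forest, (0, 0)), predict_forest(forest, (109, 109)))
```

Each tree sees a different user sample and a different factor subset at each split, which is what distinguishes the forest from K copies of one tree.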
Preferably, factor type labels are preset for some of the users; the classification results output by the random forest model for these users are compared with the factor type labels, and the random forest model is adjusted iteratively based on the difference.
Preferably, randomly drawing one or more judgment factors and splitting the users in the decision user set further includes: predefining the factor weights of the judgment factors from frequency statistics of the electricity charge abnormality rule factors; and evaluating the importance of each judgment factor with the random forest model and, if the importance of a judgment factor does not match its factor weight, adjusting the random forest model until they match.
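As a minimal sketch of the frequency-based weight definition, the example below counts how often each factor appears across the rules and normalizes the counts. The source does not define what "matching" means, so the absolute tolerance used in `weights_match` is purely an assumption, as are the function names and the factor names in the sample data.

```python
from collections import Counter

def frequency_weights(rule_factors):
    """Weight of each factor = its share of all factor occurrences in the rules."""
    counts = Counter(f for factors in rule_factors for f in factors)
    total = sum(counts.values())
    return {f: c / total for f, c in counts.items()}

def weights_match(importance, weights, tolerance=0.1):
    """Assumed matching test: every importance lies within `tolerance` of its weight."""
    return all(abs(importance.get(f, 0.0) - w) <= tolerance
               for f, w in weights.items())

# Hypothetical rules, each listed with the factors it refers to.
rules = [
    ["voltage_level", "contract_capacity"],
    ["voltage_level"],
    ["metering_mode", "voltage_level"],
]
weights = frequency_weights(rules)
print(weights)
```

In this sketch a failed `weights_match` would be the signal to adjust the forest (for example, its number of trees or factors per split) and evaluate again.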
Preferably, evaluating the importance of each judgment factor with the random forest model further includes: inputting the users in a test set into the random forest model for classification to obtain a first error rate of the random forest model; adding noise to the current judgment factor of each user in the test set and inputting the result into the random forest model for classification to obtain a second error rate of the random forest model; and calculating the degree of deviation between the first error rate and the second error rate for the current judgment factor, calculating the sum of the degrees of deviation of all judgment factors in the current decision tree, and assigning the importance of each judgment factor based on the sum of the degrees of deviation.
Preferably, assigning the importance of each judgment factor based on the degree of deviation further includes: the importance of each judgment factor is positively correlated with the sum of the degrees of deviation.
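The noise-based importance evaluation can be illustrated with the following sketch. It assumes Gaussian noise, takes the deviation as the increase in error rate, and normalizes each deviation by the sum over all factors; these choices, and the stand-in `predict` function used in place of a trained random forest, are assumptions for illustration only.

```python
import random

def error_rate(predict, rows, labels):
    """Fraction of test users whose predicted label differs from the true label."""
    wrong = sum(1 for r, y in zip(rows, labels) if predict(r) != y)
    return wrong / len(rows)

def noise_importance(predict, rows, labels, noise_scale=1.0, seed=0):
    """Importance of each factor from the error increase after perturbing it."""
    rng = random.Random(seed)
    first = error_rate(predict, rows, labels)  # first error rate (clean data)
    deviations = []
    for f in range(len(rows[0])):
        # Add Gaussian noise to factor f only; all other factors stay unchanged.
        noisy = [tuple(v + rng.gauss(0.0, noise_scale) if i == f else v
                       for i, v in enumerate(r)) for r in rows]
        second = error_rate(predict, noisy, labels)  # second error rate
        deviations.append(max(second - first, 0.0))
    total = sum(deviations)
    # Importance grows with the deviation; normalizing by the sum is an assumption.
    return [d / total if total > 0 else 0.0 for d in deviations]

# Stand-in classifier that depends only on factor 0 (hypothetical).
def predict(row):
    return "abnormal" if row[0] > 0.5 else "normal"

rows = [(float(i % 2), float(i)) for i in range(40)]
labels = [predict(r) for r in rows]
importance = noise_importance(predict, rows, labels, noise_scale=5.0)
print(importance)
```

Because the stand-in classifier ignores factor 1, perturbing that factor leaves the error rate unchanged and its importance is zero.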
Preferably, the user data includes user profile data, business change data, and price and charge data. The user profile data includes at least the user's electricity consumption type, market attribute, voltage level, metering mode, operating capacity, contract capacity, pricing strategy type, power factor assessment mode, basic electricity charge calculation mode, electric quantity, electric quantity calculation mode, mode of participation in power factor calculation, temporary electricity use flag, industry category, electricity consumption category, and time-of-use flag. The business change data includes at least the user's new installation and capacity increase, suspension, capacity reduction, category change, metering equipment fault handling, voltage change, metering equipment replacement, resumption after suspension, restoration after capacity reduction, and modification of power receiving facilities. The price and charge data includes at least the user's electricity price type, transmission and distribution price, electricity price, additional power receiving price, previous meter reading, current meter reading, demand reading, total reactive power, basic electricity charge, electricity price, and power factor adjustment electricity charge.
Preferably, the adjusted random forest model is used to construct an electricity charge abnormality rule factor library in which each entry consists of four elements (judgment field, judgment type, factor type and factor weight), and the factor library is stored.
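One possible in-memory and on-disk shape for such a four-element factor library is sketched below. The class name, the JSON storage format and the sample values are assumptions; the source only names the four elements.

```python
import json
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class RuleFactor:
    """One factor library entry with the four elements named in the text."""
    judgment_field: str
    judgment_type: str
    factor_type: str
    factor_weight: float

def dump_library(factors):
    """Serialize the factor library to a JSON string (assumed storage format)."""
    return json.dumps([asdict(f) for f in factors], ensure_ascii=False, indent=2)

def load_library(text):
    """Rebuild the factor library from its JSON representation."""
    return [RuleFactor(**item) for item in json.loads(text)]

# Hypothetical entries; the field names and weights are illustrative only.
library = [
    RuleFactor("contract_capacity", "archive abnormality", "capacity", 0.6),
    RuleFactor("current_meter_reading", "registration abnormality", "reading", 0.4),
]
restored = load_library(dump_library(library))
print(restored[0])
```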
The second aspect of the invention relates to an electricity charge abnormality analysis system that uses the method of the first aspect of the invention. The system comprises a training module and an output module; the training module is used for importing the electricity charge abnormality rules and training on them to obtain the electricity charge abnormality rule learning model; and the output module is used for inputting the user data of the user to be analyzed into the learning model and outputting the electricity charge abnormality state of the user.
A third aspect of the invention relates to a terminal comprising a processor and a storage medium; the storage medium is used for storing instructions; and the processor operates according to the instructions to perform the steps of the method of the first aspect of the invention.
The fourth aspect of the invention relates to a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the steps of the method of the first aspect of the invention.
Compared with the prior art, the method, system, device and medium for analyzing electricity charge abnormality have the following beneficial effects: the method extracts judgment fields based on the BERT-BiLSTM-CRF model and, using a random forest model, classifies factors on the basis of adjusted factor weights, thereby constructing an electricity charge abnormality rule factor library. The invention reduces manual feature selection, supports electricity charge abnormality detection that is more accurate, adaptive, and suited to complex electricity consumption environments, and makes the diagnosis of electricity charge abnormality deeper and more accurate.
The beneficial effects of the invention also include:
1. The electricity charge abnormality detection method operates on dynamic user data and accurately captures complex abnormality patterns, which ensures accurate detection, reduces false alarms and missed reports, and makes energy management easier. In addition, the method establishes an intelligent abnormality analysis approach that mines the abnormality factors affecting accounting from data, assists business personnel in analyzing problems quickly, and constructs specific abnormal problem scenarios, saving labor cost and improving the efficiency of abnormality analysis.
2. The method can be reused automatically and effectively and is compatible with different scenarios. Even though different regions adopt different marketing 2.0 systems and different accounting rules, it does not increase accounting redundancy or workload, and it can fuse and select among different accounting rules in a personalized, customized manner.
3. The method does not rely on business personnel carefully checking the various parameters involved in charge calculation. It ensures accurate abnormality diagnosis, improves handling efficiency, quickly locates the charge calculation parameters causing an abnormality, and helps business personnel resolve problems in time, which supports the smooth progress of intensive provincial-level accounting work. Intensive provincial-level accounting refers to a transition in the business model for electricity charge settlement of the power grid company, in which electricity charge calculation and issuance were originally distributed among the municipal power supply companies. The invention provides a more accurate abnormality diagnosis method that meets the requirements of provincial-level accounting and reduces the burden that ineffective diagnosis places on staff.
4. The method generalizes well and is highly general; complex rules do not cause conflicts or errors; interference from low-frequency and high-frequency words in the knowledge construction model is fully eliminated; a large labeled corpus is not required; and the method is not easily affected by individual subjective opinion or knowledge structure, so it is objective and accurate. The factor library therefore improves word segmentation accuracy, reduces manual feature selection, and provides a factor importance basis for the subsequent diagnosis and analysis of electricity charge abnormality. According to analysis, the manual judgment workload is reduced by 50%, false alarms are reduced, the abnormality audit time is shortened, and the workload of manual auditors is reduced.
5. The decision tree algorithm does not destroy the inherent correlation, or lack of correlation, among the abnormality factors, and on that basis achieves accurate factor classification and weight extraction. The weight extraction process analyzes and evaluates the importance of each factor in abnormality detection, and the expected result is a deeper and more accurate diagnosis of electricity charge abnormality.
6. According to a large number of application tests, the abnormality audit accuracy is improved by 10%. The improved accuracy helps reduce the false alarm rate and ensures more accurate abnormality detection, so that manual auditing can concentrate on genuine abnormalities. The random forest model can detect abnormalities quickly, in real time or in batch processing, which reduces the tedious manual adjustment and auditing of traditional static rule methods and brings the abnormality audit time to within 1 hour.
Drawings
FIG. 1 is a schematic flow chart of an electric charge anomaly analysis method according to the present invention;
FIG. 2 is a schematic diagram of BERT-BiLSTM-CRF model in the electric charge anomaly analysis method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. The described embodiments are only some, but not all, embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments described herein, without creative effort and based on the spirit of the present invention, fall within the scope of the invention.
Fig. 1 is a schematic flow chart of an electric charge anomaly analysis method according to the present invention. As shown in fig. 1, the first aspect of the present invention relates to a method for analyzing electric charge abnormality, which includes step 1 and step 2.
And step 1, importing an electric charge abnormal rule, training the electric charge abnormal rule, and obtaining an electric charge abnormal rule learning model.
In the invention, in order to obtain an electric charge abnormal rule learning model, an electric charge abnormal rule factor judgment field extraction model based on BERT-BiLSTM-CRF is provided. The model obtains character vector representation of the electric charge abnormal rule through the BERT Chinese pre-training language model, then utilizes BiLSTM to combine with CRF to construct a deep learning model, and fully utilizes the context semantic information to identify the electric charge abnormal rule factor.
Specifically, the method reads an electric charge abnormal rule list table in an electric company marketing system to acquire electric charge abnormal rule data.
Preferably, the electric charge abnormality rule includes a forced abnormality rule and an abnormality prompt rule; the forced abnormal rules comprise a registration abnormal rule, a fee measuring abnormal rule and an archive abnormal rule.
In electric power marketing, improving the quality of electric charge accounting helps ensure that marketing is carried out on a sound and rational basis. During electric charge verification, a marketing information system verifies all electric charges against preset verification rules and judges accurately whether an abnormal situation exists.
In this process, the preset verification rules and similar content can be stored in advance in the electric power company marketing system as electric charge abnormal rules. To analyze the content of these rules, the method uses the BERT-BiLSTM-CRF based judging field extraction model for electric charge abnormal rule factors.
FIG. 2 is a schematic diagram of BERT-BiLSTM-CRF model in the electric charge anomaly analysis method of the present invention. As shown in fig. 2, preferably, the acquiring the electric charge anomaly rule learning model further includes: training the abnormal electric charge rule corpus based on the BERT pre-training language model to obtain vector representation of characters in the abnormal electric charge rule text; and constructing an electric charge anomaly rule learning model by adopting BiLSTM-CRF, training the vector representation, outputting a predictive tag sequence, and confirming a judging field in the electric charge anomaly rule based on the predictive tag sequence.
By training on the character-level features output by the BERT model, the method can mine semantic information between words in the abnormal electric charge rule text more deeply and capture information implicit in the context. The BERT model uses a bidirectional Transformer as its encoder to extract and train text features, so that each character can fuse information from both its left and right context. In this model, each character at the input layer is mapped to three vectors: a character vector, a position vector, and a text (segment) vector, which carry different levels of semantic information. The sum of the three vectors is passed through the Transformer encoding units to extract contextual features and capture word-level and sentence-level representations, finally yielding the vector representation corresponding to each input.
The BiLSTM model combines a forward LSTM and a backward LSTM: the forward and backward vector representations are computed separately and then concatenated to give the final BiLSTM vector representation. Acquiring text feature information in both directions captures bidirectional semantic dependencies better, provides more complete semantic co-occurrence information for model learning, and helps improve named entity recognition performance.
The CRF layer then processes the output of the BiLSTM layer, predicts the probability of each candidate label sequence, and outputs the predicted label sequence with the maximum probability, which makes the label output more accurate. The labeling type of each character is read from this sequence, and the entities in the sequence are extracted and classified, thereby extracting the judging fields in the electric charge abnormal rule factors.
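The decoding step of the CRF layer and the extraction of judging fields from the predicted label sequence can be sketched as follows. This is a minimal illustration, not the patented implementation: the BIO-style tag names (`O`, `B-FIELD`, `I-FIELD`), the toy emission and transition scores, and the sample rule text `KWH<0` are all assumptions made for the example. In the described model the emission scores would come from the BiLSTM layer and the transition scores would be learned by the CRF.

```python
from typing import Dict, List, Tuple

TAGS = ["O", "B-FIELD", "I-FIELD"]  # hypothetical tag set


def viterbi_decode(emissions: List[Dict[str, float]],
                   transitions: Dict[Tuple[str, str], float],
                   tags: List[str]) -> List[str]:
    """Return the tag sequence with the highest total score (Viterbi)."""
    # best[t][tag]: best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0))
            scores[cur] = best[-1][prev] + transitions.get((prev, cur), 0.0) + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def extract_fields(chars: List[str], labels: List[str]) -> List[str]:
    """Join characters labelled B-FIELD / I-FIELD into judging fields."""
    fields, current = [], []
    for ch, lab in zip(chars, labels):
        if lab == "B-FIELD":
            if current:
                fields.append("".join(current))
            current = [ch]
        elif lab == "I-FIELD" and current:
            current.append(ch)
        else:
            if current:
                fields.append("".join(current))
            current = []
    if current:
        fields.append("".join(current))
    return fields


def one_hot(tag: str) -> Dict[str, float]:
    """Toy emission scores that favour a single tag."""
    return {t: (2.0 if t == tag else 0.0) for t in TAGS}


emissions = [one_hot(t) for t in ["B-FIELD", "I-FIELD", "I-FIELD", "O", "O"]]
transitions = {("O", "I-FIELD"): -10.0}  # an I tag may not follow O
labels = viterbi_decode(emissions, transitions, TAGS)
fields = extract_fields(list("KWH<0"), labels)
```

With these toy scores the first three characters are tagged as one field, so `fields` contains the single judging field `KWH`.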
Preferably, the judgment field realizes the division of judgment types based on the types of the electric charge exception rules; the judging type of the judging field comprises a mandatory class and a prompt class, wherein the mandatory class comprises file abnormality, registration abnormality and metering abnormality.
The method classifies the extracted judging fields and derives the judging type attribute of each judging field from the rule type field of the abnormal rule it comes from: if the rule type is a forced abnormal rule, the judging type of the judging field is the mandatory class, and if the rule type is an abnormal prompt rule, the judging type is the prompt class. The subclass is determined in the same way from the abnormality classification of the rule corresponding to the judging field; the enumeration values of the rule abnormality classification are file abnormality, registration abnormality, and fee metering abnormality. In this way, the judging type of each judging field is obtained.
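The mapping from rule type to judging type described above can be sketched in a few lines. The class name `Rule`, the string codes (`"forced"`, `"prompt"`, `"archive"`, `"registration"`, `"fee_metering"`), and the function name are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rule:
    rule_type: str                # "forced" or "prompt" (assumed codes)
    anomaly_class: Optional[str]  # "archive", "registration", "fee_metering" or None


def judgment_type(rule: Rule) -> Tuple[str, Optional[str]]:
    """Derive (judging type, subclass) of a judging field from its source rule."""
    if rule.rule_type == "forced":
        # Mandatory class, refined by the rule's abnormality classification.
        return ("mandatory", rule.anomaly_class)
    if rule.rule_type == "prompt":
        return ("prompt", None)
    raise ValueError(f"unknown rule type: {rule.rule_type!r}")


forced_example = judgment_type(Rule("forced", "registration"))
prompt_example = judgment_type(Rule("prompt", None))
```

Keeping the subclass as a separate element makes it possible to filter judging fields either by class (mandatory or prompt) or by the specific abnormality classification.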
The content of the judging field corresponds to the abnormal rule factor of the electric charge in the user data, so that the judging result is obtained according to the value of the actual factor of each user.
Preferably, user data is acquired from a data interface of the marketing system, and a judgment factor is extracted from the user data by utilizing a judgment field; based on the judgment factors, constructing judgment factor sets of all users, classifying the users by adopting a random forest model, and generating factor types corresponding to different abnormal rules according to classification results of the random forest model.
First, a sample data set is acquired as the input for model training. The sample data consists of user data acquired from a data interface of the marketing system, including basic user information, user profile information, user registration information, user electricity fee information, and the like.
The user profile data at least comprises a power consumption type, a marketization attribute, a voltage level, a metering mode, an operation capacity, a contract capacity, a pricing strategy type, a power factor checking mode, a basic electric charge calculation mode, a power quantity, an electric quantity calculation mode, a participation power factor calculation mode, a temporary power consumption sign, an industry category, a power consumption category and a time-sharing power consumption sign of a user; the service change data at least comprises new capacity increasing, suspending, capacity reducing, class changing, metering equipment fault processing, pressure changing, metering equipment changing, suspending and recovering, capacity reducing and recovering and power receiving facility modifying of a user; the price data at least comprises the price type, the power transmission and distribution price, the electricity price, the additional power receiving price, the last meter reading indication, the current meter reading indication, the required quantity indication, the total reactive power, the basic electricity charge, the electricity price and the power adjustment electricity charge of the user.
The decision factor set of all users can be constructed according to the values of the decision factors of each user under different decision fields. Each user in the set may be a multidimensional vector, the number of dimensions being the number of factors, and the value of each dimension on the vector being the value of the factor.
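As a rough sketch of this construction, each user record can be turned into a vector whose dimensions follow a fixed order of judging fields. The field names and sample values below are invented for illustration, and a missing field is kept as `None` so that it can be handled in a later preprocessing step.

```python
from typing import Dict, List, Optional

Vector = List[Optional[float]]


def build_factor_set(users: Dict[str, Dict[str, float]],
                     fields: List[str]) -> Dict[str, Vector]:
    """Map each user id to a vector with one dimension per judging field."""
    return {uid: [record.get(name) for name in fields]
            for uid, record in users.items()}


fields = ["voltage_level", "contract_capacity", "operation_capacity"]  # assumed names
users = {
    "u1": {"voltage_level": 10.0, "contract_capacity": 315.0, "operation_capacity": 315.0},
    "u2": {"voltage_level": 0.4, "contract_capacity": 50.0},  # one factor missing
}
factor_set = build_factor_set(users, fields)
```

Fixing the order of `fields` once ensures that the same dimension always refers to the same factor across users.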
After the decision factor set is built, a random forest model is used to classify the users. The random forest model acts as a classifier: each user obtains multiple classification results across the trees, and the set of these results can be mapped to the factor type of a given feature factor of that user. For example, if k of the K trees split on a factor m, then at least k classification result labels exist for factor m, and these labels together define the factor type of the current factor of the current user. In one embodiment, the first-level classes of factor types are archive factors, registration factors, and fee metering factors. At the second or lower levels, classification can also be made down to a specific class of rules or a single rule. Finally, the random forest model yields the most specific abnormality cause of the current user, and this cause necessarily corresponds to a rule.
Therefore, when the sample data set is obtained, the segment of users to be included in modeling must be screened, and abnormal users such as test users and sales users are excluded. For some factors, data type conversion is then performed: string-type variables are converted into numerical types that a computer can process, for example by converting the string enumeration values in the user category field into numerical enumeration values. In addition, because the data of some users is incomplete, missing values must be preprocessed, mainly by filling with the mean or with 0.
Specifically, the missing value processing flow counts the missing rate of each feature. Features whose missing rate is greater than a specified threshold are deleted; features whose missing rate is not greater than the threshold are filled according to data type. If the missing value is numerical, it is filled with the mean of that attribute over all other objects; if it is non-numerical, it is filled, following the statistical mode principle, with the value of that attribute that occurs most frequently among all other objects.
Factors with different value ranges are standardized. Numerical features are standardized by scaling each feature with its mean and standard deviation, so that after scaling the mean is 0 and the variance is 1. Standardization is a transformation, such as data scaling, performed to facilitate subsequent processing; the standardized values fluctuate around 0, where a value greater than 0 indicates an above-average level and a value less than 0 indicates a below-average level.
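The missing value flow just described can be sketched as below. The threshold of 0.5, the column names, and the sample values are assumptions for the example; the source only speaks of a specified threshold.

```python
from collections import Counter
from statistics import mean
from typing import Dict, List, Union

Value = Union[float, str, None]


def handle_missing(columns: Dict[str, List[Value]],
                   max_missing_rate: float = 0.5) -> Dict[str, List[Value]]:
    """Drop features that are mostly missing; fill the rest by mean or mode."""
    cleaned: Dict[str, List[Value]] = {}
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        missing_rate = 1 - len(present) / len(values)
        if missing_rate > max_missing_rate or not present:
            continue  # delete the feature
        if all(isinstance(v, (int, float)) for v in present):
            fill: Value = mean(present)  # numerical: mean of the known values
        else:
            fill = Counter(present).most_common(1)[0][0]  # non-numerical: mode
        cleaned[name] = [fill if v is None else v for v in values]
    return cleaned


raw = {
    "contract_capacity": [100.0, None, 300.0, 200.0],
    "industry_category": ["steel", "retail", "steel", None],
    "rarely_filled": [None, None, None, 1.0],
}
cleaned = handle_missing(raw)
```

Here `rarely_filled` has a missing rate of 0.75 and is deleted, while the other two features are filled with the mean and the mode respectively.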
The standardization flow converts the original data in a certain proportion through a mathematical transformation so that it falls into a small specific interval, for example [0, 1] or [-1, 1]. This eliminates differences in nature, dimension, and order of magnitude between variables and converts them into dimensionless relative values, so that all indexes are on the same numerical scale and indexes with different units or orders of magnitude can be analyzed and compared together. The method uses Z-score standardization, also called standard deviation standardization: the mean and standard deviation of the index are calculated first, and then the mean is subtracted from each actual value of the variable and the result is divided by the standard deviation, namely: $z = (x - \mu) / \sigma$, where $x$ is the actual value, $\mu$ is the mean of the index, and $\sigma$ is its standard deviation.
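Z-score standardization, in which the mean is subtracted from each value and the result is divided by the standard deviation, can be sketched as follows. The use of the population standard deviation and the handling of a constant feature are assumptions of this sketch; the source does not specify either.

```python
from statistics import mean, pstdev
from typing import List


def z_score(values: List[float]) -> List[float]:
    """Standardize one numerical feature to mean 0 and variance 1."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (assumed)
    if sigma == 0:
        return [0.0] * len(values)  # constant feature: nothing to scale
    return [(v - mu) / sigma for v in values]


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # mean 5, standard deviation 2
standardized = z_score(sample)
```

A standardized value above 0 marks a user above the average level for that factor, and a value below 0 marks a user below it.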
When the random forest model is built, the initially constructed model may not be accurate enough, so multiple rounds of training are needed to obtain a tuned model. The method divides the preprocessed user data into a training set and a test set. In the foregoing embodiment, the model training set and the test set are divided in a 4:1 ratio.
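A 4:1 division of the preprocessed users can be sketched as below. Shuffling before the cut and the fixed seed are assumptions of this sketch, added so that the split is random but reproducible.

```python
import random
from typing import List, Sequence, Tuple


def split_train_test(users: Sequence[str], train_parts: int = 4,
                     test_parts: int = 1, seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle the users and cut them into training and test sets."""
    shuffled = list(users)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:cut], shuffled[cut:]


all_users = [f"user{i}" for i in range(10)]  # hypothetical user ids
train_users, test_users = split_train_test(all_users)
```

With ten users this gives eight training users and two test users, and no user appears in both sets.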
Preferably, classifying the users by using a random forest model, further comprising: randomly sampling from all users to construct a decision user set; randomly extracting one or more decision factors M1 from the decision factor set of the decision user set, and splitting the users in the decision user set by taking the randomly extracted decision factors as characteristics until a decision tree is obtained; and repeatedly obtaining K decision trees in an iterative mode to form a random forest model in the electric charge abnormal rule learning model.
For the divided training set, the invention uses the Bootstrap idea to sample the original data set randomly with replacement, with a sample size of 2N/3 for each draw. For each sampled set, M1 features (M1 < M) are randomly extracted as the input for training a decision tree, and the decision tree is constructed. The sample sampling, feature sampling, and decision tree construction steps are repeated K times to generate K decision trees, which form the random forest model.
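The sampling part of this procedure can be sketched as follows: for each of the K trees, 2N/3 user rows are drawn with replacement and M1 of the M factors are drawn without replacement. The function name and the returned index format are hypothetical, and the tree construction itself is left out; a tree learner would be trained on each (rows, factors) pair.

```python
import random
from typing import List, Tuple


def plan_forest(n_users: int, n_factors: int, m1: int, k: int,
                seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return, for each of K trees, the sampled user rows and factor columns."""
    if not 0 < m1 < n_factors:
        raise ValueError("M1 must satisfy 0 < M1 < M")
    rng = random.Random(seed)
    sample_size = 2 * n_users // 3
    plans = []
    for _ in range(k):
        # Bootstrap: the same user may be drawn more than once.
        rows = [rng.randrange(n_users) for _ in range(sample_size)]
        # Feature subset: M1 distinct factors out of M.
        cols = sorted(rng.sample(range(n_factors), m1))
        plans.append((rows, cols))
    return plans


plans = plan_forest(n_users=9, n_factors=5, m1=2, k=3)
```

Because each tree sees a different sample of users and a different subset of factors, the trees are decorrelated, which is what lets their combined vote generalize better than a single tree.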
Preferably, factor type labels are preset for part of all users; and comparing the difference between the output classification result of the random forest model to the partial users and the factor type labels, and iteratively adjusting the random forest model based on the difference.
In the present invention, these users come from the test set, so they carry factor type labels. The model is tested on the divided test set: the output of the classifier is compared with the true labels, that is, the factor type labels, to evaluate the performance of the classifier, and the parameters of the classifier are adjusted according to the result to improve classification accuracy.
Preferably, the method includes randomly extracting one or more decision factors, splitting the users in the user set, and further includes: the factor weight of the judgment factor is predefined by utilizing the frequency statistics result of the abnormal rule factors of the electric charge; and evaluating the importance of each judgment factor by adopting a random forest model, and if the importance of the judgment factor is not matched with the factor weight, adjusting the random forest model until the importance of the judgment factor is matched with the factor weight.
Specifically, in the step of acquiring the user data, the method may synchronously count the frequency of occurrence of the electric charge abnormality rule factor in the electric charge abnormality rule base. For example, it is readily conceivable that if the factor is of a certain pricing policy type, the anomaly frequency may be higher, and the anomaly frequency is lower for another type. If the factor characterizes the user in a state of capacity increase, pause, capacity reduction, the anomaly frequency may be higher, otherwise the anomaly frequency may be lower, and so on.
Therefore, the method can obtain the frequency of each electric charge abnormal rule factor in advance: the higher the frequency, the more likely the factor is to trigger an abnormal rule. The method combines these rule factor frequency characteristics to optimize the weight coefficients. It uses the random forest to evaluate feature importance and quantifies the contribution of each feature to the classification performance of the K constructed decision trees.
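One simple way to turn frequency statistics into predefined factor weights is to use the relative frequency of each factor across the rule base, as sketched below. The normalization to relative frequency is an assumption of this sketch, and the factor names and rule contents are invented.

```python
from collections import Counter
from typing import Dict, List


def factor_weights(rule_factors: List[List[str]]) -> Dict[str, float]:
    """Weight each factor by its share of all factor occurrences in the rules."""
    counts = Counter(f for factors in rule_factors for f in factors)
    total = sum(counts.values())
    return {factor: count / total for factor, count in counts.items()}


rule_base = [  # factors referenced by each abnormal rule (hypothetical)
    ["pricing_strategy_type", "contract_capacity"],
    ["pricing_strategy_type"],
    ["voltage_level", "pricing_strategy_type"],
]
weights = factor_weights(rule_base)
```

In this toy rule base `pricing_strategy_type` appears in three of five factor occurrences, so it receives the largest weight.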
Preferably, the method for evaluating the importance of each judgment factor by using a random forest model further comprises: inputting users in the test set into the random forest model for classification to obtain a first error rate of the random forest model; adding noise on the current judgment factors of each user in the test set, and inputting the noise into the random forest model for classification to obtain a second error rate of the random forest model; and calculating the deviation degree between the first error rate and the second error rate, which corresponds to the current judgment factors, calculating the sum of the deviation degrees of all the judgment factors in the current decision tree, and assigning a value to the importance of each judgment factor based on the sum of the deviation degrees.
For each decision tree $t_m$, multiple factors of multiple users in the test set are input. One way to do this is to construct a data matrix $X_{OOB}$ in which each row is one user and each column holds the same feature factor. After the matrix is input to the decision tree, the tree outputs the predicted result $Y_P$, and the mean square error between the predicted values $Y_P$ and the true values $Y$ given by the factor labels, $\varepsilon = \mathrm{MSE}(Y_P, Y)$, that is, the mean of $(Y_P - Y)^2$, is the first error rate.
On this basis, noise is added to the current decision factor $i$ and the error is calculated again to obtain a second error rate. The degree of deviation between the first error rate and the second error rate corresponding to the current decision factor is then calculated. Noise is added to each decision factor in turn and a second error rate is calculated each time, finally yielding a set of deviation degrees whose number equals the number of feature variables used for splitting in decision tree $t_m$.
Preferably, assigning the importance of each judgment factor based on the degree of deviation further comprises: the importance of each judgment factor is positively correlated with the sum of the deviation degrees.
Specifically, a given feature factor $X_i$ may take part in the classification of multiple trees. If the feature takes part in the splitting of T trees, it has one deviation degree on each of those T trees, and the importance of the judgment factor is finally obtained from the sum of these deviation degrees.
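The noise-based importance evaluation can be sketched as follows. Several details are assumptions of this sketch because the source does not fix them: the deviation degree is taken as the second error minus the first error, the noise is Gaussian, and each tree is represented by a hypothetical pair of a prediction function and the list of factor indices it splits on.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

# (prediction function, indices of the factors the tree splits on)
Tree = Tuple[Callable[[Sequence[float]], float], List[int]]


def mse(predict: Callable[[Sequence[float]], float],
        rows: Sequence[Sequence[float]], y: Sequence[float]) -> float:
    """Mean square error between a tree's predictions and the true labels."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)


def factor_importance(trees: List[Tree], rows: Sequence[Sequence[float]],
                      y: Sequence[float], noise_scale: float = 1.0,
                      seed: int = 0) -> Dict[int, float]:
    """Sum, over all trees using a factor, the error increase caused by noise."""
    rng = random.Random(seed)
    importance: Dict[int, float] = {}
    for predict, used in trees:
        first = mse(predict, rows, y)  # first error rate
        for i in used:
            noisy = [list(r) for r in rows]
            for r in noisy:
                r[i] += rng.gauss(0.0, noise_scale)  # perturb factor i only
            second = mse(predict, noisy, y)  # second error rate
            importance[i] = importance.get(i, 0.0) + (second - first)
    return importance


# Toy data: the label equals factor 0; factor 1 is constant and uninformative.
rows = [[float(i % 2), 5.0] for i in range(20)]
y = [r[0] for r in rows]
tree_a: Tree = (lambda r: 1.0 if r[0] > 0.5 else 0.0, [0])
tree_b: Tree = (lambda r: 1.0 if r[1] > 100.0 else 0.5, [1])
importance = factor_importance([tree_a, tree_b], rows, y)
```

In this toy setting, perturbing factor 1 never changes the predictions of `tree_b`, so its importance stays at zero, whereas perturbing factor 0 can only increase the error of `tree_a`.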
The difference between the importance of the judging factors and the frequency of the electric charge abnormal rule factors is analyzed in the mode, so that the splitting mode and the splitting result of the decision tree are iteratively improved until the optimal decision tree is obtained. Thus, the model is constructed.
Preferably, the adjusted random forest model is used for constructing an electric charge abnormal rule factor library in a form of four elements of a judgment field, a judgment type, a factor type and a factor weight, and the electric charge abnormal rule factor library is stored.
The acquired judgment fields, judgment types, factor types, and factor weights are stored as four-element entries, so that the electric charge abnormal rule factor library can be retrieved whenever it is needed.
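A four-element entry and a simple way to store and reload the factor library can be sketched as below. The class name, the JSON storage format, and the sample entry are assumptions made for illustration; the source does not specify how the library is persisted.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class FactorEntry:
    judgment_field: str
    judgment_type: str
    factor_type: str
    factor_weight: float


def dump_library(entries: List[FactorEntry]) -> str:
    """Serialize the factor library to a JSON string."""
    return json.dumps([asdict(e) for e in entries], ensure_ascii=False)


def load_library(text: str) -> List[FactorEntry]:
    """Rebuild the factor library from its JSON form."""
    return [FactorEntry(**item) for item in json.loads(text)]


library = [FactorEntry("contract_capacity", "mandatory", "archive", 0.2)]  # hypothetical
restored = load_library(dump_library(library))
```

Storing all four elements together keeps the judgment field, its class, its factor type, and its weight consistent when the library is reloaded for analysis.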
Step 2: input the user data of the user to be analyzed into the learning model, and output the electric charge abnormal state of the user.
The learning model constructed in the step 1 can be adopted to extract and classify the factors of the input user data of the user to be analyzed, so as to find out the abnormal reasons contained in the user data and the quadruple corresponding to the user, and output an abnormal analysis conclusion, namely the abnormal state of the electric charge.
The second aspect of the invention relates to an electric charge abnormity analysis system utilizing the method in the first aspect of the invention, wherein the system comprises a training module and an output module; the training module is used for importing the electric charge abnormal rule, training the electric charge abnormal rule and acquiring an electric charge abnormal rule learning model; and the output module is used for inputting the user data of the user to be analyzed into the learning model and outputting the abnormal state of the electric charge of the user.
A third aspect of the present invention relates to a terminal, comprising a processor and a storage medium; the storage medium is used for storing instructions; the processor is operative to perform the steps of the method of the first aspect of the invention in accordance with the instructions.
The fourth aspect of the present invention relates to a computer readable storage medium having stored thereon a computer program, characterized in that the program when executed by a processor implements the steps of the method of the first aspect of the present invention.
Finally, it should be noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the above embodiments, it should be understood by those skilled in the art that: modifications and equivalents may be made to the specific embodiments of the invention without departing from the spirit and scope of the invention, which is intended to be covered by the claims.
Claims (15)
1. An electric charge anomaly analysis method is characterized by comprising the following steps:
Step 1, importing an electric charge abnormal rule, training the electric charge abnormal rule, and obtaining an electric charge abnormal rule learning model;
Step 2, inputting user data of the user to be analyzed into the learning model, and outputting the abnormal state of the electric charge of the user.
2. The electric charge abnormality analysis method according to claim 1, characterized in that:
The electric charge abnormality rules comprise forced abnormality rules and abnormality prompt rules; wherein,
The forced abnormal rules comprise a registration abnormal rule, a fee measuring abnormal rule and an archive abnormal rule.
3. The electric charge abnormality analysis method according to claim 2, characterized in that:
The acquiring the electric charge abnormal rule learning model further comprises:
Training the abnormal electric charge rule corpus based on the BERT pre-training language model to obtain vector representation of characters in the abnormal electric charge rule text;
And constructing the electric charge abnormal rule learning model by adopting BiLSTM-CRF, training the vector representation, outputting a predictive tag sequence, and confirming a judging field in the electric charge abnormal rule based on the predictive tag sequence.
4. The electricity fee anomaly analysis method according to claim 3, wherein:
the judging field realizes the division of judging types based on the types of the electric charge abnormal rules;
the judging type of the judging field comprises a mandatory class and a prompt class, wherein the mandatory class comprises file abnormality, registration abnormality and metering abnormality.
5. The electric charge abnormality analysis method according to claim 4, characterized in that:
Acquiring user data from a data interface of a marketing system, and extracting a judging factor from the user data by utilizing the judging field;
and constructing a judging factor set of all users based on the judging factors, classifying the users by adopting a random forest model, and generating factor types corresponding to different abnormal rules according to the classification result of the random forest model.
6. The electric charge abnormality analysis method according to claim 5, characterized in that:
the method for classifying the users by adopting the random forest model further comprises the following steps:
randomly sampling from all users to construct a decision user set;
Randomly extracting one or more decision factors from the decision factor set of the decision user set, and splitting the users in the decision user set by taking the randomly extracted decision factors as characteristics until a decision tree is obtained;
And repeatedly obtaining K decision trees in an iterative mode to form a random forest model in the electric charge abnormal rule learning model.
7. The electric charge abnormality analysis method according to claim 6, characterized in that:
Presetting factor type labels for part of all users;
and comparing the difference between the output classification result of the random forest model to partial users and the factor type label, and iteratively adjusting the random forest model based on the difference.
8. The electric charge abnormality analysis method according to claim 7, characterized in that:
The random extraction of one or more decision factors, and splitting the users in the decision user set by taking the randomly extracted decision factors as characteristics, further comprises:
the factor weight of the judging factor is predefined by utilizing the frequency statistics result of the abnormal rule factors of the electric charge;
And evaluating the importance of each judgment factor by adopting a random forest model, and if the importance of the judgment factor is not matched with the factor weight, adjusting the random forest model until the importance of the judgment factor is matched with the factor weight.
9. The electric charge anomaly analysis method according to claim 8, wherein:
the method for evaluating the importance of each judgment factor by adopting the random forest model further comprises the following steps:
Inputting users in the test set into the random forest model for classification to obtain a first error rate of the random forest model;
adding noise to the current judgment factors of each user in the test set, and inputting the noise to the random forest model for classification to obtain a second error rate of the random forest model;
Calculating the deviation degree between the first error rate and the second error rate, which corresponds to the current judgment factors, calculating the sum of the deviation degrees of all the judgment factors in the current decision tree, and assigning a value to the importance of each judgment factor based on the sum of the deviation degrees.
10. The electric charge anomaly analysis method according to claim 8, wherein:
The assigning the importance of each judgment factor based on the deviation degree further comprises:
the importance of each of the decision factors is positively correlated with the sum of the degrees of deviation.
11. The electric charge anomaly analysis method according to claim 10, wherein:
The user data comprises user file data, service change data and price measuring data;
the user profile data at least comprises a power consumption type, a marketization attribute, a voltage level, a metering mode, an operation capacity, a contract capacity, a pricing strategy type, a power factor checking mode, a basic electric charge calculating mode, a power quantity, an electric quantity calculating mode, a participation power factor calculating mode, a temporary power consumption sign, an industry category, a power consumption category and a time-sharing power consumption sign of a user;
The service change data at least comprises new capacity increasing, pause, capacity reducing, class changing, metering equipment fault processing, pressure changing, metering equipment replacement, pause recovery, capacity reducing recovery and power receiving facility transformation of a user;
The price data at least comprises the price type, the power transmission and distribution price, the electricity degree price, the additional power receiving price, the last meter reading indication, the current meter reading indication, the demand indication, the total reactive power, the basic electricity charge, the electricity degree electricity charge and the power adjustment electricity charge of the user.
12. The electric charge abnormality analysis method according to any one of claims 6 to 11, characterized in that:
and constructing an electric charge abnormal rule factor library by the adjusted random forest model in a form of four elements of a judgment field, a judgment type, a factor type and a factor weight, and storing the electric charge abnormal rule factor library.
13. An electric charge abnormality analysis system using the method according to any one of claims 1 to 12, characterized in that:
The system comprises a training module and an output module;
the training module is used for importing an electric charge abnormal rule, training the electric charge abnormal rule and acquiring an electric charge abnormal rule learning model;
And the output module is used for inputting user data of the user to be analyzed into the learning model and outputting the abnormal state of the electric charge of the user.
14. A terminal comprising a processor and a storage medium; the method is characterized in that:
the storage medium is used for storing instructions;
The processor being operative according to the instructions to perform the steps of the method according to any one of claims 1-12.
15. Computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the steps of the method according to any of claims 1-12.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311556590.1A CN117972595A (en) | 2023-11-21 | 2023-11-21 | Method, system, device and medium for analyzing electric charge abnormality |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117972595A (en) | 2024-05-03 |
Family
ID=90856846
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311556590.1A Pending CN117972595A (en) | 2023-11-21 | 2023-11-21 | Method, system, device and medium for analyzing electric charge abnormality |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117972595A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||