CN116304948A - Unsupervised electricity consumption anomaly detection method integrating multi-scale fuzzy information particles - Google Patents

Info

Publication number
CN116304948A
Authority
CN
China
Prior art keywords
fuzzy
sample
attribute
scale
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310307665.6A
Other languages
Chinese (zh)
Inventor
袁钟
陈白杨
彭德中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University
Priority to CN202310307665.6A
Publication of CN116304948A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06: Energy or water supply
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an unsupervised electricity consumption anomaly detection method fusing multi-scale fuzzy information granules, belonging to the technical field of electric power data analysis. The method comprises the following steps: normalizing the electricity consumption data to obtain normalized data; calculating a fuzzy relation matrix, selecting attribute subsets and a scale combination, constructing a multi-scale fuzzy information granule set, and calculating the fuzzy approximation accuracy; calculating the fuzzy information granule anomaly factors from the fuzzy approximation accuracy; calculating the outlier degree of every sample from the fuzzy information granule anomaly factors; and judging, sample by sample, whether the outlier degree is greater than a threshold: if so, the sample is output as an anomalous point of the electricity consumption data, otherwise it is treated as normal data, until all samples have been judged. The invention addresses the extraction and fusion of multi-scale data features in existing electricity consumption anomaly detection, avoids the information loss caused by data format conversion, and effectively realizes unsupervised anomaly detection on hybrid (mixed numerical and categorical) electricity consumption data.

Description

Unsupervised electricity consumption anomaly detection method integrating multi-scale fuzzy information particles
Technical Field
The invention belongs to the technical field of electric power data analysis, and particularly relates to an unsupervised electricity utilization anomaly detection method integrating multi-scale fuzzy information particles.
Background
In the smart grid field, abnormal electricity consumption behavior can seriously affect grid operation and the economic benefits of power enterprises. Abnormal electricity consumption can also create power safety hazards and bring potential risks to grid systems and users. Abnormal electricity consumption detection therefore has important research significance and practical value for guaranteeing the safety of power supply, promoting the reliable operation of the grid, and protecting the economic benefits of power enterprises. Machine learning-based methods are now widely applied in power big data analysis and mining. However, when applied to anomaly detection on electricity consumption data, most such methods have three limitations: they handle numerical data well but require format conversion for categorical data, which easily loses important information; they are suited only to single-scale data analysis and have difficulty making comprehensive use of data features from different scales; and they require a large amount of manually labeled data. These limitations restrict further improvement in the performance of electricity consumption anomaly detection algorithms.
Disclosure of Invention
To address the above deficiencies of the prior art, the invention provides an unsupervised electricity consumption anomaly detection method fusing multi-scale fuzzy information granules. The method solves the problems of extracting and fusing multi-scale data features in existing electricity consumption anomaly detection, avoids the information loss caused by data format conversion, and effectively realizes unsupervised anomaly detection on hybrid electricity consumption data.
In order to achieve the aim of the invention, the invention adopts the following technical scheme: an unsupervised electricity utilization abnormality detection method integrating multi-scale fuzzy information particles comprises the following steps:
S1, acquiring electricity consumption data and normalizing it to obtain normalized data;
S2, calculating a fuzzy relation matrix from the normalized data;
S3, selecting attribute subsets and a scale combination, and constructing a multi-scale fuzzy information granule set according to the fuzzy relation matrix;
S4, calculating the fuzzy approximation accuracy from the multi-scale fuzzy information granule set;
S5, calculating the fuzzy information granule anomaly factors from the fuzzy approximation accuracy;
S6, calculating the outlier degree of every sample in the normalized data from the fuzzy information granule anomaly factors;
S7, judging, sample by sample, whether the outlier degree of each sample in the normalized data is greater than a threshold; if so, outputting the sample as an anomalous point of the electricity consumption data; otherwise, treating the sample as normal data; and proceeding to the next sample until all samples have been judged.
The beneficial effects of the invention are as follows. Constructing the fuzzy relation matrix avoids the information loss caused by data format conversion. Constructing fuzzy information granules at a plurality of scale levels yields a multi-scale fuzzy information granule set, which solves the problem of extracting multi-scale data features. The uncertainty and the outlier characteristics of the information granules are described by the fuzzy approximation accuracy and the fuzzy information granule anomaly factors respectively, and the anomaly factors of the multi-scale fuzzy information granules are weighted to calculate the outlier degree of each sample, which solves the problem of fusing multi-scale data features. The invention needs no labeled data for model training and can effectively realize unsupervised anomaly detection on hybrid electricity consumption data.
Further, the expression of the normalization processing in the step S1 is:
$$f(m(x)) = \frac{m(x) - \min(M_m)}{\max(M_m) - \min(M_m)}$$
where $f(\cdot)$ is the normalization function; $m(x)$ is the value of sample $x$ in the electricity consumption data on the numerical attribute $m$; $M_m$ is the set of values of all samples in the electricity consumption data on the numerical attribute $m$; $x$ is a sample in the electricity consumption data; $m$ is a numerical attribute; min is the minimum function; max is the maximum function.
The beneficial effect of the above further scheme is that normalizing the numerical data in the electricity consumption data reduces the amount of computation.
Further, the step S2 specifically includes:
S201, setting the scale parameters;
S202, obtaining the attribute set from the normalized data;
S203, calculating the membership degree of the fuzzy relation generated by each attribute according to the scale parameters and the attribute set:
[Equation images not reproduced in the source text: the membership degree $R_a(x, y)$ and the attribute difference $d_{xy}$.]
where $R_a(x, y)$ is the membership function, representing the degree to which sample $x$ and sample $y$ have the fuzzy relation $R_a$; $R_a$ is the fuzzy relation generated by attribute $a$; $d_{xy}$ is the difference between samples $x$ and $y$ on attribute $a$; $a(x)$ is the value of sample $x$ on attribute $a$; $a(y)$ is the value of sample $y$ on attribute $a$; $\lambda$ is the scale parameter; $|\cdot|$ denotes the absolute value; $x$ and $y$ are samples; $a$ is an attribute;
S204, obtaining the fuzzy relation matrix from the membership degrees of the fuzzy relations generated by the attributes.
The beneficial effects of the above-mentioned further scheme are: by constructing the fuzzy relation matrix, the problem of information loss caused by data format conversion is avoided.
Further, the step S3 specifically includes:
S301, selecting a plurality of attribute subsets according to the normalized data;
S302, obtaining the membership degree of the fuzzy relation generated by each attribute subset according to the fuzzy relation matrix:
$$R_B(x, y) = \min_{a \in B} R_a(x, y)$$
where $R_B(x, y)$ is the membership degree of the fuzzy relation generated by attribute subset $B$; min is the minimum function; $x$ and $y$ are samples; $a$ is an attribute; $R_a(x, y)$ is the membership function, representing the degree to which sample $x$ and sample $y$ have the fuzzy relation $R_a$; $R_a$ is the fuzzy relation generated by attribute $a$;
S303, obtaining the fuzzy relation generated by each attribute subset from the membership degree of the fuzzy relation generated by that attribute subset;
S304, setting the scale combination, and obtaining the multi-scale fuzzy information granule set from the fuzzy relations generated by the attribute subsets:
$$U/R_B = \{[x_i]_B^{\lambda} \mid x_i \in U\}$$
$$[x_i]_B^{\lambda}(y) = R_B^{\lambda}(x_i, y)$$
where $U/R_B$ is the multi-scale fuzzy information granule set; $[x_i]_B^{\lambda}$ is the fuzzy information granule centered on sample $x_i$ with radius $\lambda$; $[x_i]_B^{\lambda}(y)$ is the membership function of the fuzzy information granule; $U$ is the sample set; $B$ is an attribute subset; $R_B$ is the fuzzy relation generated by attribute subset $B$; $\lambda$ is the scale parameter; $x_i$ is a sample; $i$ is the sample index; $y$ is a sample; $R_B^{\lambda}(x_i, y)$ is the membership function of the fuzzy relation $R_B^{\lambda}$, representing the degree to which samples $x_i$ and $y$ have the fuzzy relation $R_B^{\lambda}$.
The beneficial effect of the above further scheme is that constructing fuzzy information granules at a plurality of scale levels yields a multi-scale fuzzy information granule set, which solves the problem of extracting multi-scale data features and describes the data features more accurately.
Further, the expression of the approximation accuracy in the step S4 is:
[Equation images not reproduced in the source text: the fuzzy approximation accuracy and the membership functions of the fuzzy upper and lower approximations.]
where the three expressions define the fuzzy approximation accuracy, the fuzzy upper approximation of the fuzzy information granule, and the fuzzy lower approximation of the fuzzy information granule, the latter two being fuzzy sets given by their membership functions; $[x_i]_B^{\lambda}$ is the fuzzy information granule centered on sample $x_i$ with the scale parameter $\lambda$ as its radius; $R_P^{\lambda}(x_i, y)$ is the degree to which samples $x_i$ and $y$ have the fuzzy relation $R_P^{\lambda}$ induced by the attributes in the relative attribute subset $P$ at scale parameter $\lambda$; $[x_i]_B^{\lambda}(y)$ is the membership function of the fuzzy information granule; inf is the infimum; sup is the supremum; max is the maximum function; min is the minimum function; $B$ is an attribute subset; $P$ is the relative attribute subset; $U$ is the sample set; $\lambda$ is the scale parameter; $x_i$ is a sample; $i$ is the sample index; $y$ is a sample.
The beneficial effect of the above further scheme is that the fuzzy approximation accuracy describes the uncertainty of an information granule and serves as a screening factor for the fuzzy information granule anomaly factor, which improves the accuracy of anomalous point screening.
Further, the expression of the fuzzy information granule abnormality factor in the step S5 is:
[Equation image not reproduced in the source text: the fuzzy information granule anomaly factor $OF(\cdot)$.]
where $OF(\cdot)$ is the function that computes the fuzzy information granule anomaly factor; $[x_i]_B^{\lambda}$ is the fuzzy information granule centered on sample $x_i$ with the scale parameter $\lambda$ as its radius; $B$ is an attribute subset; $\lambda$ is the scale parameter; $x_i$ is a sample; $i$ is the sample index; $U$ is the sample set; the fuzzy approximation accuracy of step S4 appears as a term of the expression; $P$ is the relative attribute subset.
The beneficial effect of the above further scheme is that the fuzzy information granule anomaly factor characterizes the outlier characteristics of an information granule and serves as an anomalous point screening factor, which improves the accuracy of anomalous point screening.
Further, the expression of the outlier in the step S6 is:
[Equation images not reproduced in the source text: the outlier degree $MSOD(\cdot)$ and the weight mapping function.]
where $MSOD(\cdot)$ is the function that computes the outlier degree; $x_i$ is a sample; $i$ is the sample index; $S$ is the number of scale parameters; $m$ is the number of attribute subsets; $\lambda$ is the scale parameter; $k$ is the attribute subset index; $OF(\cdot)$ is the function that computes the fuzzy information granule anomaly factor; $[x_i]_{B_k}^{\lambda}$ is the fuzzy information granule induced by the $k$-th attribute subset, centered on sample $x_i$ with the scale parameter $\lambda$ as its radius; the second expression is the weight mapping function; $U$ is the sample set.
The beneficial effects of the above-mentioned further scheme are: the outlier degree of the sample is calculated by weighting the outlier factors of the multi-scale fuzzy information particles, so that the problem of fusion of the multi-scale data features is solved, and the screening accuracy of outliers is improved.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The following description of specific embodiments is provided to help those skilled in the art understand the invention. It should be understood, however, that the invention is not limited to the scope of these embodiments. To those of ordinary skill in the art, all variations that fall within the spirit and scope of the invention as defined by the appended claims, and all inventions that make use of the inventive concept, are protected.
Example 1
As shown in fig. 1, in one embodiment of the present invention, an unsupervised electricity consumption anomaly detection method for fusing multi-scale fuzzy information particles includes the following steps:
S1, acquiring electricity consumption data and normalizing it to obtain normalized data;
S2, calculating a fuzzy relation matrix from the normalized data;
S3, selecting attribute subsets and a scale combination, and constructing a multi-scale fuzzy information granule set according to the fuzzy relation matrix;
S4, calculating the fuzzy approximation accuracy from the multi-scale fuzzy information granule set;
S5, calculating the fuzzy information granule anomaly factors from the fuzzy approximation accuracy;
S6, calculating the outlier degree of every sample in the normalized data from the fuzzy information granule anomaly factors;
S7, judging, sample by sample, whether the outlier degree of each sample in the normalized data is greater than a threshold; if so, outputting the sample as an anomalous point of the electricity consumption data; otherwise, treating the sample as normal data; and proceeding to the next sample until all samples have been judged.
In this embodiment, fuzzy rough set theory provides an effective tool that overcomes the discretization problem. It can be applied directly to analysis models with both numerical and categorical attributes, which avoids the information loss caused by data format conversion. In a concrete application of fuzzy rough sets, the electricity consumption data is imported into an information system (also called an information table), in which each row represents an object (also called a sample) and each column represents an attribute (feature) of the object. Attribute values may be numerical (e.g., electricity consumption, voltage), categorical (e.g., user type, connection mode), or hybrid (containing both numerical and categorical values). An information system is denoted $(U, A)$, where $U$ is the set of all objects and $A$ is the set of all attributes.
In the embodiment, fuzzy information grains are firstly constructed from a plurality of scale layers, so that the problem of extracting multi-scale data features is solved; the uncertainty and the outlier characteristic of the information grain are respectively described through the fuzzy approximate accuracy and the fuzzy information grain anomaly factors, and then the outlier factors of the multi-scale fuzzy information grain are weighted to calculate the outlier degree of the sample, so that the fusion problem of the multi-scale data characteristics is solved; the method does not need to mark data for model training, and can effectively realize the unsupervised anomaly detection of the hybrid power consumption data.
The expression of the normalization processing in the step S1 is:
$$f(m(x)) = \frac{m(x) - \min(M_m)}{\max(M_m) - \min(M_m)}$$
where $f(\cdot)$ is the normalization function; $m(x)$ is the value of sample $x$ in the electricity consumption data on the numerical attribute $m$; $M_m$ is the set of values of all samples in the electricity consumption data on the numerical attribute $m$; $x$ is a sample in the electricity consumption data; $m$ is a numerical attribute; min is the minimum function; max is the maximum function.
In this embodiment, the value range of the numerical data is adjusted to the real number range from 0 to 1 by the min-max normalization operation, and the category data is kept unchanged.
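The min-max normalization described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the column names, and the toy values are hypothetical, and mapping a constant column to zeros is an added assumption to avoid division by zero.

```python
import numpy as np

def min_max_normalize(values):
    """Scale a 1-D numeric array to [0, 1]; a constant column maps to all zeros."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:  # assumption: avoid division by zero for a constant column
        return np.zeros_like(values)
    return (values - lo) / (hi - lo)

def normalize_table(columns, numeric_names):
    """Normalize only the columns named in numeric_names; keep the rest unchanged."""
    return {
        name: min_max_normalize(col) if name in numeric_names else list(col)
        for name, col in columns.items()
    }

# Hypothetical toy data, not taken from Table 1 of the patent.
table = {
    "user_type": ["residential", "industrial", "residential"],
    "consumption_kwh": [120.0, 480.0, 300.0],
}
normalized = normalize_table(table, numeric_names={"consumption_kwh"})
```

Categorical columns pass through untouched, matching the statement that category data is kept unchanged.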
The step S2 specifically comprises the following steps:
S201, setting the scale parameters;
S202, obtaining the attribute set from the normalized data;
S203, calculating the membership degree of the fuzzy relation generated by each attribute according to the scale parameters and the attribute set:
[Equation images not reproduced in the source text: the membership degree $R_a(x, y)$ and the attribute difference $d_{xy}$.]
where $R_a(x, y)$ is the membership function, representing the degree to which sample $x$ and sample $y$ have the fuzzy relation $R_a$; $R_a$ is the fuzzy relation generated by attribute $a$; $d_{xy}$ is the difference between samples $x$ and $y$ on attribute $a$; $a(x)$ is the value of sample $x$ on attribute $a$; $a(y)$ is the value of sample $y$ on attribute $a$; $\lambda$ is the scale parameter; $|\cdot|$ denotes the absolute value; $x$ and $y$ are samples; $a$ is an attribute;
S204, obtaining the fuzzy relation matrix from the membership degrees of the fuzzy relations generated by the attributes.
In this embodiment, the fuzzy relation is the core concept of fuzzy rough set theory. A fuzzy relation $R$ is a fuzzy set defined on $U \times U$, i.e., $R: U \times U \to [0, 1]$. For any $(x, y) \in U \times U$, the membership function $R(x, y)$ represents the degree to which object $x$ has the relation $R$ with object $y$. A fuzzy relation $R$ on $U$ can be represented by a fuzzy matrix $M(R)$. Any attribute $a \in A$ induces a fuzzy relation $R_a$, whose membership degree is calculated as:
[Equation image not reproduced in the source text: the membership degree $R_a(x, y)$.]
where $d_{xy}$ represents the difference between objects $x$ and $y$ on attribute $a$, and $\lambda \in (0, 1)$ is an adjustable parameter. In addition, any attribute subset $B \subseteq A$ induces a fuzzy relation $R_B$, whose membership degree is calculated as:
$$R_B(x, y) = \min_{a \in B} R_a(x, y)$$
For the normalized data, the fuzzy relation matrix $M(R_B)$ of any attribute subset $B$ is calculated in this way.
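A sketch of how such relation matrices might be computed is given below. The per-attribute membership formula is an image that is not reproduced in the text, so `attribute_relation` uses an assumed form that is common in fuzzy rough set work: the difference is the absolute difference for numerical attributes and 0 or 1 for categorical ones, and the membership is `1 - d` when `d <= lam` and 0 otherwise. Only the minimum aggregation in `subset_relation` is taken from the text; all names and data are hypothetical.

```python
import numpy as np

def attribute_relation(column, lam, numeric):
    """Fuzzy relation matrix of one attribute (ASSUMED form, not the patent's formula)."""
    if numeric:
        col = np.asarray(column, dtype=float)
        d = np.abs(col[:, None] - col[None, :])           # pairwise differences d_xy
    else:
        col = np.asarray(column, dtype=object)
        d = (col[:, None] != col[None, :]).astype(float)  # 0 if equal, 1 otherwise
    # Assumed thresholding: similarity 1 - d inside radius lam, 0 outside it.
    return np.where(d <= lam, 1.0 - d, 0.0)

def subset_relation(relations):
    """R_B(x, y) = min over a in B of R_a(x, y), as stated in the text."""
    return np.minimum.reduce(list(relations))

# Hypothetical normalized data: one numerical and one categorical attribute.
consumption = [0.0, 0.1, 0.9]
user_type = ["residential", "residential", "industrial"]
R_num = attribute_relation(consumption, lam=0.2, numeric=True)
R_cat = attribute_relation(user_type, lam=0.2, numeric=False)
R_B = subset_relation([R_num, R_cat])
```

Because both attribute types produce a matrix with entries in [0, 1], no conversion of categorical values to numbers is needed, which is the property the embodiment emphasizes.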
The step S3 specifically comprises the following steps:
S301, selecting a plurality of attribute subsets according to the normalized data;
S302, obtaining the membership degree of the fuzzy relation generated by each attribute subset according to the fuzzy relation matrix:
$$R_B(x, y) = \min_{a \in B} R_a(x, y)$$
where $R_B(x, y)$ is the membership degree of the fuzzy relation generated by attribute subset $B$; min is the minimum function; $x$ and $y$ are samples; $a$ is an attribute; $R_a(x, y)$ is the membership function, representing the degree to which sample $x$ and sample $y$ have the fuzzy relation $R_a$; $R_a$ is the fuzzy relation generated by attribute $a$;
S303, obtaining the fuzzy relation generated by each attribute subset from the membership degree of the fuzzy relation generated by that attribute subset;
S304, setting the scale combination, and obtaining the multi-scale fuzzy information granule set from the fuzzy relations generated by the attribute subsets:
$$U/R_B = \{[x_i]_B^{\lambda} \mid x_i \in U\}$$
$$[x_i]_B^{\lambda}(y) = R_B^{\lambda}(x_i, y)$$
where $U/R_B$ is the multi-scale fuzzy information granule set; $[x_i]_B^{\lambda}$ is the fuzzy information granule centered on sample $x_i$ with radius $\lambda$; $[x_i]_B^{\lambda}(y)$ is the membership function of the fuzzy information granule; $U$ is the sample set; $B$ is an attribute subset; $R_B$ is the fuzzy relation generated by attribute subset $B$; $\lambda$ is the scale parameter; $x_i$ is a sample; $i$ is the sample index; $y$ is a sample; $R_B^{\lambda}(x_i, y)$ is the membership function of the fuzzy relation $R_B^{\lambda}$, representing the degree to which samples $x_i$ and $y$ have the fuzzy relation $R_B^{\lambda}$.
In this embodiment, in fuzzy rough set theory, a fuzzy information granule (or simply "information granule") is a set of objects aggregated through the fuzzy relations among them. The method extracts multi-scale data features by constructing fuzzy information granules at a plurality of scale levels, with the following steps:
Step 1: Attribute subset selection
Any attribute subset can construct a fuzzy relation, from which the fuzzy information granules of the objects can be further constructed. To simplify the calculation, only one attribute is used to construct each information granule (i.e., a basic fuzzy information granule). From the full attribute set $A = \{a_1, a_2, \ldots, a_m\}$, one attribute $a_k$ is selected at a time to form an attribute subset $B_k = \{a_k\}$, giving $m$ attribute subsets $\mathcal{B} = \{B_1, B_2, \ldots, B_m\}$.
Step 2: Information granulation
The construction of information granules is also called information granulation. Information granulation is the process of dividing a population of objects into granules of different sizes (i.e., granularities). In fuzzy rough set theory, a group of objects can be aggregated into an information granule through the fuzzy relations among the objects. Dividing the whole object set $U$ by an arbitrary fuzzy relation $R_B$ gives a set of generalized fuzzy equivalence classes, namely the multi-scale fuzzy information granule set:
$$U/R_B = \{[x_i]_B^{\lambda} \mid x_i \in U\}$$
where $[x_i]_B^{\lambda}$ is the fuzzy information granule centered on object $x_i$ with radius $\lambda$. Clearly, the information granule $[x_i]_B^{\lambda}$ is a fuzzy set on $U$ with membership function:
$$[x_i]_B^{\lambda}(y) = R_B^{\lambda}(x_i, y)$$
The cardinality of the information granule $[x_i]_B^{\lambda}$ is calculated as:
$$\left|[x_i]_B^{\lambda}\right| = \sum_{y \in U} [x_i]_B^{\lambda}(y)$$
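As a rough illustration of information granulation, the sketch below reads a granule off a fuzzy relation matrix and computes its cardinality. The cardinality formula is an image in the source, so the sum of membership degrees (the usual sigma-count of a fuzzy set) is assumed here; the function names and the matrix `R` are hypothetical.

```python
import numpy as np

def fuzzy_granule(relation, i):
    """Membership vector of the granule centered on sample i: [x_i](y) = R(x_i, y)."""
    return np.asarray(relation, dtype=float)[i]

def fuzzy_cardinality(granule):
    """Assumed sigma-count cardinality: the sum of membership degrees over all samples."""
    return float(np.sum(granule))

# Hypothetical fuzzy relation matrix for three samples.
R = np.array([
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
g0 = fuzzy_granule(R, 0)          # granule of the first sample
size0 = fuzzy_cardinality(g0)     # how many samples it "contains", fuzzily counted
```

A sample whose granule has a small cardinality has few close neighbors under that relation, which is the intuition later used when scoring outliers.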
Step 3: Multi-scale fuzzy information granule construction
A multi-scale fuzzy information granule is a collection of fuzzy information granules with multiple granule radii. Given a set of scales $\Lambda = \{\lambda_1, \lambda_2, \ldots, \lambda_S\}$, for any sample $x_i$ the $S$ granule radii $\lambda \in \Lambda$ are applied under each of the $m$ attribute subsets $B \in \mathcal{B}$, and information granulation yields the group of multi-scale fuzzy information granules of that sample.
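The enumeration over $m$ attribute subsets and $S$ scales can be sketched as below. The helper `relation_for` is hypothetical and again uses the assumed thresholded form `1 - d` if `d <= lam` else 0, since the patent's membership formula is not reproduced in the text; only the idea of one granule per (attribute subset, scale) pair comes from the description.

```python
from itertools import product
import numpy as np

def multi_scale_granules(relation_for, n_attributes, scales, i):
    """Collect the granule of sample i for every (attribute index k, scale lam) pair."""
    return {
        (k, lam): np.asarray(relation_for(k, lam), dtype=float)[i]
        for k, lam in product(range(n_attributes), scales)
    }

# Hypothetical data: 3 samples, 2 normalized numerical attributes.
data = np.array([
    [0.0, 0.5],
    [0.1, 0.75],
    [0.9, 0.0],
])

def relation_for(k, lam):
    """Assumed single-attribute relation: 1 - d within radius lam, otherwise 0."""
    d = np.abs(data[:, k][:, None] - data[:, k][None, :])
    return np.where(d <= lam, 1.0 - d, 0.0)

granules = multi_scale_granules(relation_for, n_attributes=2, scales=(0.2, 0.3), i=0)
```

With two attributes and two scales, the first sample gets four granules. On the second attribute its neighbor at distance 0.25 is excluded at scale 0.2 but included at scale 0.3, which shows how the radius changes the granule.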
The expression of the fuzzy approximate accuracy in the step S4 is as follows:
[Equation images not reproduced in the source text: the fuzzy approximation accuracy and the membership functions of the fuzzy upper and lower approximations.]
where the three expressions define the fuzzy approximation accuracy, the fuzzy upper approximation of the fuzzy information granule, and the fuzzy lower approximation of the fuzzy information granule, the latter two being fuzzy sets given by their membership functions; $[x_i]_B^{\lambda}$ is the fuzzy information granule centered on sample $x_i$ with the scale parameter $\lambda$ as its radius; $R_P^{\lambda}(x_i, y)$ is the degree to which samples $x_i$ and $y$ have the fuzzy relation $R_P^{\lambda}$ induced by the attributes in the relative attribute subset $P$ at scale parameter $\lambda$; $[x_i]_B^{\lambda}(y)$ is the membership function of the fuzzy information granule; inf is the infimum; sup is the supremum; max is the maximum function; min is the minimum function; $B$ is an attribute subset; $P$ is the relative attribute subset; $U$ is the sample set; $\lambda$ is the scale parameter; $x_i$ is a sample; $i$ is the sample index; $y$ is a sample.
In this embodiment, the fuzzy approximation accuracy measures the uncertainty of the information contained in a fuzzy information granule: the lower the value, the stronger the distinguishing capability of the information granule. Its expression is:
[Equation image not reproduced in the source text: the fuzzy approximation accuracy.]
where $R_P^{\lambda}$ is the relative fuzzy relation constructed from the attribute subset $P$, used to evaluate the approximation accuracy of the information granule. For ease of calculation, $P$ is chosen from the attribute subsets obtained in Step 1 (i.e., $P \in \mathcal{B}$). The fuzzy lower approximation and the fuzzy upper approximation of $[x_i]_B^{\lambda}$ are a pair of fuzzy sets that describe, respectively, the degree to which an object certainly belongs to the information granule $[x_i]_B^{\lambda}$ and the degree to which it possibly belongs to it. They are calculated as:
[Equation images not reproduced in the source text: the membership functions of the fuzzy lower and upper approximations.]
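The following sketch shows one way such approximations could be computed. Since the patent's formulas are images that are not reproduced in the text, it assumes the classical fuzzy rough set definitions built from the inf, sup, max, and min operators listed in the symbol definitions: the lower approximation is the infimum over $y$ of $\max(1 - R_P(x, y), X(y))$ and the upper approximation is the supremum over $y$ of $\min(R_P(x, y), X(y))$. The accuracy as a ratio of cardinalities is likewise an assumption, and all names are hypothetical.

```python
import numpy as np

def lower_approximation(relation_p, granule):
    """Certain membership (assumed form): inf over y of max(1 - R_P(x, y), X(y))."""
    return np.min(np.maximum(1.0 - relation_p, granule[None, :]), axis=1)

def upper_approximation(relation_p, granule):
    """Possible membership (assumed form): sup over y of min(R_P(x, y), X(y))."""
    return np.max(np.minimum(relation_p, granule[None, :]), axis=1)

def approximation_accuracy(relation_p, granule):
    """ASSUMED accuracy: ratio of the summed lower to the summed upper memberships."""
    lower = lower_approximation(relation_p, granule)
    upper = upper_approximation(relation_p, granule)
    total = upper.sum()
    return float(lower.sum() / total) if total > 0 else 0.0

granule = np.array([1.0, 0.9, 0.0])   # hypothetical granule membership vector
R_identity = np.eye(3)                # relative relation that separates every sample
R_all = np.ones((3, 3))               # relative relation that separates nothing

acc_fine = approximation_accuracy(R_identity, granule)
acc_coarse = approximation_accuracy(R_all, granule)
```

Under these assumed definitions, a relative relation that distinguishes every sample reproduces the granule exactly, while a relation that distinguishes nothing leaves the granule with no certain members.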
the expression of the fuzzy information granule abnormality factor in the step S5 is as follows:
[Equation image not reproduced in the source text: the fuzzy information granule anomaly factor $OF(\cdot)$.]
where $OF(\cdot)$ is the function that computes the fuzzy information granule anomaly factor; $[x_i]_B^{\lambda}$ is the fuzzy information granule centered on sample $x_i$ with the scale parameter $\lambda$ as its radius; $B$ is an attribute subset; $\lambda$ is the scale parameter; $x_i$ is a sample; $i$ is the sample index; $U$ is the sample set; the fuzzy approximation accuracy of step S4 and its input argument appear as terms of the expression; $P$ is the relative attribute subset.
In this embodiment, the anomaly factor is an indicator of how anomalous a fuzzy information granule is. The core idea is that the more distinctive the information granule a sample belongs to (i.e., the lower its fuzzy approximation accuracy), the more likely the sample is an outlier. The anomaly factors of the information granules of all samples must be calculated at every scale.
The expression of the outlier in the step S6 is:
[Equation images not reproduced in the source text: the outlier degree $MSOD(\cdot)$ and the weight mapping function.]
where $MSOD(\cdot)$ is the function that computes the outlier degree; $x_i$ is a sample; $i$ is the sample index; $S$ is the number of scale parameters; $m$ is the number of attribute subsets; $\lambda$ is the scale parameter; $k$ is the attribute subset index; $OF(\cdot)$ is the function that computes the fuzzy information granule anomaly factor; $[x_i]_{B_k}^{\lambda}$ is the fuzzy information granule induced by the $k$-th attribute subset, centered on sample $x_i$ with the scale parameter $\lambda$ as its radius; the second expression is the weight mapping function; $U$ is the sample set.
In this embodiment, the sample outlier is used to measure the likelihood that a sample belongs to an outlier, and the outlier features from the multiple scale fuzzy information particles are fused by weighting the multi-scale information particle outlier factors.
In this embodiment, let the outlier degree threshold be $\xi \in (0, 1)$. If the outlier degree of sample $x_i$ satisfies $MSOD(x_i) > \xi$, then $x_i$ is judged to be an anomalous point. The outlier degrees of all samples are compared with the threshold $\xi$ one by one to identify the anomalous samples in the system.
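The thresholding rule above amounts to a simple comparison per sample, sketched below. The function name, the scores, and the threshold value 0.6 are hypothetical; the outlier degrees are assumed to have been computed already.

```python
def detect_anomalies(outlier_degrees, threshold):
    """Return the indices of samples whose outlier degree exceeds the threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold is expected to lie in (0, 1)")
    return [i for i, score in enumerate(outlier_degrees) if score > threshold]

# Hypothetical outlier degrees for four samples.
scores = [0.12, 0.81, 0.40, 0.95]
anomalies = detect_anomalies(scores, threshold=0.6)
```

Samples not returned are treated as normal data, so a single pass over the scores completes the judgment of all samples.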
Example 2
In this embodiment, an information table containing 6 samples and 2 attributes is given, as shown in the left side (right side is the normalized result) of table 1.
TABLE 1 raw data and normalized results
Figure BDA0004147458420000131
Step 1: and (5) data normalization processing.
To facilitate subsequent computation, a min-max normalization operation is performed on the input data to convert the numerical data into 0-1, and the type attribute a 1 Remain unchanged. The results of the treatment are shown on the right side of table 1.
Step 2: Calculate the fuzzy relation matrices.
Let the set of scale parameters of the fuzzy relation be Λ = {λ_1, λ_2}, with λ_1 = 0.2 and λ_2 = 0.3. The following fuzzy relation matrices are calculated:
Figure BDA0004147458420000132
Figure BDA0004147458420000133
Figure BDA0004147458420000141
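A fuzzy relation matrix for a single attribute could be computed along the following lines. The membership formula itself appears only in the equation figures, so this sketch assumes a commonly used form: for a numerical attribute, R_a(x, y) = 1 − d_xy when d_xy = |a(x) − a(y)| ≤ λ and 0 otherwise; for a categorical attribute, 1 when the values are equal and 0 otherwise. The function name and the `numerical` flag are hypothetical.

```python
from typing import List, Sequence


def fuzzy_relation_matrix(column: Sequence, lam: float,
                          numerical: bool = True) -> List[List[float]]:
    """Pairwise membership degrees R_a(x, y) for one attribute.

    Assumed form (not confirmed by the text): numerical attributes use
    1 - |difference| inside the radius lam and 0 outside it; categorical
    attributes use exact equality.
    """
    n = len(column)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if numerical:
                d_xy = abs(column[i] - column[j])
                matrix[i][j] = 1.0 - d_xy if d_xy <= lam else 0.0
            else:
                matrix[i][j] = 1.0 if column[i] == column[j] else 0.0
    return matrix
```

Under this assumed form, a larger λ lets more distant samples keep a non-zero membership degree, which is what makes the relation scale-dependent.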
Step 3: Construct the multi-scale fuzzy information particles.
The uncertainty and outlier characteristics of the data are described by constructing multi-scale fuzzy information particles. Select the attribute subsets B_1 = {a_1} and B_2 = {a_2}, and calculate the fuzzy information particle of sample x_i induced by the fuzzy relation R_{B_1} at scale λ_1:
Figure BDA0004147458420000142
Taking sample x_1 as an example, from the fuzzy relation matrix
Figure BDA0004147458420000143
it is easy to obtain
Figure BDA0004147458420000144
Similarly, from the fuzzy relation matrix
Figure BDA0004147458420000145
we obtain
Figure BDA0004147458420000146
and likewise
Figure BDA0004147458420000147
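The construction of a fuzzy information particle can be sketched in Python. The claims state that the relation of an attribute subset B is obtained with the minimum function over its attributes; reading the particle of sample x_i as row i of that relation matrix is an assumption based on the symbol definitions, and both function names are hypothetical.

```python
from typing import List, Sequence


def combine_relations(matrices: Sequence[Sequence[Sequence[float]]]) -> List[List[float]]:
    """R_B(x, y): element-wise minimum of R_a(x, y) over the attributes a in B."""
    n = len(matrices[0])
    return [[min(m[i][j] for m in matrices) for j in range(n)]
            for i in range(n)]


def fuzzy_particle(relation: Sequence[Sequence[float]], i: int) -> List[float]:
    """Membership vector of the particle centered on sample i.

    Assumption: the membership of y in the particle of x_i equals
    R_B(x_i, y), i.e. row i of the relation matrix.
    """
    return list(relation[i])
```

For a single-attribute subset such as B_1 = {a_1}, `combine_relations` simply returns a copy of that attribute's matrix.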
Step 4: Calculate the fuzzy approximation accuracy.
The uncertainty of the information contained in the multi-scale fuzzy information particles is measured by their approximation accuracy. Select the relative attribute subsets P_1 = {a_2} and P_2 = {a_1}, and calculate the upper and lower approximations of each fuzzy information particle. Taking sample x_1 as an example, we obtain:
Figure BDA0004147458420000148
Figure BDA0004147458420000149
Figure BDA00041474584200001410
Figure BDA00041474584200001411
Figure BDA00041474584200001412
Figure BDA00041474584200001413
Therefore,
Figure BDA00041474584200001414
Similarly,
Figure BDA00041474584200001415
Figure BDA00041474584200001416
Therefore,
Figure BDA0004147458420000151
Similarly,
Figure BDA0004147458420000152
The information particle of sample x_1 at scale λ_1,
Figure BDA0004147458420000153
relative to
Figure BDA0004147458420000154
then has the approximation accuracy:
Figure BDA0004147458420000155
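The approximation accuracy of one particle can be sketched as below. The patent's formulas are in the equation figures; the sketch assumes the standard fuzzy rough set operators suggested by the symbol list in the claims (inf, sup, max, min): lower approximation inf_y max(1 − R_P(x, y), A(y)), upper approximation sup_y min(R_P(x, y), A(y)), and accuracy as the ratio of their fuzzy cardinalities. The function name is hypothetical.

```python
from typing import Sequence


def approximation_accuracy(relation_p: Sequence[Sequence[float]],
                           particle: Sequence[float]) -> float:
    """Ratio |lower approximation| / |upper approximation| of a fuzzy particle.

    relation_p is the fuzzy relation of the relative attribute subset P;
    particle is the membership vector A(y) of the particle being approximated.
    Assumption: standard max/min fuzzy rough approximations and sigma-count
    cardinality.
    """
    n = len(particle)
    lower = [min(max(1.0 - relation_p[x][y], particle[y]) for y in range(n))
             for x in range(n)]
    upper = [max(min(relation_p[x][y], particle[y]) for y in range(n))
             for x in range(n)]
    upper_size = sum(upper)
    if upper_size == 0:
        raise ValueError("upper approximation is empty; accuracy is undefined")
    return sum(lower) / upper_size
```

A value near 1 means the relative attribute subset describes the particle precisely; a value near 0 means the particle is poorly approximated, which the method treats as a sign of abnormality.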
Step 5: Calculate the fuzzy information particle anomaly factors.
The degree of abnormality of the multi-scale fuzzy information particles is described quantitatively by their anomaly factors. Taking sample x_1 as an example, the anomaly factor of its information particle at scale λ_1,
Figure BDA0004147458420000156
is:
Figure BDA0004147458420000157
The corresponding weight coefficients are obtained as follows:
Figure BDA0004147458420000158
Step 6: Calculate the outlier degree of each sample.
The outlier degree of a sample is calculated as the weighted sum of the sample's multi-scale fuzzy information particle anomaly factors:
Figure BDA0004147458420000159
MSOD(x_2) ≈ 0.291;
MSOD(x_3) ≈ 0.247;
MSOD(x_4) ≈ 0.243;
MSOD(x_5) ≈ 0.224;
MSOD(x_6) ≈ 0.185.
Step 7: Determine anomalies by threshold comparison.
Comparing the outlier degrees of all objects, the outlier degree of sample x_1 is clearly higher than that of the other objects. Let the outlier threshold ξ be 0.3; comparing the outlier degree of every object with ξ, x_1 is determined to be an abnormal point.
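The threshold comparison of Step 7 can be sketched in a few lines of Python. The scores below are the values listed in the text for x_2 through x_6; the value of MSOD(x_1) appears only in an equation figure and is therefore left out rather than guessed. The names `flag_outliers` and `EXAMPLE_SCORES` are hypothetical.

```python
from typing import Dict, List

# Outlier degrees stated in the text; MSOD(x_1) is not reproduced there.
EXAMPLE_SCORES: Dict[str, float] = {
    "x2": 0.291,
    "x3": 0.247,
    "x4": 0.243,
    "x5": 0.224,
    "x6": 0.185,
}


def flag_outliers(scores: Dict[str, float], xi: float) -> List[str]:
    """Return the samples whose outlier degree is strictly greater than xi."""
    return [name for name, score in scores.items() if score > xi]
```

With ξ = 0.3, `flag_outliers(EXAMPLE_SCORES, 0.3)` returns an empty list, which is consistent with the statement that only x_1 is determined to be abnormal.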

Claims (7)

1. An unsupervised electricity consumption anomaly detection method integrating multi-scale fuzzy information particles, characterized by comprising the following steps:
S1, acquiring electricity consumption data and normalizing the electricity consumption data to obtain normalized data;
S2, calculating a fuzzy relation matrix according to the normalized data;
S3, selecting attribute subsets and a scale combination according to the fuzzy relation matrix to construct a multi-scale fuzzy information particle set;
S4, calculating the fuzzy approximation accuracy according to the multi-scale fuzzy information particle set;
S5, calculating the fuzzy information particle anomaly factors according to the fuzzy approximation accuracy;
S6, calculating the outlier degrees of all samples in the normalized data according to the fuzzy information particle anomaly factors;
S7, judging one by one whether the outlier degree of each sample in the normalized data is greater than a threshold; if so, outputting the sample as an abnormal point of the electricity consumption data; otherwise, judging the sample to be normal data; and proceeding to the next sample until all samples have been judged.
2. The method for detecting the unsupervised power consumption abnormality by fusing multi-scale fuzzy information particles according to claim 1, wherein the expression of the normalization processing in the step S1 is as follows:
Figure FDA0004147458410000011
wherein f(·) is the normalization function;
Figure FDA0004147458410000012
is the value of sample x in the electricity consumption data on numerical attribute m; M_m is the set of values of all samples in the electricity consumption data on numerical attribute m; x is a sample in the electricity consumption data; m is a numerical attribute; min is the minimum function; max is the maximum function.
3. The method for detecting the unsupervised power consumption abnormality by fusing multi-scale fuzzy information particles according to claim 1, wherein the step S2 is specifically:
S201, setting the scale parameters;
S202, obtaining an attribute set from the normalized data;
S203, calculating the membership degree of the fuzzy relation generated by each attribute according to the scale parameters and the attribute set:
Figure FDA0004147458410000021
Figure FDA0004147458410000022
wherein R_a(x, y) is the membership function, representing the degree to which sample x and sample y have the fuzzy relation R_a; R_a is the fuzzy relation generated by attribute a; d_xy is the difference between samples x and y on attribute a;
Figure FDA0004147458410000023
is the value of sample x on attribute a;
Figure FDA0004147458410000024
is the value of sample y on attribute a; λ is the scale parameter; |·| denotes the absolute value; x and y are samples; a is an attribute;
S204, obtaining the fuzzy relation matrix from the membership degrees of the fuzzy relations generated by the attributes.
4. The method for detecting the unsupervised power consumption abnormality by fusing multi-scale fuzzy information particles according to claim 1, wherein the step S3 is specifically:
S301, selecting a plurality of attribute subsets according to the normalized data;
S302, obtaining the membership degree of the fuzzy relation generated by each attribute subset from the fuzzy relation matrix:
Figure FDA0004147458410000025
wherein R_B(x, y) is the membership degree of the fuzzy relation generated by attribute subset B; min is the minimum function; x and y are samples; a is an attribute; R_a(x, y) is the membership function, representing the degree to which sample x and sample y have the fuzzy relation R_a; R_a is the fuzzy relation generated by attribute a;
S303, obtaining the fuzzy relation generated by each attribute subset from the membership degrees of the fuzzy relations generated by the attribute subsets;
S304, setting the scale combination, and obtaining the multi-scale fuzzy information particle set from the fuzzy relations generated by the attribute subsets:
Figure FDA0004147458410000031
Figure FDA0004147458410000032
wherein U/R_B is the multi-scale fuzzy information particle set;
Figure FDA0004147458410000033
is the fuzzy information particle centered on sample x_i with λ as its radius;
Figure FDA0004147458410000034
is the membership function of the fuzzy information particle; U is the sample set; B is an attribute subset; R_B is the fuzzy relation generated by attribute subset B; λ is the scale parameter; x_i is a sample; i is the sample index; y is a sample;
Figure FDA0004147458410000035
is the membership function of the fuzzy relation
Figure FDA0004147458410000036
and represents the degree to which samples x_i and y are related by
Figure FDA0004147458410000037
5. The method for detecting the unsupervised power consumption abnormality by fusing multi-scale fuzzy information particles according to claim 1, wherein the expression of the fuzzy approximation accuracy in the step S4 is as follows:
Figure FDA0004147458410000038
Figure FDA0004147458410000039
Figure FDA00041474584100000310
wherein
Figure FDA00041474584100000311
is the fuzzy approximation accuracy;
Figure FDA00041474584100000312
is the fuzzy set of the upper approximation of the fuzzy information particle;
Figure FDA00041474584100000313
is the fuzzy set of the lower approximation of the fuzzy information particle;
Figure FDA00041474584100000314
is the fuzzy information particle centered on sample x_i with the scale parameter λ as its radius;
Figure FDA00041474584100000319
is the membership function of the upper approximation of the fuzzy information particle;
Figure FDA00041474584100000315
is the membership function of the lower approximation of the fuzzy information particle;
Figure FDA00041474584100000316
is the degree to which samples x_i and y have the fuzzy relation
Figure FDA00041474584100000317
induced by the attributes in the relative attribute subset P with λ as the scale parameter;
Figure FDA00041474584100000318
is the membership function of the fuzzy information particle; inf is the infimum; sup is the supremum; max is the maximum function; min is the minimum function; B is an attribute subset; P is a relative attribute subset; U is the sample set; λ is the scale parameter; x_i is a sample; i is the sample index; y is a sample.
6. The method for detecting the unsupervised power consumption abnormality by fusing multi-scale fuzzy information particles according to claim 1, wherein the expression of the fuzzy information particle anomaly factor in the step S5 is as follows:
Figure FDA0004147458410000041
wherein OF(·) is the function that calculates the anomaly factor of a fuzzy information particle;
Figure FDA0004147458410000042
is the fuzzy information particle centered on sample x_i with the scale parameter λ as its radius; B is an attribute subset; λ is the scale parameter; x_i is a sample; i is the sample index; U is the sample set;
Figure FDA0004147458410000043
is the fuzzy approximation accuracy; P is the relative attribute subset.
7. The method for detecting the unsupervised power consumption abnormality by fusing multi-scale fuzzy information particles according to claim 1, wherein the expression of the outlier in the step S6 is:
Figure FDA0004147458410000044
Figure FDA0004147458410000045
wherein MSOD(·) is the function that calculates the outlier degree; x_i is a sample; i is the sample index; s is the largest scale-parameter index (the number of scales); m is the largest attribute-subset index (the number of attribute subsets); λ is the scale parameter; k is the attribute subset index; OF(·) is the function that calculates the anomaly factor of a fuzzy information particle;
Figure FDA0004147458410000046
is the fuzzy information particle induced by the k-th attribute subset, centered on sample x_i with the scale parameter λ as its radius;
Figure FDA0004147458410000047
is the weight mapping function; U is the sample set.
CN202310307665.6A 2023-03-27 2023-03-27 Unsupervised electricity consumption anomaly detection method integrating multi-scale fuzzy information particles Pending CN116304948A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310307665.6A CN116304948A (en) 2023-03-27 2023-03-27 Unsupervised electricity consumption anomaly detection method integrating multi-scale fuzzy information particles

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310307665.6A CN116304948A (en) 2023-03-27 2023-03-27 Unsupervised electricity consumption anomaly detection method integrating multi-scale fuzzy information particles

Publications (1)

Publication Number Publication Date
CN116304948A true CN116304948A (en) 2023-06-23

Family

ID=86795790

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310307665.6A Pending CN116304948A (en) 2023-03-27 2023-03-27 Unsupervised electricity consumption anomaly detection method integrating multi-scale fuzzy information particles

Country Status (1)

Country Link
CN (1) CN116304948A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117591971A (en) * 2023-07-10 2024-02-23 国网四川省电力公司营销服务中心 Unsupervised electricity larceny detection method based on multi-granularity fuzzy relative difference

Similar Documents

Publication Publication Date Title
CN112699913B (en) Method and device for diagnosing abnormal relationship of household transformer in transformer area
CN111199016B (en) Daily load curve clustering method for improving K-means based on DTW
CN107229602B (en) Method for identifying electricity consumption behavior of intelligent building microgrid
CN108593990B (en) Electricity stealing detection method based on electricity consumption behavior mode of electric energy user and application
CN107169628B (en) Power distribution network reliability assessment method based on big data mutual information attribute reduction
CN111612650B (en) DTW distance-based power consumer grouping method and system
CN109409628B (en) Acquisition terminal manufacturer evaluation method based on metering big data clustering model
CN106022509B (en) Consider the Spatial Load Forecasting For Distribution method of region and load character double differences
CN109891508B (en) Single cell type detection method, device, apparatus and storage medium
CN115276006A (en) Load prediction method and system for power integration system
CN105807231B (en) A kind of method and system for remaining battery capacity detection
CN111340065B (en) User load electricity stealing model mining system and method based on complex user behavior analysis
CN111639882B (en) Deep learning-based electricity risk judging method
CN105786711A (en) Data analysis method and device
CN112199862B (en) Nanoparticle migration prediction method, influence factor analysis method and system
CN116304948A (en) Unsupervised electricity consumption anomaly detection method integrating multi-scale fuzzy information particles
CN110738232A (en) grid voltage out-of-limit cause diagnosis method based on data mining technology
CN111126499A (en) Secondary clustering-based power consumption behavior pattern classification method
CN110796159A (en) Power data classification method and system based on k-means algorithm
CN112287980A (en) Power battery screening method based on typical feature vector
CN116187835A (en) Data-driven-based method and system for estimating theoretical line loss interval of transformer area
CN112305441A (en) Power battery health state assessment method under integrated clustering
CN115719177A (en) Regional power quality comprehensive evaluation method considering time sequence
CN109389517B (en) Analysis method and device for quantifying line loss influence factors
CN108830407A (en) Sensor distribution optimization method under the conditions of multi-state in monitoring structural health conditions

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination