CN117092582A - Electric energy meter abnormality detection method and device based on contrast self-encoder - Google Patents

Electric energy meter abnormality detection method and device based on contrast self-encoder

Info

Publication number
CN117092582A
CN117092582A · CN202310990073.9A
Authority
CN
China
Prior art keywords
data
positive
samples
sample
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310990073.9A
Other languages
Chinese (zh)
Inventor
于家豪
高欣
李保丰
翟峰
赵兵
郜波
秦煜
陈昊
梁晓兵
许斌
徐萌
卢建生
肖春
任宇路
杨帅
贾勇
焦广旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
China Electric Power Research Institute Co Ltd CEPRI
Marketing Service Center of State Grid Shanxi Electric Power Co Ltd
Original Assignee
Beijing University of Posts and Telecommunications
China Electric Power Research Institute Co Ltd CEPRI
Marketing Service Center of State Grid Shanxi Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications, China Electric Power Research Institute Co Ltd CEPRI, Marketing Service Center of State Grid Shanxi Electric Power Co Ltd filed Critical Beijing University of Posts and Telecommunications
Priority to CN202310990073.9A priority Critical patent/CN117092582A/en
Publication of CN117092582A publication Critical patent/CN117092582A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01RMEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
    • G01R35/00Testing or calibrating of apparatus covered by the other groups of this subclass
    • G01R35/04Testing or calibrating of apparatus covered by the other groups of this subclass of instruments for measuring time integral of power or current
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/094Adversarial learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses an electric energy meter anomaly detection method and device based on a contrastive autoencoder. The method comprises: acquiring historical multivariate long time-series data of an electric energy meter to be detected; normalizing the multivariate long time-series data and dividing it into a plurality of time windows of preset window length; inputting the time window data into a pre-trained anomaly detection model, which employs a contrastive autoencoder, and outputting reconstruction data corresponding to each time window; and determining an anomaly score for each time point of the time window data from the reconstruction data, and determining the degree of anomaly at each time point from the anomaly score.

Description

Electric energy meter abnormality detection method and device based on contrast self-encoder
Technical Field
The invention relates to the technical field of electric energy meter detection, in particular to an electric energy meter abnormality detection method and device based on a contrast self-encoder.
Background
Against the background of the gradually changing verification mode of smart electricity meters, research on anomaly detection for the multidimensional time-series data of smart meters helps protect the safe and stable operation of the smart grid. At the same time, such research can provide theoretical support and reference for multidimensional time-series anomaly detection of various industrial devices, contributing to the national "dual-carbon" goal and to sustainable social and economic development. In an anomaly detection task, an anomaly can be defined as a pattern that does not conform to expected behavior. Time-series data differ from non-time-series data in that non-time-series methods treat each data point as independent, whereas the points of a time series are correlated. Anomaly detection for non-time-series data is, to some extent, outlier detection: a process of finding scattered points that do not conform to the normal pattern using distance-based operations. Time-series anomaly detection, by contrast, must consider the sequential logic and recurrence relations among sampling time points within each sequence; mining single points in isolation cannot extract the temporal dependencies of the sequence.
Most data collected from real electricity meter devices lack accurate normal/abnormal labels, because manual labeling is difficult and costly. Moreover, since the devices are highly robust, almost all of the collected data are normal. Supervised methods are therefore limited in practice by the lack of prior information, so the main research effort has concentrated on unsupervised anomaly detection without accurate class labels, with most studies assuming that the training data are normal.
Current unsupervised multidimensional time-series anomaly detection methods for smart meters fall into three main categories: statistical methods, machine learning methods, and deep learning methods. Statistical methods require the data to satisfy preset statistical assumptions, while real data often follow complex and unknown distributions, making it difficult to build effective statistical models. Traditional machine learning models have difficulty directly capturing contextual associations in time series and modeling the complex patterns of multidimensional time series. In the smart grid domain, the time-series data of industrial equipment such as electric energy meters are becoming increasingly complex and stochastic, and the various systems continuously generate large amounts of data, which supports the training required by deep learning methods. In recent years, deep-learning-based time-series anomaly detection has therefore become an important research direction: a large number of unsupervised deep anomaly detection methods have been developed and exhibit significantly better performance than traditional anomaly detection. Overall, deep-learning-based methods achieve relatively high accuracy on multidimensional time-series anomaly detection, and their modeling pipeline is relatively convenient.
Existing reconstruction-based multidimensional time-series anomaly detection methods focus on learning the point-wise contextual information of the data and pay less attention to the long-term overall trend of the time series. Smart meter multidimensional time-series data have complex temporal dependencies and inter-dimensional correlations, and the normal pattern of a meter can change dynamically over time. Existing reconstruction-based methods cannot build an accurate profile of the normal pattern of multivariate time-series data, which limits further improvement of their performance.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an electric energy meter abnormality detection method and device based on a contrast self-encoder.
According to one aspect of the present invention, there is provided a method for detecting an abnormality of an electric energy meter based on a contrast self-encoder, comprising:
acquiring multi-variable long-time sequence data of historical detection of the electric energy meter to be detected;
normalizing the multivariable long-time sequence data, and dividing a plurality of time window data with preset window length;
inputting the plurality of time window data into a pre-trained anomaly detection model and outputting reconstruction data corresponding to each time window, wherein the anomaly detection model employs a contrastive autoencoder;
and determining an anomaly score for each time point of the time window data according to the reconstruction data of each time window, and determining the degree of anomaly at each time point according to the anomaly score.
Optionally, the multivariate long time series data comprises: phase a current, phase B current, phase C current, phase a voltage, phase B voltage, phase C voltage, forward active power indication, reverse active power indication, forward reactive power indication, reverse reactive power indication, phase a active power, phase B active power, phase C active power, active power total, phase a reactive power, phase B reactive power, phase C reactive power, reactive power total, phase a power factor, phase B power factor, phase C power factor, power factor total.
Optionally, after obtaining the multivariate long-time series data of the history detection of the electric energy meter to be detected, the method further comprises:
making all values of each variable in the multivariate long time-series data conform to a standard normal distribution using Z-Score normalization, wherein the formula for Z-Score normalization is:

X̂_i = (X_i - μ) / σ

where X_i is the multivariate long time-series data, X̂_i denotes the normalized X_i, μ denotes the vector of the means of all sample data of each variable of X_i, and σ denotes the vector of the standard deviations of all sample data of each variable of X_i.
Optionally, the training process of the anomaly detection model is as follows:
acquiring multiple variable time series data samples of historical detection of a plurality of electric energy meters, and combining the multiple variable time series data samples into one variable long time series data sample;
normalizing all values of each variable in the multivariate long-time series data samples to a standard normal distribution using Z-Score;
windowing the standardized multivariate long time-series data samples, and dividing them into a plurality of time window data samples of preset window length;
after a plurality of time window data samples are input to a projection layer, a time stamp mask is applied to carry out data enhancement, and positive and negative sample pairs are determined;
inputting the positive and negative sample pairs into an encoder for feature extraction, and determining positive and negative sample pair features;
and performing model training on the encoder, the discriminator, and the decoder using the positive and negative sample pair features, to determine the anomaly detection model.
Optionally, the data enhancement is performed by applying a timestamp mask after inputting a plurality of time window data samples to the projection layer, and determining positive and negative sample pairs includes:
inputting the time window data samples into a projection layer, and obtaining a projection representation of each sample;
Acquiring positive and negative samples of each time window data sample using a multi-scale timestamp mask and random sampling;
obtaining, in the timestamp masking, a plurality of enhanced samples of each sample by using different masking probabilities, and determining positive sample pairs;
and extracting a predetermined number of samples from the other original time window data samples and from each class of enhanced samples, respectively, to serve as negative samples of the original time window data, and determining the negative sample pairs of each sample.
Optionally, the positive and negative sample pairs are input to an encoder for feature extraction, and the process of determining the positive and negative sample pair features is expressed as:
z=E(a)
where E(·) denotes the encoder, a is the projection representation of the original time window data obtained from the projection layer (or an enhanced sample thereof), and z is the hidden vector of a, i.e., the feature of the original sample or of the positive/negative samples, with z ∈ R^h, where h is the length of the hidden vector.
Optionally, model training is performed on the feature, the encoder, the discriminator, and the decoder using positive and negative samples, and determining an anomaly detection model includes:
performing feature recombination on the positive and negative sample pair features to determine positive and negative sample pair combination features;
inputting the positive and negative sample pair characteristics and the positive and negative sample pair combined characteristics into a discriminator to obtain a prediction result and discriminator loss;
Updating the discriminator to minimize discriminator loss, and acquiring the prediction result again;
inputting the prediction result to an encoder, and determining the encoding result and the encoding loss;
updating the encoder, minimizing the encoding loss, and acquiring the encoding result again;
reconstructing positive and negative sample pair characteristics according to the encoder and the decoder, and determining reconstruction loss;
updating the encoder and decoder, minimizing reconstruction losses, and determining an anomaly detection model.
Optionally, the process of feature recombination of the positive and negative samples is as follows:
z_c^+ = α·z + (1-α)·z^+
z_c^- = β·z + (1-β)·z^-

where α and β are combination parameters that determine the contribution of each original feature in the mixed feature, z is the hidden vector representation of a training sample, z^+ and z^- are respectively the hidden vector representations of one of its positive and negative samples, and z_c^+ and z_c^- are respectively the positive and negative combined features;
The loss function L_d of the discriminator is the mean squared error between the discriminator's prediction vectors and the true labels, computed over all positive and negative mixed features, where N is the number of training windows, the prediction vectors are the decomposition results of the discriminator for the j-th positive mixed feature and the k-th negative mixed feature of the i-th sample, l_d^+ and l^- are the true labels with l_d^+ = (1, α) and l^- = (0, β), m is the number of positive samples of one sample, n+n×m is the number of negative samples, and α and β are the combination parameters that determine the contribution of each original feature in the mixed feature, with β = 1 - α;
The loss function L_E of the encoder is likewise a mean squared error between the discriminator's decomposition results and the true labels, where the decomposition results are taken for the j-th positive mixed feature and the k-th negative mixed feature of the i-th sample, l_E^+ and l^- are the true labels with l_E^+ = (1, 1) and l^- = (0, β), m is the number of positive samples of one sample, and n+n×m is the number of negative samples;
The reconstruction loss function L_recon is:

L_recon = (1/(N·w)) Σ_{i=1}^{N} Σ_{t=1}^{w} ||x̂_{i,t} - x_{i,t}||^2

where N is the number of training windows, w is the window size, and x̂_{i,t} and x_{i,t} are respectively the reconstructed data and the original data at time step t in the i-th window.
Optionally, the anomaly score is calculated as:

S_{i,t} = ||x̂_{i,t} - x_{i,t}||^2

where S_{i,t} is the anomaly score at time step t of the i-th test data, and x̂_{i,t} and x_{i,t} are respectively the reconstructed data and the original data at time step t in the i-th window.
According to another aspect of the present invention, there is provided an abnormality detection device for an electric energy meter based on a contrast self-encoder, comprising:
the acquisition module is used for acquiring multi-variable long-time sequence data of the history detection of the electric energy meter to be detected;
the dividing module is used for carrying out normalization processing on the multivariable long-time sequence data and dividing a plurality of time window data with preset window length;
the output module is used for inputting the plurality of time window data into a pre-trained anomaly detection model and outputting reconstruction data corresponding to each time window, wherein the anomaly detection model employs an adversarial contrastive autoencoder;
The determining module is used for determining the abnormality score of each time point of the time window data according to the reconstruction data of each time window data and determining the abnormality degree of each time point according to the abnormality score.
According to a further aspect of the present application there is provided a computer readable storage medium storing a computer program for performing the method according to any one of the above aspects of the present application.
According to still another aspect of the present application, there is provided an electronic device including: a processor; a memory for storing the processor-executable instructions; the processor is configured to read the executable instructions from the memory and execute the instructions to implement the method according to any of the above aspects of the present application.
Therefore, in view of the difficulty that existing reconstruction-based multivariate time-series anomaly detection methods for smart meters have in learning the overall trend of the data, and the fact that the data augmentation and proxy tasks of existing contrastive learning methods are not suited to time-series anomaly detection, the application proposes an adversarial contrastive autoencoder framework (ACAE) for multivariate time-series anomaly detection of electricity meters. ACAE addresses the difficulty existing methods have in learning high-level semantic features of the data when the multivariate time-series data exhibit complex temporal dependencies, inter-dimensional dependencies, and dynamics. By introducing a contrastive learning constraint on the hidden space of the autoencoder, ACAE improves the robustness of the model in modeling normal data, making anomalies easier to identify. ACAE obtains multiple enhanced views of each original sample using a multivariate time-series data augmentation method based on multi-scale timestamp masking, encouraging the model to use multi-scale information of the time series to model its overall trend. ACAE then performs contrastive learning based on a feature combination and decomposition proxy task, and improves the robustness of the encoder's hidden variables by introducing an adversarial training process. The framework avoids the loss of useful information that may arise when existing contrastive learning frameworks are applied to multivariate time-series data. Through joint training of the contrastive learning task and the reconstruction task, the model can simultaneously consider point-wise contextual information and overall trend information in the multivariate data of smart electric energy meters, improving the generalization ability of the model and the accuracy of anomaly detection.
Drawings
Exemplary embodiments of the present invention may be more completely understood in consideration of the following drawings:
FIG. 1 is a flow chart of an electric energy meter anomaly detection method based on an adversarial contrastive autoencoder provided by an exemplary embodiment of the present invention;
FIG. 2 is a structural diagram of the anomaly detection model based on the adversarial contrastive autoencoder provided by an exemplary embodiment of the present invention;
FIG. 3 is a schematic diagram of a sample pair construction process provided by an exemplary embodiment of the present invention;
FIG. 4 is a diagram of the structure of an encoder and decoder provided by an exemplary embodiment of the present invention;
FIG. 5 is a schematic diagram of a proxy task and discriminator based on feature combination and decomposition according to an exemplary embodiment of the invention;
FIG. 6 is a schematic diagram of an electric energy meter anomaly detection device based on an adversarial contrastive autoencoder provided by an exemplary embodiment of the present invention;
fig. 7 is a structure of an electronic device provided in an exemplary embodiment of the present invention.
Detailed Description
Hereinafter, exemplary embodiments according to the present invention will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some embodiments of the present invention and not all embodiments of the present invention, and it should be understood that the present invention is not limited by the example embodiments described herein.
It should be noted that: the relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless it is specifically stated otherwise.
It will be appreciated by those of skill in the art that the terms "first," "second," etc. in embodiments of the present invention are used merely to distinguish between different steps, devices or modules, etc., and do not represent any particular technical meaning nor necessarily logical order between them.
It should also be understood that in embodiments of the present invention, "plurality" may refer to two or more, and "at least one" may refer to one, two or more.
It should also be appreciated that any component, data, or structure referred to in an embodiment of the invention may be generally understood as one or more without explicit limitation or the contrary in the context.
In addition, the term "and/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate: A exists alone, A and B exist together, or B exists alone. In the present invention, the character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
It should also be understood that the description of the embodiments of the present invention emphasizes the differences between the embodiments, and that the same or similar features may be referred to each other, and for brevity, will not be described in detail.
Meanwhile, it should be understood that the sizes of the respective parts shown in the drawings are not drawn in actual scale for convenience of description.
The following description of at least one exemplary embodiment is merely exemplary in nature and is in no way intended to limit the invention, its application, or uses.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, the techniques, methods, and apparatus should be considered part of the specification.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
Embodiments of the invention are operational with numerous other general purpose or special purpose computing system environments or configurations with electronic devices, such as terminal devices, computer systems, servers, etc. Examples of well known terminal devices, computing systems, environments, and/or configurations that may be suitable for use with the terminal device, computer system, server, or other electronic device include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, small computer systems, mainframe computer systems, and distributed cloud computing technology environments that include any of the foregoing, and the like.
Electronic devices such as terminal devices, computer systems, servers, etc. may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc., that perform particular tasks or implement particular abstract data types. The computer system/server may be implemented in a distributed cloud computing environment in which tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computing system storage media including memory storage devices.
Exemplary method
Fig. 1 is a flowchart of a method for detecting an abnormality of an electric energy meter based on a contrast self-encoder according to an exemplary embodiment of the present invention. The embodiment can be applied to an electronic device, as shown in fig. 1, the method 100 for detecting an abnormality of a power meter based on a contrast self-encoder includes the following steps:
step 101, acquiring multi-variable long-time sequence data of historical detection of an electric energy meter to be detected;
102, carrying out normalization processing on multivariable long-time sequence data, and dividing a plurality of time window data with preset window length;
Step 103, inputting the plurality of time window data into a pre-trained anomaly detection model and outputting reconstruction data corresponding to each time window, wherein the anomaly detection model employs an adversarial contrastive autoencoder;
and step 104, determining an anomaly score for each time point of the time window data according to the reconstruction data of each time window, and determining the degree of anomaly at each time point according to the anomaly score.
Specifically, the invention provides a method for detecting anomalies in the multivariate time series (MTS) data of an electric energy meter based on an adversarial contrastive autoencoder (ACAE), which comprises the following parts:
1. description of the problem
Let T ∈ R^{M×T} denote a time series, where T represents the length of the time series and M represents its dimension; it is an observation sequence of T time steps acquired over M observed variables. If M = 1, the time series is a univariate time series; if M > 1, the time series is an MTS. In the present invention, the dimensions represented by M are specifically the following 22: A-phase current, B-phase current, C-phase current, A-phase voltage, B-phase voltage, C-phase voltage, forward active power indication value, reverse active power indication value, forward reactive power indication value, reverse reactive power indication value, A-phase active power, B-phase active power, C-phase active power, total active power, A-phase reactive power, B-phase reactive power, C-phase reactive power, total reactive power, A-phase power factor, B-phase power factor, C-phase power factor, and total power factor. The present invention is primarily concerned with detecting anomalies in MTS data. Let s_t ∈ R^M denote the observation of the time series at time t; then T can be expressed as T = {s_1, s_2, s_3, ..., s_T}. The goal of MTS anomaly detection is to detect the timestamps at which anomalies occur in the time series, specifically to calculate an anomaly score S_t for each time step of the time series and to flag time steps whose anomaly score exceeds a threshold as anomalous.
Multivariate Time Series (MTS): multivariate/multidimensional time series
Adversarial Contrastive Autoencoder (ACAE): adversarial contrastive autoencoder
2. Frame overview
A framework diagram of the proposed method is shown in fig. 2. ACAE mainly comprises modules for data preprocessing, sample pair construction, feature extraction, feature combination and decomposition, and reconstruction. First, the data are normalized and divided by a sliding window. Next, in order to obtain positive and negative sample pairs for time-series contrastive learning, multiple enhanced views of each sample are obtained through multi-scale timestamp masking as positive samples, and negative samples are generated by sampling. Then, the features of the data are extracted with an encoder, and contrastive learning is performed based on the feature combination and decomposition proxy task proposed by the invention: the features of a positive or negative sample pair are first combined, and the discriminator is then trained adversarially to predict the class of the combined feature and the proportion of each component. Finally, the decoder is trained to reconstruct the input samples, and an anomaly score is calculated for each time step from the reconstruction error.
3. Data preprocessing
The original multivariate time-series data usually need to be normalized before they can be used to train a deep learning model. Normalization eliminates the influence of differences in units and scales between features and maps the data distribution to the neighborhood of the non-saturated region of the neural network activation function, which greatly accelerates model training and improves the generalization performance of the model. MTS normalization is typically achieved by appropriately shifting and scaling the original sequence.
Z-score normalization is the most widely used MTS normalization scheme. It scales each dimension of each multivariate time series using the sample mean and standard deviation:

ŝ_t^k = (s_t^k - μ_k) / σ_k

where s_t^k denotes the value of the k-th feature of s_t, and μ_k and σ_k are the global mean and standard deviation of the k-th feature.

Equivalently, in vector form, the formula for Z-Score normalization is:

X̂_i = (X_i - μ) / σ

where X_i is the multivariate long time-series data, X̂_i denotes the normalized X_i, μ denotes the vector of the means of all sample data of each variable of X_i, and σ denotes the vector of the standard deviations of all sample data of each variable of X_i.
First, normalized scaling is applied to each dimension of the MTS data, aiming to eliminate large-scale differences between features and help the model converge faster. To capture the temporal correlation between time steps, a sliding window with stride d is used to obtain successive data segments as input to the model. The time window data x_i can be expressed as x_i = {x_{i,1}, x_{i,2}, x_{i,3}, ..., x_{i,w}}, x_i ∈ R^{M×w}, where M is the dimension of the data, w is the window size, and x_{i,t} ∈ R^M denotes the observation vector of x_i at time step t.
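As a purely illustrative aid (not part of the patent), the preprocessing described above could be sketched as follows; the function and parameter names are assumptions, and the window length and stride simply echo the values reported later in the implementation details.

```python
import numpy as np

def preprocess(series: np.ndarray, window: int = 128, stride: int = 8) -> np.ndarray:
    """Z-score normalize a (T, M) multivariate series and cut it into windows.

    series : array of shape (T, M), T time steps, M variables.
    Returns an array of shape (N, M, window) of overlapping windows.
    """
    mu = series.mean(axis=0)               # per-variable mean (vector of length M)
    sigma = series.std(axis=0) + 1e-8      # per-variable std, epsilon avoids division by zero
    normed = (series - mu) / sigma         # Z-score normalization

    windows = []
    for start in range(0, len(normed) - window + 1, stride):
        seg = normed[start:start + window]  # shape (window, M)
        windows.append(seg.T)               # store as (M, window) to match x_i in R^{M×w}
    return np.stack(windows)
```

The resulting (N, M, w) array corresponds to the window tensors x_i defined above and can be fed to the projection layer and encoder in batches.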
4. Data enhancement
Creating different views of the same sample by data augmentation is critical for contrastive learning. However, some data transformation methods currently used for time-series contrastive learning may destroy the patterns of the original data, making it difficult for the model to learn a reliable data representation. Mask learning has been applied successfully in machine vision and natural language processing: by learning from the visible portion of a sample and inferring the invisible portion, the model acquires a stronger representation ability. The invention obtains enhanced samples of time window data by adding a mask. Although a masked sample and the original sample may differ significantly at the data level, the masked sample retains the high-level semantic features of the original time series and does not introduce additional noise. By maximizing the similarity between the features of the masked sample and the original sample, the encoder gains some ability to infer the invisible portion and can better learn the main trend of the time series, which is critical to learning high-level semantic features of the time series.
Although the original input could be masked directly, in order for the model to distinguish between the visible and masked portions of the time-series data, the present invention applies a timestamp mask to the data of some time steps of each sample after the original data has been passed through a projection layer, because it is difficult to find a special marker value within the value range of the original time series. Intuitively, input data can usually be masked with the value 0, but the time series may already contain many time steps with value 0, and the model would have difficulty distinguishing which time steps are masked. In ACAE, a projection layer is therefore first applied to project the input vector of each time step into a higher-dimensional vector:

a_{i,t} = W·x_{i,t} + b

where x_{i,t} ∈ R^M, a_{i,t} ∈ R^{M'} and M' > M. The data representation a in the projection space is then processed by applying a timestamp mask, in which w×p time steps are sampled along the time axis of a and set to 0, where p is the configured masking probability; the sampling process is performed independently for each time window in each forward pass. It can be shown that there exists a set of parameters W, b such that no time step of the original time series is 0 in the projection space, so the network is able to distinguish which time steps are masked. Experimental results also show that applying the timestamp mask after the projection layer yields better performance. The projection layer is optimized together with the encoder.
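A minimal PyTorch sketch of the projection layer a_{i,t} = W·x_{i,t} + b and the timestamp mask is given below for illustration; the module and function names are assumptions, not the patent's code.

```python
import torch
import torch.nn as nn

class ProjectionLayer(nn.Module):
    """Project each time step from dimension M to a higher dimension M' (a = W x + b)."""
    def __init__(self, m: int, m_proj: int):
        super().__init__()
        self.linear = nn.Linear(m, m_proj)

    def forward(self, x):                      # x: (batch, M, w)
        a = self.linear(x.transpose(1, 2))     # apply per time step -> (batch, w, M')
        return a.transpose(1, 2)               # back to (batch, M', w)

def timestamp_mask(a: torch.Tensor, p: float) -> torch.Tensor:
    """Set a fraction p of time steps (whole columns of a) to zero, independently per window."""
    batch, _, w = a.shape
    n_mask = int(w * p)
    masked = a.clone()
    for b in range(batch):
        idx = torch.randperm(w)[:n_mask]       # sample w*p time steps along the time axis
        masked[b, :, idx] = 0.0
    return masked
```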
The specific construction process of the positive and negative sample pairs is shown in fig. 3. A series of enhanced samples of the input data can be obtained by adjusting the masking probability p; enhanced samples with different masking rates help the model extract multi-scale information from the time series and strengthen the effect of contrastive learning. For a projection-space representation a derived from the raw data, the invention obtains m enhanced views a^+ by setting different values of p, i.e., each sample has m positive sample pairs. Since a time-series dataset may contain many "false negative samples", and in order to reduce computational overhead, the invention does not follow the conventional contrastive learning practice of taking all other samples in a batch as negative samples of the current sample. For each original sample, the other samples of the same batch and their enhanced samples are regarded as the negative sample pool of that sample. Given a hyper-parameter n, the invention samples n samples from the other original samples and from each class of their enhanced samples, respectively, as negative samples of the original sample. Each sample therefore has n×(m+1) negative sample pairs.
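The sampling scheme above could be assembled as in the following sketch, which reuses the timestamp_mask helper from the previous sketch; the helper name build_pairs and the default values are illustrative assumptions.

```python
import torch

def build_pairs(a_batch, mask_rates=(0.05, 0.15, 0.3, 0.5), n_neg: int = 4):
    """a_batch: (B, M', w) projected windows.

    Returns, for every sample, m positive views (one per masking rate) and
    n_neg*(m+1) negative samples drawn from the other samples of the batch
    and from each class of their enhanced views."""
    B = a_batch.size(0)
    positives = [timestamp_mask(a_batch, p) for p in mask_rates]   # m views, each (B, M', w)

    negatives = []
    pools = [a_batch] + positives            # original samples + each class of enhanced samples
    for i in range(B):
        others = [j for j in range(B) if j != i]
        picks = []
        for pool in pools:                   # sample n_neg negatives from every pool
            idx = torch.tensor(others)[torch.randperm(len(others))[:n_neg]]
            picks.append(pool[idx])
        negatives.append(torch.cat(picks))   # (n_neg*(m+1), M', w)
    return positives, torch.stack(negatives) # negatives: (B, n_neg*(m+1), M', w)
```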
5. Feature extraction-encoder
After the positive and negative sample pairs are constructed, they are all sent to a Siamese encoder network, i.e., three weight-sharing residual networks, to obtain the low-dimensional hidden variables of all samples. This process can be expressed as:
z=E(a)
where E(·) denotes the encoder, a is the projection representation of the original time window data obtained from the projection layer (or an enhanced sample thereof), and z is the hidden vector of a, i.e., the feature of the original sample or of the positive/negative samples, with z ∈ R^h, where h is the length of the hidden vector.
The 1D convolutional neural network (1D-CNN) is one of the most widely used neural networks for time-series data analysis, and thus the present invention mainly uses the 1D-CNN network to extract features. Since ResNet exhibits excellent feature extraction performance in machine vision, the present invention implements an encoder with reference to the first two residual blocks of ResNet50, the details of which are shown on the left side of FIG. 4. Where (/ 2) in some convolutional layers indicates that when multiple identical residual blocks are stacked, only the first residual block halves the time dimension of the input data, and the subsequent residual blocks do not change the length of the time sequence.
The Siamese encoder is used to extract hidden vector representations of each training sample and of its corresponding positive and negative samples. The next section introduces how ACAE performs contrastive learning using the extracted positive and negative sample pair features, thereby improving the encoder's ability to learn high-level semantic features of time-series data.
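For illustration only, a condensed 1D-convolutional residual encoder in the spirit of the description might look as follows; the channel counts and block layout are assumptions and do not reproduce the exact ResNet50-derived configuration of FIG. 4.

```python
import torch
import torch.nn as nn

class ResBlock1D(nn.Module):
    """1D residual block; stride=2 in the first block of a stage halves the time axis."""
    def __init__(self, c_in, c_out, stride=1):
        super().__init__()
        self.conv1 = nn.Conv1d(c_in, c_out, kernel_size=3, stride=stride, padding=1)
        self.conv2 = nn.Conv1d(c_out, c_out, kernel_size=3, padding=1)
        self.bn1, self.bn2 = nn.BatchNorm1d(c_out), nn.BatchNorm1d(c_out)
        self.short = (nn.Conv1d(c_in, c_out, 1, stride=stride)
                      if (stride != 1 or c_in != c_out) else nn.Identity())

    def forward(self, x):
        h = torch.relu(self.bn1(self.conv1(x)))
        h = self.bn2(self.conv2(h))
        return torch.relu(h + self.short(x))

class Encoder(nn.Module):
    """Maps a projected window a in R^{M'×w} to a hidden vector z in R^h."""
    def __init__(self, m_proj=128, h=128):
        super().__init__()
        self.stages = nn.Sequential(
            ResBlock1D(m_proj, 64, stride=2), ResBlock1D(64, 64),    # stage 1, halves time axis
            ResBlock1D(64, 128, stride=2), ResBlock1D(128, 128),     # stage 2, halves time axis
        )
        self.pool = nn.AdaptiveAvgPool1d(1)
        self.fc = nn.Linear(128, h)

    def forward(self, a):                              # a: (batch, M', w)
        feat = self.pool(self.stages(a)).squeeze(-1)   # (batch, 128)
        return self.fc(feat)                           # z: (batch, h)
```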
6. Feature combination and decomposition
Instance discrimination is the mainstream proxy task of current contrastive learning: it directly computes the similarity between features using InfoNCE or its variant loss functions, maximizing the similarity between features of different views of the same sample and minimizing the similarity between features of different samples. A time-series anomaly detection task, however, trains the model using only normal data, which is essentially data of a single class. The InfoNCE loss may cause the features of such samples to become too dispersed in the hidden space, resulting in the loss of some useful information. Therefore, applying existing contrastive learning frameworks to time-series data makes it difficult to obtain robust data representations and is not conducive to the downstream anomaly detection task.
The invention provides a proxy task based on feature combination and decomposition. The positive sample pair or the negative sample pair is combined to generate positive mixed characteristics or negative mixed characteristics, and then the discriminator is trained to decompose the mixed characteristics. If the discriminator can correctly predict the components of the negative mixed features and hardly predict the components of the positive mixed features, a certain difference exists between the features of the negative sample pair, and the features of the positive sample pair have higher similarity, so that contrast learning is realized. The proxy task proposed by the invention comprises two parts, namely feature combination and feature decomposition.
Feature combination. After obtaining the features of an original training sample and of its positive and negative contrast samples, ACAE combines the features of a positive sample pair or of a negative sample pair in a certain proportion to obtain a positive mixed feature or a negative mixed feature. As shown in FIG. 5, z is the hidden vector representation of a training sample, and z^+ and z^- are the hidden vector representations of one of its positive and negative samples respectively. Combining z with z^+ and with z^- constructs the positive and negative mixed features as follows:

z_c^+ = α·z + (1-α)·z^+
z_c^- = β·z + (1-β)·z^-

where α and β are combination parameters that determine the contribution of each original feature to the mixed feature. In each forward pass, when the mixed features are generated for each training sample, α and β are each sampled from a uniform distribution with minimum 0 and maximum 1, i.e., α ~ U(0, 1), β ~ U(0, 1).
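A small sketch of this feature-combination step, following the mixing relations given above with α, β ~ U(0, 1); the function name and tensor shapes are illustrative assumptions.

```python
import torch

def mix_features(z, z_pos, z_neg):
    """Combine a sample's feature with one positive and one negative feature.

    z, z_pos, z_neg: (batch, h) hidden vectors.
    Returns the positive/negative mixed features and the sampled proportions.
    """
    alpha = torch.rand(z.size(0), 1, device=z.device)   # alpha ~ U(0, 1)
    beta = torch.rand(z.size(0), 1, device=z.device)    # beta ~ U(0, 1)
    z_mix_pos = alpha * z + (1 - alpha) * z_pos          # positive mixed feature
    z_mix_neg = beta * z + (1 - beta) * z_neg            # negative mixed feature
    return z_mix_pos, z_mix_neg, alpha, beta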
Feature decomposition. ACAE implements the discriminator with a multi-layer perceptron, as shown in FIG. 5, which is used to separate the pre-combination components from the mixed features. Specifically, the original feature of a sample and a mixed feature generated from it are fed to the discriminator, and the multi-layer perceptron finally outputs a vector of length 2 as the result of feature decomposition. The discriminator has two objectives: to correctly classify the class of the mixed feature, and to correctly predict the proportion of the original feature in the mixed feature, i.e., the combination parameter (α or β) that generated the mixed feature. The first element of the output vector represents the discriminator's prediction of the mixed feature class, with 1 denoting a positive mixed feature and 0 a negative mixed feature. The second element represents the discriminator's prediction of the proportion of the mixed feature components, whose target value is α or β. The invention uses a mean squared error loss to constrain the difference between the discriminator's prediction vector and the true value, so the loss L_d of the discriminator is the mean squared error between its prediction vectors and the true labels over all positive and negative mixed features, where N is the number of training windows, the prediction vectors are the decomposition results of the discriminator for the j-th positive mixed feature and the k-th negative mixed feature of the i-th sample, l_d^+ and l^- are the true labels with l_d^+ = (1, α) and l^- = (0, β), m is the number of positive samples of one sample, n+n×m is the number of negative samples, and α and β are the combination parameters that determine the contribution of each original feature in the mixed feature, with β = 1 - α.
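The discriminator described above could be sketched as the following multi-layer perceptron, which takes the original feature together with a mixed feature and outputs a length-2 vector (predicted class, predicted proportion); the layer widths are assumptions, and the 0.5 dropout in the first two layers follows the implementation details reported later.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Predicts (class, proportion) for a mixed feature given the original feature."""
    def __init__(self, h=128, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * h, hidden), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(hidden, 2),        # [class prediction, proportion prediction]
        )

    def forward(self, z, z_mix):
        return self.net(torch.cat([z, z_mix], dim=-1))   # (batch, 2)
```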
If the discriminator can correctly decompose a mixed feature, there is some difference between the two components that make up that mixed feature. At the same time, according to the idea of contrastive learning, the similarity between the features of different views of the same sample should be maximized. When the feature z of a sample and the feature z^+ of its augmented sample are identical, the mixed feature z_c^+ obtained with any α is identical to z, and the discriminator should predict the proportion value as 1, i.e., the mixed feature comes entirely from the original sample. Following this idea, the encoder is trained under the guidance of the discriminator with the goal that negative mixed features can be decomposed correctly while positive mixed features cannot. The idea of generative adversarial training is introduced into the discriminator's prediction of the combination parameter α of positive mixed features, so that the encoder generates positive sample pair features that are as similar as possible in order to confuse the discriminator. The loss of the encoder (acting as the generator) is defined analogously to L_d as a mean squared error, where the decomposition results of the discriminator for the j-th positive mixed feature and the k-th negative mixed feature of the i-th sample are compared against the true labels l_E^+ and l^-, with l_E^+ = (1, 1). The change of the positive label from (1, α) to (1, 1) is the difference between the encoder and discriminator loss functions, and is also the embodiment of the adversarial idea. Alternately training the discriminator and the encoder constrains the encoder to extract hidden vectors that are as similar as possible from different views of the same sample and dissimilar hidden vectors from different samples, thereby learning a robust data representation.
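For illustration, the two mean-squared-error losses with targets (1, α), (0, β) for the discriminator and (1, 1), (0, β) for the encoder could be computed as sketched below for one positive and one negative mixed feature per sample; the averaging over all m positive and n·(m+1) negative pairs in the patent is not reproduced here.

```python
import torch
import torch.nn.functional as F

def contrastive_losses(disc, z, z_mix_pos, z_mix_neg, alpha, beta):
    """MSE losses for the discriminator (L_d) and the encoder/generator (L_E)."""
    pred_pos = disc(z, z_mix_pos)                        # (batch, 2)
    pred_neg = disc(z, z_mix_neg)

    ones, zeros = torch.ones_like(alpha), torch.zeros_like(alpha)
    # Discriminator targets: positive mix -> (1, alpha), negative mix -> (0, beta)
    l_d = (F.mse_loss(pred_pos, torch.cat([ones, alpha], dim=1)) +
           F.mse_loss(pred_neg, torch.cat([zeros, beta], dim=1)))
    # Encoder (adversarial) targets: positive mix -> (1, 1), negative mix -> (0, beta)
    l_e = (F.mse_loss(pred_pos, torch.cat([ones, ones], dim=1)) +
           F.mse_loss(pred_neg, torch.cat([zeros, beta], dim=1)))
    return l_d, l_e
```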
The contrastive learning framework based on feature combination and decomposition proposed by the invention is consistent with the basic idea of the instance discrimination proxy task, but it adopts a discrimination approach in place of directly computing the similarity between features and does not constrain the distribution of the features, making it more suitable for time-series anomaly detection tasks.
7. Reconstruction and anomaly detection
The contrastive learning task helps the encoder learn the overall trend and high-level semantic features of the time series, while the reconstruction task focuses the model on fine-grained features. Moreover, training the two tasks in separate stages may lead to sub-optimal model performance, so ACAE trains the contrastive learning task and the reconstruction task simultaneously. The invention implements a decoder approximately symmetric to the encoder structure, whose main objective is to reconstruct the input data from the hidden vectors of the samples; its structure is shown on the right side of FIG. 4. Let D(·) denote the decoder; this process can be expressed as:
x'=D(z)
where x' ∈ R^{M×w} is the reconstructed data for sample x. The reconstruction loss can be expressed as:

L_recon = (1/(N·w)) Σ_{i=1}^{N} Σ_{t=1}^{w} ||x̂_{i,t} - x_{i,t}||^2

where N is the number of training windows, w is the window size, and x̂_{i,t} and x_{i,t} are respectively the reconstructed data and the original data at time step t in the i-th window. The overall training loss of the proposed method can be expressed as the sum of the reconstruction loss and the contrastive losses:
L = L_recon + λ_1·L_d + λ_2·L_E
where λ_1 and λ_2 are hyper-parameters that control the relative importance of each loss. The invention adopts an alternating optimization method in the model optimization stage, and the training process of the proposed method is summarized in Algorithm 1.
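Since Algorithm 1 itself is not reproduced in this text, the following highly simplified sketch only illustrates one alternating optimization step combining L_recon, L_d and L_E; it reuses the timestamp_mask, mix_features and contrastive_losses sketches above, and the way negatives are drawn here (a single shuffled batch) is a simplification, not the patent's procedure.

```python
import torch

def train_step(batch, proj, enc, dec, disc, opt_model, opt_disc, lam1=1.0, lam2=1.0):
    """One alternating update: first the discriminator on L_d, then encoder/decoder on the total loss."""
    a = proj(batch)                                   # projected windows (B, M', w)
    a_pos = timestamp_mask(a, p=0.15)                 # one enhanced (positive) view
    a_neg = a[torch.randperm(a.size(0))]              # simplified negative: another sample of the batch

    z, z_pos, z_neg = enc(a), enc(a_pos), enc(a_neg)
    z_mix_pos, z_mix_neg, alpha, beta = mix_features(z, z_pos, z_neg)

    # 1) update the discriminator to minimize L_d (encoder outputs detached)
    l_d, _ = contrastive_losses(disc, z.detach(), z_mix_pos.detach(),
                                z_mix_neg.detach(), alpha, beta)
    opt_disc.zero_grad(); l_d.backward(); opt_disc.step()

    # 2) update projection layer, encoder and decoder on L = L_recon + lam1*L_d + lam2*L_E
    l_d2, l_e = contrastive_losses(disc, z, z_mix_pos, z_mix_neg, alpha, beta)
    x_hat = dec(z)                                    # reconstruction of the input window (B, M, w)
    l_recon = torch.mean((x_hat - batch) ** 2)
    loss = l_recon + lam1 * l_d2 + lam2 * l_e
    opt_model.zero_grad(); loss.backward(); opt_model.step()
    return l_recon.item(), l_d.item(), l_e.item()
```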
ACAE is optimized using a training set that is mostly normal data to learn the normal pattern of the data. After training is completed, the encoder and decoder are retained and the discriminator module is discarded. In the test phase, as with many other reconstruction-based methods, the ACAE uses an encoder and a decoder to reconstruct the test samples and calculates their anomaly scores from the reconstruction errors of each time-step sample, namely:
S_{i,t} = ||x̂_{i,t} - x_{i,t}||^2

where S_{i,t} is the anomaly score at time step t of the i-th test data. The choice of threshold depends on the application scenario, and many studies dynamically configure the threshold according to the anomaly scores. The present invention focuses on a framework for learning high-level semantic features of the data and for anomaly detection; the reported experimental results are based on the threshold that yields the highest score, as in previous work.
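A sketch of the test-phase scoring described above, computing the per-time-step reconstruction error as the anomaly score; the threshold argument is left to the caller, since threshold selection depends on the application.

```python
import torch

@torch.no_grad()
def anomaly_scores(windows, proj, enc, dec, threshold=None):
    """windows: (N, M, w) test windows as a torch tensor. Returns per-time-step scores (N, w)."""
    x_hat = dec(enc(proj(windows)))                 # reconstruct each window
    scores = ((x_hat - windows) ** 2).sum(dim=1)    # squared error summed over the M variables
    labels = None if threshold is None else (scores > threshold)
    return scores, labels
```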
Pseudo code for an implementation of the present application is shown in table 1.
TABLE 1
In addition, the application verifies the validity of the scheme through specific data, and the method is specifically as follows:
the effectiveness and the advancement of the provided electric energy meter multidimensional time sequence anomaly detection method based on the contrast comparison self-encoder are proved by performing comparison experiments on 5 authoritative real world data set data sets representing time sequence data distribution diversity and intelligent electric meter actual data sets on ACAE and 14 advanced models.
Evaluation index: AUC, fc1 and PA% K were chosen as evaluation indicators to evaluate the performance of the proposed method and baseline.
Area Under Curve (AUC). AUC is one of the most popular metrics for evaluating unsupervised anomaly detection tasks; it is the area under the receiver operating characteristic (ROC) curve. AUC directly reflects the quality of the ranking of the anomaly scores of the test samples, excluding the influence of the threshold. AUC ranges from 0 to 1: a perfect ranking yields an AUC of 1, whereas a model that guesses randomly has an AUC close to 0.5.
Composite F-score (Fc1). Fc1 is a metric proposed in the recent literature that focuses on the model's ability to detect abnormal events and avoids the overestimation of model performance caused by point-adjustment strategies. It changes the recall computation of the original F1 score to an event-wise recall over anomaly segments while keeping the original point-wise precision computation. A model with a higher recall of anomaly segments and fewer false positives at normal time steps achieves a higher Fc1 score.
PointAdjust%K (PA%K). PA%K was also proposed in the recent literature to address the overestimation of model performance caused by point-adjustment strategies. It likewise computes the F1 score over all time steps, but applies the point-adjustment strategy to a continuous anomaly segment only when the proportion of anomalous time steps detected by the model within that segment exceeds K percent of the segment length. The dependence on K is then reduced by varying K and computing the area under the PA%K curve.
The comparison method comprises the following steps:
the proposed ACAE method was compared with 14 baseline methods as shown below. Where LOF, OSVM, ifest is a classical machine-learning based anomaly detection method, while others are all recently popular deep-learning based time-series anomaly detection algorithms.
LOF. A method for anomaly detection by calculating the local density deviation of a given data point relative to its neighborhood.
OCSVM. A method of mapping data samples to a high-dimensional feature space by a kernel function and partitioning positive anomaly boundaries.
iForest. An integrated model that isolates anomalies by randomly selecting features and randomly segmenting observations.
MSCRED. A model that obtains feature maps of a sample at different scales with a convolutional neural network and performs multi-scale reconstruction using an attention-based convolutional long short-term memory network.
BeatGAN. A model based on an antagonistic self-encoder structure adds a discriminator to the original self-encoder structure for improving the authenticity of self-encoder reconstruction.
USAD. An anomaly detection method based on two automatic encoders trained in a antagonistic manner to reconstruct data. The reconstruction errors of the two automatic encoders are used to calculate the anomaly score.
UAE. A simple full-link self-encoder based model was proposed along with Fc1 and gave good results in the Fc1 index.
Interfusion. A reconstruction model based on two hierarchical variational self-encoders models the inter-metric and temporal correlations of MTS, respectively.
GDN. A model that learns the structure of the MTS with a graph neural network and an attention mechanism and predicts future values; the prediction error is used to detect anomalies.
GTA. A prediction model that mines MTS features by combining a graph neural network and a Transformer; it also uses the error between predicted and observed values to calculate anomaly scores.
TranAD. A deep Transformer-based reconstruction model that uses self-conditioning and adversarial training to amplify errors and obtain stable performance.
AT. A Transformer-based model that jointly detects anomalies through reconstruction errors and association differences between sequences.
TimeCLR. A contrastive learning framework for time-series representation that uses dynamic time warping for data augmentation and InceptionTime for feature extraction.
CAE_AD. An end-to-end autoencoder combined with contrastive learning that performs both context contrast and instance contrast.
Implementation details:
ACAE was implemented based on PyTorch. In the data augmentation phase, 4 enhanced views are generated for each training sample using four timestamp masks with masking rates of 0.05, 0.15, 0.3, and 0.5 respectively, and the optimal number of negative samples n is searched for each dataset from {4, 8, 12, 16, 20}. The dimension M' of the projection layer and the dimension h of the hidden vector are both set to 128; a dropout of 0.5 is applied in the first two layers of the discriminator; λ_1 and λ_2 are both set to 1. The length w of the sliding window is set to 128 with a stride of 8; the batch size is set to 128; the training data are split into a training set and a validation set in a ratio of 8:2. The maximum number of training epochs is set to 200, but training stops early when the validation reconstruction loss has not decreased for 5 consecutive epochs, and the model with the lowest validation reconstruction loss is kept. ACAE was trained with the Adam optimizer with a learning rate of 1e-4 and a weight decay of 1e-4. All experiments were repeated 5 times with different random seeds and the average results are reported.
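For convenience, the hyper-parameters reported above could be gathered in a configuration object such as the following sketch; the values are transcribed from the paragraph above, while the structure and field names are assumptions.

```python
from dataclasses import dataclass

@dataclass
class ACAEConfig:
    mask_rates: tuple = (0.05, 0.15, 0.3, 0.5)   # four timestamp-mask rates (m = 4 views)
    neg_candidates: tuple = (4, 8, 12, 16, 20)   # search space for the negative-sample count n
    proj_dim: int = 128                          # projection dimension M'
    hidden_dim: int = 128                        # hidden vector length h
    dropout: float = 0.5                         # dropout in the first two discriminator layers
    lambda1: float = 1.0                         # weight of L_d
    lambda2: float = 1.0                         # weight of L_E
    window: int = 128                            # sliding-window length w
    stride: int = 8                              # sliding-window step
    batch_size: int = 128
    train_val_split: float = 0.8                 # 8:2 train/validation split
    max_epochs: int = 200
    early_stop_patience: int = 5                 # epochs without validation improvement
    lr: float = 1e-4                             # Adam learning rate
    weight_decay: float = 1e-4
    seeds: int = 5                               # experiments repeated 5 times
```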
Introduction of the public dataset:
5 real world datasets from three scenes were used and table 2 summarizes the attributes of these datasets.
Secure Water Treatment (SWaT). SWaT is data collected by 51 sensors of a continuously operating water treatment system in which anomalies caused by network and physical attacks are recorded.
Server Machine Dataset (SMD). SMD is a data set collected and published by a large internet company for 5 weeks from a server machine with 38 monitoring metrics.
PSM (Pooled Server Metrics). PSM is a data set collected from within multiple application server nodes of eBay, for a total of 26 dimensions.
Mars Science Laboratory (MSL) dataset and Soil Moisture Active Passive (SMAP) dataset. Both the MSL and SMAP datasets are real world datasets from NASA, having 55 and 25 dimensions, respectively, containing telemetry anomaly data derived from incident anomaly (Incident Surprise Anomaly, ISA) reports of spacecraft monitoring systems.
TABLE 2 Characteristics of the public datasets
Public data set outcome evaluation:
table 3 reports AUC, fc1 and PA% K scores for ACAE and other baseline models. We represent the best score on each dataset with bold fonts and underline the suboptimal score on each dataset. Overall, ACAE achieved the highest score on all 3 indicators and was significantly higher than the other baselines, indicating that ACAE performance was better than all baseline models. The cae_ad achieved suboptimal Fc1 score and PA% K score, again demonstrating that the reconstructed model combined with contrast learning can better model the normal pattern of the data. Whereas iferst achieves a suboptimal AUC score, demonstrating that it ranks better for each time step anomaly score, but because it cannot take into account time correlation, it may not detect consecutive anomalies. Specifically, on the SWAT dataset, the Fc1 score of ACAE was much higher than the other baseline, while both AUC and PA% K scores were slightly less than optimal. On the SMD and PSM data sets, although the AUC score of ACAE is at a higher level, the F1c and PA% K scores are at a distance from the optimal score. On the MSL dataset, the AUC score for ACAE was far higher than for all baseline methods, fc1 and PA% K scores were slightly less than optimal. And on average over the SMAP dataset.
Table 3 AUC, Fc1 and PA%K of the baseline methods and ACAE
Introduction of the electric energy meter dataset:
Specific data characteristics of the smart meter dataset (ELE) are shown in Table 4. The dataset was collected from 9 three-phase electric energy meter devices at multiple station areas, and each device records 22 sensor values: current (phase A, phase B, phase C), voltage (phase A, phase B, phase C), power indication (forward active), power indication (reverse active), power indication (forward reactive), power indication (reverse reactive), active power (phase A, phase B, phase C, total), reactive power (phase A, phase B, phase C, total) and power factor (phase A, phase B, phase C, total).
Table 4 Characteristics of the actual electric energy meter dataset
During their respective data-recording periods, the three-phase meter devices exhibit various types of anomalies, such as reverse power flow, overcurrent, loss of current, the meter running in reverse, the meter running too fast, unbalanced meter register values, meter stoppage and abnormal reverse energy. The dataset contains 9 to 16 months of continuous data collected from each meter device at 96 sampling points per day (15-minute intervals), including both normal meter measurement data and abnormal meter data. The experiments use data intervals containing only normal data for training and data containing anomalies for testing.
The results on the actual electric energy meter dataset are evaluated as shown in Table 5:
The ACAE algorithm and the baseline methods were run on the smart meter dataset, giving the results shown in Table 5. From the results it can be observed that ACAE achieves the highest score under all three evaluation metrics, so its anomaly detection performance on actual meter data can be considered significantly better than that of the comparison baseline methods, which is consistent with the results exhibited on the public datasets.
Table 5 Results on the actual electric energy meter dataset
Therefore, aiming at the problems that the existing reconstruction-based multivariate time-series anomaly detection methods for smart electric energy meters have difficulty learning the overall trend of the data, and that the data augmentation and proxy tasks of existing contrastive learning methods are not suited to time-series anomaly detection, the application proposes an adversarial contrastive self-encoder framework (ACAE) for multivariate time-series anomaly detection of electric energy meters. ACAE addresses the difficulty that prior methods have in learning high-level semantic features of the data when the multivariate time-series data exhibit complex temporal dependency, dimensional dependency and dynamics. ACAE improves the robustness of the model's normal-data modeling by introducing a contrastive learning constraint on the hidden space of the encoder, so that anomalies are easier to identify. ACAE obtains multiple enhanced views of the original sample using a multivariate time-series data augmentation method based on multi-scale timestamp masks, encouraging the model to capture the overall trend of the time series from its multi-scale information. ACAE then performs contrastive learning based on a feature-combination-and-decomposition proxy task, and improves the robustness of the encoder's hidden variables by introducing an adversarial training process. The framework avoids the loss of useful information that may occur when existing contrastive learning frameworks are applied to multivariate time-series data. Through the joint training of the contrastive learning task and the reconstruction task, the model can simultaneously consider point-wise contextual information and overall trend information in the multivariate data of the smart electric energy meter, improving the generalization ability of the model and the accuracy of anomaly detection.
Exemplary apparatus
Fig. 6 is a schematic structural diagram of an abnormality detection device for an electric energy meter based on a contrast self-encoder according to an exemplary embodiment of the present invention. As shown in fig. 6, the apparatus 600 includes:
an acquisition module 610, configured to acquire historical multivariate long time-series data of the electric energy meter to be detected;
a dividing module 620, configured to normalize the multivariate long time-series data and divide it into a plurality of time window data segments of a preset window length;
an output module 630, configured to input the plurality of time window data into a pre-trained anomaly detection model and output the reconstructed data corresponding to each time window data, wherein an adversarial contrastive self-encoder is adopted in the anomaly detection model;
a determining module 640, configured to determine an anomaly score for each time point of the time window data from the reconstructed data of each time window data and the original time window data, and to determine the degree of anomaly of each time point according to the anomaly score.
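To make the flow of these modules concrete, the following minimal sketch walks a raw series through normalization, windowing, reconstruction and scoring; the model interface, the small constant added to the standard deviation and the optional threshold are illustrative assumptions rather than parts of the patent.

```python
import numpy as np
import torch

def detect_anomalies(series, model, window=128, stride=8, threshold=None):
    """Sketch of the device pipeline: normalize -> window -> reconstruct -> score.
    `series` is a (T, M) array of meter readings; `model` is assumed to map a
    (batch, window, M) tensor to a reconstruction of the same shape."""
    # Z-Score normalization per variable (acquisition and dividing modules)
    mu, sigma = series.mean(axis=0), series.std(axis=0) + 1e-8  # epsilon avoids division by zero
    normed = (series - mu) / sigma
    # split into sliding windows of the preset window length
    starts = range(0, len(normed) - window + 1, stride)
    windows = np.stack([normed[s:s + window] for s in starts])
    with torch.no_grad():
        x = torch.tensor(windows, dtype=torch.float32)
        x_hat = model(x)                          # output module: reconstruction
    # determining module: point-wise anomaly score as reconstruction error per time step
    scores = ((x_hat - x) ** 2).sum(dim=-1)       # shape (num_windows, window)
    if threshold is not None:
        return scores, scores > threshold
    return scores
```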
Optionally, the multivariate long time series data comprises: phase A current, phase B current, phase C current, phase A voltage, phase B voltage, phase C voltage, forward active power indication, reverse active power indication, forward reactive power indication, reverse reactive power indication, phase A active power, phase B active power, phase C active power, active power total, phase A reactive power, phase B reactive power, phase C reactive power, reactive power total, phase A power factor, phase B power factor, phase C power factor, and power factor total.
Optionally, after obtaining the multivariate long-time series data of the historical detection of the electric energy meter to be measured, the apparatus 600 further comprises:
a normalization module for normalizing all values of each variable in the multivariate long time series data to a standard normal distribution using Z-Score normalization, wherein
The formula for Z-Score normalization is:
$$\hat{X}_i = \frac{X_i - \mu}{\sigma}$$
wherein $X_i$ is the multivariate long time-series data, $\hat{X}_i$ represents the normalized $X_i$, $\mu$ represents the mean of all sample data for each variable in $X_i$, and $\sigma$ represents the standard deviation of all sample data for each variable in $X_i$.
Optionally, the training process of the anomaly detection model in the output module 630 is as follows:
an acquisition submodule, used for acquiring multivariate time-series data samples of the historical detection of a plurality of electric energy meters and combining the multivariate time-series data samples into one multivariate long time-series data sample;
a normalization sub-module for normalizing all values of each variable in the multivariate long time series data samples to a standard normal distribution using Z-Score;
the windowing submodule is used for windowing the standardized multivariable long-time sequence data samples and dividing the standardized multivariable long-time sequence data samples into a plurality of time window data samples of a preset window;
the first determining submodule is used for inputting a plurality of time window data samples into the projection layer and then applying a time stamp mask to carry out data enhancement to determine positive and negative sample pairs;
The second determining submodule is used for inputting the positive and negative sample pairs into the encoder to perform feature extraction and determining positive and negative sample pair features;
and a third determining submodule, used for performing model training with the positive and negative sample pair features, the encoder, the discriminator and the decoder, and determining the anomaly detection model.
Optionally, the first determining submodule includes:
the first acquisition unit is used for inputting the time window data samples into the projection layer and acquiring projection representation of each sample;
a second acquisition unit for acquiring positive and negative samples of each time window data sample using the multi-scale timestamp mask and random sampling;
a first determining unit, configured to take the plurality of enhanced samples of one sample, obtained with different mask probabilities in the timestamp masking, as its positive samples and determine positive sample pairs;
and a second determining unit, configured to extract a predetermined number of samples from the other original time window data samples and from each class of enhanced samples, respectively, as negative samples of the original time window data, and to determine the negative sample pairs of each sample.
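A minimal sketch of this multi-scale timestamp-mask augmentation and positive/negative sampling is given below; the mask rates follow the implementation details above, while the exact masking operation (zeroing whole time steps) and the sampling routine are assumptions.

```python
import random
import torch

def timestamp_mask(window, mask_rate):
    """Randomly zero out whole time steps of a (w, M') projected window with probability mask_rate."""
    keep = (torch.rand(window.shape[0], 1) >= mask_rate).float()  # one mask value per timestamp
    return window * keep

def build_views(window, mask_rates=(0.05, 0.15, 0.3, 0.5)):
    """Multi-scale enhanced views of one window; these act as its positive samples."""
    return [timestamp_mask(window, r) for r in mask_rates]

def sample_negatives(windows, views_per_window, idx, n):
    """Negatives of window idx: n other original windows plus n enhanced views per mask scale,
    giving n + n*m negatives in total (m = number of mask scales)."""
    others = [i for i in range(len(windows)) if i != idx]
    negatives = [windows[i] for i in random.sample(others, n)]
    m = len(views_per_window[0])
    for c in range(m):  # one class of enhanced views per mask rate
        negatives += [views_per_window[i][c] for i in random.sample(others, n)]
    return negatives
```

With m mask scales, each window thus receives m positive views and n + n×m negatives, which matches the sample counts referenced in the loss descriptions later in this section.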
Optionally, the positive and negative sample pairs are input to an encoder for feature extraction, and the process of determining the positive and negative sample pair features is expressed as:
z=E(a)
where E(·) represents the encoder, z is the hidden vector, i.e., the positive and negative sample pair feature, $z \in \mathbb{R}^h$, h is the length of the hidden vector, and a is the projected representation obtained after the original time-series window data is input to the projection layer.
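Since the text fixes only the interface z = E(a) and the dimensions M' and h, the sketch below uses a linear projection and a GRU purely as placeholders for the projection layer and the encoder; the actual internal architecture is not specified here.

```python
import torch.nn as nn

class Projection(nn.Module):
    """Maps each time step of a (batch, w, M) window to an M'-dimensional representation a."""
    def __init__(self, in_dim, proj_dim=128):
        super().__init__()
        self.proj = nn.Linear(in_dim, proj_dim)

    def forward(self, x):
        return self.proj(x)

class Encoder(nn.Module):
    """E(.): produces a hidden vector z in R^h from the projected window a.
    A GRU is used only as a placeholder; the patent does not fix the architecture."""
    def __init__(self, proj_dim=128, hidden_dim=128):
        super().__init__()
        self.rnn = nn.GRU(proj_dim, hidden_dim, batch_first=True)

    def forward(self, a):
        _, h_n = self.rnn(a)       # h_n: (1, batch, hidden_dim)
        return h_n.squeeze(0)      # z:   (batch, hidden_dim)
```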
Optionally, the third determining sub-module comprises:
the third determining unit is used for carrying out characteristic recombination on the positive and negative sample pair characteristics and determining positive and negative sample pair combination characteristics;
the third acquisition unit is used for inputting the positive and negative sample pair characteristics and the positive and negative sample pair combined characteristics into the discriminator to acquire a prediction result and discriminator loss;
a first updating unit for updating the discriminator to minimize the discriminator loss, and acquiring the prediction result again;
a fourth determining unit for inputting the prediction result to the encoder, and determining the encoding result and the encoding loss;
the second updating unit is used for updating the encoder, minimizing the encoding loss and acquiring the encoding result again;
a fifth determining unit for determining a reconstruction loss according to the positive and negative sample pair characteristics reconstructed by the encoder and the decoder;
and a sixth determining unit for updating the encoder and decoder, minimizing reconstruction loss, and determining an anomaly detection model.
Optionally, the process of feature recombination of the positive and negative sample pair features is as follows:
$$z_c^{+} = \alpha z + \beta z^{+}, \qquad z_c^{-} = \alpha z + \beta z^{-}, \qquad \beta = 1 - \alpha$$
where α and β are the combination parameters that determine the contribution of each old feature in the blended feature, z is the hidden vector representation of a training sample, $z^{+}$ and $z^{-}$ are the hidden vector representations of one of its positive and one of its negative samples, and $z_c^{+}$ and $z_c^{-}$ are the positive and negative combined features, respectively;
The loss function $L_d$ of the discriminator is:
where N is the number of training windows, $p_{ij}$ and $p_{ik}$ denote the decomposition results of the discriminator on the j-th positive mixed feature and the k-th negative mixed feature of the i-th sample, respectively, $l_d^{+}$ and $l^{-}$ are the real labels, where $l_d^{+} = (1, \alpha)$ and $l^{-} = (0, \beta)$, m is the number of positive samples of one sample, n + n×m is the number of negative samples, and α and β are the combination parameters that determine the contribution of each old feature in the blended feature, with β = 1 − α;
The loss function $L_E$ of the encoder is:
where $p_{ij}$ and $p_{ik}$ denote the decomposition results of the discriminator on the j-th positive mixed feature and the k-th negative mixed feature of the i-th sample, respectively, $l_E^{+}$ and $l^{-}$ are the real labels, where $l_E^{+} = (1, 1)$ and $l^{-} = (0, \beta)$, m is the number of positive samples of one sample, and n + n×m is the number of negative samples;
The reconstruction loss function $L_{recon}$ is:
$$L_{recon} = \frac{1}{Nw}\sum_{i=1}^{N}\sum_{t=1}^{w}\left\lVert \hat{x}_{i,t} - x_{i,t}\right\rVert_2^{2}$$
where N is the number of training windows, w is the window size, and $\hat{x}_{i,t}$ and $x_{i,t}$ are the reconstructed data and the original data at time step t in the i-th window, respectively.
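To make the alternating updates of the discriminator, encoder and decoder concrete, the sketch below implements one training step under explicit assumptions: the mixed features use the linear combination given above, mean-squared error against the two-element labels stands in for the unspecified loss form, the λ1/λ2 weighting is omitted, and D, Dec, opt_d, opt_e and opt_ae denote an assumed discriminator, decoder and separate optimizers.

```python
import torch
import torch.nn.functional as F

def acae_train_step(x, views, negs, P, E, Dec, D, opt_d, opt_e, opt_ae, alpha=0.5):
    """One ACAE training step following the update order above (a sketch, not the patent's exact losses).
    x: (B, w, M) original windows; views: list of m enhanced views; negs: list of negative windows."""
    beta = 1.0 - alpha
    z = E(P(x))                                             # anchor hidden vectors, (B, h)
    z_pos = torch.stack([E(P(v)) for v in views], dim=1)    # (B, m, h)
    z_neg = torch.stack([E(P(v)) for v in negs], dim=1)     # (B, n + n*m, h)

    zc_pos = alpha * z.unsqueeze(1) + beta * z_pos          # combined positive features
    zc_neg = alpha * z.unsqueeze(1) + beta * z_neg          # combined negative features

    l_pos_d = x.new_tensor([1.0, alpha])                    # discriminator target for positives
    l_neg   = x.new_tensor([0.0, beta])                     # target for negatives
    l_pos_e = x.new_tensor([1.0, 1.0])                      # encoder target for positives

    # 1) update the discriminator so that it decomposes the mixed features correctly
    p_pos, p_neg = D(zc_pos.detach()), D(zc_neg.detach())   # each (..., 2)
    loss_d = F.mse_loss(p_pos, l_pos_d.expand_as(p_pos)) + F.mse_loss(p_neg, l_neg.expand_as(p_neg))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # 2) update the encoder so that positive mixtures become indistinguishable from pure anchors
    p_pos, p_neg = D(zc_pos), D(zc_neg)
    loss_e = F.mse_loss(p_pos, l_pos_e.expand_as(p_pos)) + F.mse_loss(p_neg, l_neg.expand_as(p_neg))
    opt_e.zero_grad(); loss_e.backward(); opt_e.step()

    # 3) update encoder and decoder with the reconstruction loss
    x_hat = Dec(E(P(x)))
    loss_recon = F.mse_loss(x_hat, x)
    opt_ae.zero_grad(); loss_recon.backward(); opt_ae.step()
    return loss_d.item(), loss_e.item(), loss_recon.item()
```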
Optionally, the calculation formula of the anomaly score is:
$$S_{i,t} = \left\lVert \hat{x}_{i,t} - x_{i,t}\right\rVert_2^{2}$$
where $S_{i,t}$ is the anomaly score at time step t of the i-th test window, and $\hat{x}_{i,t}$ and $x_{i,t}$ are the reconstructed data and the original data at time step t in the i-th window, respectively.
Exemplary electronic device
Fig. 7 shows the structure of an electronic device provided in an exemplary embodiment of the present invention. As shown in Fig. 7, the electronic device 70 includes one or more processors 71 and a memory 72.
The processor 71 may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device to perform desired functions.
Memory 72 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory (cache), and the like. The non-volatile memory may include, for example, Read-Only Memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and can be executed by the processor 71 to implement the methods of the various embodiments of the present invention described above and/or other desired functions. In one example, the electronic device may further include: an input device 73 and an output device 74, which are interconnected by a bus system and/or other forms of connection mechanisms (not shown).
In addition, the input device 73 may also include, for example, a keyboard, a mouse, and the like.
The output device 74 can output various information to the outside. The output device 74 may include, for example, a display, speakers, a printer, and a communication network and remote output devices connected thereto, among others.
Of course, only some of the components of the electronic device relevant to the present invention are shown in fig. 7 for simplicity, components such as buses, input/output interfaces, etc. being omitted. In addition, the electronic device may include any other suitable components depending on the particular application.
Exemplary computer program product and computer-readable storage medium
In addition to the methods and apparatus described above, embodiments of the invention may also be a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform steps in a method according to various embodiments of the invention described in the "exemplary methods" section of this specification.
The computer program product may write program code for performing operations of embodiments of the present invention in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the invention may also be a computer-readable storage medium, having stored thereon computer program instructions, which when executed by a processor, cause the processor to perform steps in a method according to various embodiments of the invention described in the "exemplary method" section of the description above.
The computer-readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The basic principles of the present invention have been described above in connection with specific embodiments, however, it should be noted that the advantages, benefits, effects, etc. mentioned in the present invention are merely examples and not intended to be limiting, and these advantages, benefits, effects, etc. are not to be considered as essential to the various embodiments of the present invention. Furthermore, the specific details disclosed herein are for purposes of illustration and understanding only, and are not intended to be limiting, as the invention is not necessarily limited to practice with the above described specific details.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different manner from other embodiments, so that the same or similar parts between the embodiments are mutually referred to. For system embodiments, the description is relatively simple as it essentially corresponds to method embodiments, and reference should be made to the description of method embodiments for relevant points.
The block diagrams of the devices, systems and apparatus according to the present invention are merely illustrative examples and are not intended to require or imply that the connections, arrangements and configurations must be made in the manner shown in the block diagrams. As will be appreciated by one of skill in the art, these devices, systems and apparatus may be connected, arranged and configured in any manner. Words such as "including", "comprising", "having" and the like are open words meaning "including but not limited to" and may be used interchangeably therewith. The terms "or" and "and" as used herein refer to, and are used interchangeably with, the term "and/or" unless the context clearly indicates otherwise. The term "such as" as used herein refers to, and is used interchangeably with, the phrase "such as, but not limited to".
The method and system of the present invention may be implemented in a number of ways. For example, the methods and systems of the present invention may be implemented by software, hardware, firmware, or any combination of software, hardware, firmware. The above-described sequence of steps for the method is for illustration only, and the steps of the method of the present invention are not limited to the sequence specifically described above unless specifically stated otherwise. Furthermore, in some embodiments, the present invention may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present invention. Thus, the present invention also covers a recording medium storing a program for executing the method according to the present invention.
It is also noted that in the systems, devices and methods of the present invention, components or steps may be disassembled and/or assembled. Such decomposition and/or recombination should be considered as equivalent aspects of the present invention. The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the invention. Thus, the present invention is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the invention to the form disclosed herein. Although a number of example aspects and embodiments have been discussed above, a person of ordinary skill in the art will recognize certain variations, modifications, alterations, additions, and subcombinations thereof.

Claims (10)

1. An electric energy meter abnormality detection method based on a contrast self-encoder, characterized by comprising the following steps:
acquiring multi-variable long-time sequence data of historical detection of the electric energy meter to be detected;
normalizing the multivariable long-time sequence data, and dividing a plurality of time window data with preset window length;
inputting a plurality of time window data into a pre-trained anomaly detection model, and outputting reconstruction data corresponding to each time window data, wherein a contrast self-encoder is adopted in the anomaly detection model;
determining the abnormality score of each time point of the time window data according to the reconstruction data of each time window data and the time window data, and determining the abnormality degree of each time point according to the abnormality score.
2. The method of claim 1, wherein the multivariate long-time series data comprises: phase A current, phase B current, phase C current, phase A voltage, phase B voltage, phase C voltage, forward active power indication, reverse active power indication, forward reactive power indication, reverse reactive power indication, phase A active power, phase B active power, phase C active power, active power total, phase A reactive power, phase B reactive power, phase C reactive power, reactive power total, phase A power factor, phase B power factor, phase C power factor, and power factor total.
3. The method of claim 1, further comprising, after obtaining the multivariate long time series data of the historical test of the electrical energy meter under test:
normalizing all values of each variable in the multivariate long time series data to a standard normal distribution using Z-Score normalization, wherein
The formula of the Z-Score standardization is as follows:
$$\hat{X}_i = \frac{X_i - \mu}{\sigma}$$
wherein $X_i$ is the multivariate long time-series data, $\hat{X}_i$ represents the normalized $X_i$, $\mu$ represents the vector of the mean values of all sample data for each variable in $X_i$, and $\sigma$ represents the vector of the standard deviations of all sample data for each variable in $X_i$.
4. The method of claim 1, wherein the training process of the anomaly detection model is as follows:
acquiring multivariate time series data samples of the historical detection of a plurality of electric energy meters, and combining the multivariate time series data samples into one multivariate long time series data sample;
normalizing all values of each variable in the multivariate long-time series data sample to a standard normal distribution using Z-Score;
windowing the standardized multivariable long-time sequence data samples, and dividing the standardized multivariable long-time sequence data samples into a plurality of time window data samples of the preset window;
inputting a plurality of time window data samples to a projection layer, then applying a time stamp mask to carry out data enhancement, and determining positive and negative sample pairs;
Inputting the positive and negative sample pairs to an encoder for feature extraction, and determining positive and negative sample pair features;
and performing model training by using the positive and negative sample pair characteristics, the encoder, the discriminator and the decoder to determine the abnormality detection model.
5. The method of claim 4, wherein inputting a plurality of the time window data samples to the projection layer and then applying a timestamp mask for data enhancement to determine positive and negative sample pairs comprises:
inputting the time window data samples to the projection layer to obtain a projection representation of each sample;
acquiring positive and negative samples of each time window data sample using a multi-scale timestamp mask and random sampling;
taking the plurality of enhanced samples of one sample, obtained with different mask probabilities in the timestamp masking, as positive samples, and determining positive sample pairs;
and respectively extracting a predetermined number of samples from other original time window data samples and each type of enhancement samples to serve as negative samples of the original time window data, and determining a negative sample pair of each sample.
6. The method of claim 4, wherein inputting the positive and negative sample pairs to an encoder for feature extraction and determining the positive and negative sample pair features is represented by:
z=E(a)
wherein E(·) represents the encoder, a is the projected representation, or an enhanced sample thereof, obtained after the original time-series window data is input into the projection layer, z is the hidden vector of a, i.e., the feature of the original sample or of the positive and negative samples, $z \in \mathbb{R}^h$, and h is the length of the hidden vector.
7. The method of claim 4, wherein model training using the positive and negative sample pairs features, the encoder, discriminator, and decoder to determine the anomaly detection model comprises:
performing feature recombination on the positive and negative sample pair features to determine positive and negative sample pair combination features;
inputting the positive and negative sample pair characteristics and the positive and negative sample pair combined characteristics to the discriminator to obtain a prediction result and discriminator loss;
updating the discriminator to minimize the discriminator loss, and acquiring a prediction result again;
inputting the prediction result to the encoder, and determining an encoding result and an encoding loss;
updating the encoder, minimizing the encoding loss, and acquiring an encoding result again;
reconstructing the positive and negative sample pair characteristics according to the encoder and the decoder, and determining reconstruction loss;
updating the encoder and the decoder, minimizing the reconstruction loss, and determining the anomaly detection model.
8. The method according to claim 7, wherein the process of feature recombination of the positive and negative sample pair features is:
$$z_c^{+} = \alpha z + \beta z^{+}, \qquad z_c^{-} = \alpha z + \beta z^{-}, \qquad \beta = 1 - \alpha$$
where α and β are the combination parameters that determine the contribution of each old feature in the blended feature, z is the hidden vector representation of a training sample, $z^{+}$ and $z^{-}$ are the hidden vector representations of one of its positive and one of its negative samples, respectively, and $z_c^{+}$ and $z_c^{-}$ are the positive and negative combined features, respectively;
the loss function $L_d$ of the discriminator is:
where N is the number of training windows, $p_{ij}$ and $p_{ik}$ represent the decomposition results of the discriminator on the j-th positive mixed feature and the k-th negative mixed feature of the i-th sample, respectively, $l_d^{+}$ and $l^{-}$ are the real labels, where $l_d^{+} = (1, \alpha)$ and $l^{-} = (0, \beta)$, m is the number of positive samples of one sample, n + n×m is the number of negative samples, and α and β are the combination parameters that determine the contribution of each old feature in the blended feature, with β = 1 − α;
the loss function $L_E$ of the encoder is:
where $p_{ij}$ and $p_{ik}$ represent the decomposition results of the discriminator on the j-th positive mixed feature and the k-th negative mixed feature of the i-th sample, respectively, $l_E^{+}$ and $l^{-}$ are the real labels, where $l_E^{+} = (1, 1)$ and $l^{-} = (0, \beta)$, m is the number of positive samples of one sample, and n + n×m is the number of negative samples;
the reconstruction loss function $L_{recon}$ is:
$$L_{recon} = \frac{1}{Nw}\sum_{i=1}^{N}\sum_{t=1}^{w}\left\lVert \hat{x}_{i,t} - x_{i,t}\right\rVert_2^{2}$$
where N is the number of training windows, w is the window size, and $\hat{x}_{i,t}$ and $x_{i,t}$ are the reconstructed data and the original data at time step t in the i-th window, respectively.
9. The method of claim 1, wherein the anomaly score is calculated by the formula:
$$S_{i,t} = \left\lVert \hat{x}_{i,t} - x_{i,t}\right\rVert_2^{2}$$
wherein $S_{i,t}$ is the anomaly score at time step t of the i-th test data, and $\hat{x}_{i,t}$ and $x_{i,t}$ are the reconstructed data and the original data at time step t in the i-th window, respectively.
10. An electric energy meter abnormality detection device based on a contrast self-encoder, which is characterized by comprising:
the acquisition module is used for acquiring multi-variable long-time sequence data of the history detection of the electric energy meter to be detected;
the dividing module is used for carrying out normalization processing on the multivariable long-time sequence data and dividing a plurality of time window data with preset window length;
the output module is used for inputting a plurality of time window data into a pre-trained anomaly detection model and outputting reconstruction data corresponding to each time window data, wherein an adversarial contrastive self-encoder is adopted in the anomaly detection model;
the determining module is used for determining the abnormality score of each time point of the time window data according to the reconstruction data of each time window data and the time window data, and determining the abnormality degree of each time point according to the abnormality score.
CN202310990073.9A 2023-08-08 2023-08-08 Electric energy meter abnormality detection method and device based on contrast self-encoder Pending CN117092582A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310990073.9A CN117092582A (en) 2023-08-08 2023-08-08 Electric energy meter abnormality detection method and device based on contrast self-encoder

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310990073.9A CN117092582A (en) 2023-08-08 2023-08-08 Electric energy meter abnormality detection method and device based on contrast self-encoder

Publications (1)

Publication Number Publication Date
CN117092582A true CN117092582A (en) 2023-11-21

Family

ID=88769262

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310990073.9A Pending CN117092582A (en) 2023-08-08 2023-08-08 Electric energy meter abnormality detection method and device based on contrast self-encoder

Country Status (1)

Country Link
CN (1) CN117092582A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117272055A (en) * 2023-11-23 2023-12-22 国网山西省电力公司营销服务中心 Electric energy meter abnormality detection method and device based on filtering enhancement self-encoder
CN117272055B (en) * 2023-11-23 2024-02-06 国网山西省电力公司营销服务中心 Electric energy meter abnormality detection method and device based on filtering enhancement self-encoder

Similar Documents

Publication Publication Date Title
Wang et al. A convolutional Transformer-based truncated Gaussian density network with data denoising for wind speed forecasting
CN112926802B (en) Time sequence data countermeasure sample generation method, system, electronic device and storage medium
Xu et al. HRST-LR: a hessian regularization spatio-temporal low rank algorithm for traffic data imputation
Fu et al. Vibration tendency prediction approach for hydropower generator fused with multiscale dominant ingredient chaotic analysis, adaptive mutation grey wolf optimizer, and KELM
Liu et al. Bearing feature extraction using multi-structure locally linear embedding
CN117092582A (en) Electric energy meter abnormality detection method and device based on contrast self-encoder
Pereira et al. TS-stream: clustering time series on data streams
CN115983087A (en) Method for detecting time sequence data abnormity by combining attention mechanism and LSTM and terminal
Li et al. A wind speed forecasting model based on multi-objective algorithm and interpretability learning
CN115587335A (en) Training method of abnormal value detection model, abnormal value detection method and system
Flora et al. Comparing explanation methods for traditional machine learning models part 1: an overview of current methods and quantifying their disagreement
Satapathy et al. A comparative analysis of multidimensional COVID-19 poverty determinants: An observational machine learning approach
Vinisha et al. Study on missing values and outlier detection in concurrence with data quality enhancement for efficient data processing
Damaševičius et al. Decomposition aided attention-based recurrent neural networks for multistep ahead time-series forecasting of renewable power generation
Tyass et al. Wind speed prediction based on statistical and deep learning models
Liu et al. Unsupervised deep anomaly detection for industrial multivariate time series data
CN117092581A (en) Segment consistency-based method and device for detecting abnormity of electric energy meter of self-encoder
Ishkov et al. Energy theft detection in smart grids via explainable attention maps
Wang et al. MIANet: Multi-level temporal information aggregation in mixed-periodicity time series forecasting tasks
Pendse et al. Toward data-driven, semi-automatic inference of phenomenological physical models: Application to eastern sahel rainfall
Edali et al. Classification of generic system dynamics model outputs via supervised time series pattern discovery
Wan et al. Hydrological time series anomaly mining based on symbolization and distance measure
Katser et al. Data pre-processing methods for NPP equipment diagnostics algorithms: an overview
Wu et al. COPP-Miner: Top-k contrast order-preserving pattern mining for time series classification
CN117272055B (en) Electric energy meter abnormality detection method and device based on filtering enhancement self-encoder

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination