CN117851920A - Power Internet of things data anomaly detection method and system - Google Patents

Publication number
CN117851920A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410256677.5A
Other languages
Chinese (zh)
Inventor
孙岗
赵鹏
严莉
曲延盛
常英贤
呼海林
王高洲
杨坤
牛德玲
邵志敏
樊静雨
胡恒瑞
管荑
梁天
王中龙
朱尤祥
肖沈阳
周洁
孟祥鹿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Shandong Electric Power Co Ltd
Information and Telecommunication Branch of State Grid Shandong Electric Power Co Ltd
Original Assignee
State Grid Shandong Electric Power Co Ltd
Information and Telecommunication Branch of State Grid Shandong Electric Power Co Ltd
Application filed by State Grid Shandong Electric Power Co Ltd, Information and Telecommunication Branch of State Grid Shandong Electric Power Co Ltd

Abstract

The invention relates to the technical field of power data anomaly detection, and in particular to a method and system for detecting anomalies in power Internet of Things data. Raw power data are decomposed with a stacked discrete wavelet transform, and the decomposed data are fed into a spatio-temporal network model that mines temporal features and the complex correlations between sequences at the same time. During training, data slices are used as input to train the anomaly detection model. At detection time, the data to be checked are preprocessed and fed into the anomaly detection model, an anomaly score is computed from the model output and the real data, and a data point whose score exceeds a threshold is judged abnormal. Combining the discrete wavelet transform, the spatio-temporal network and a variational autoencoder gives a better representation of time series data and thereby improves the accuracy of anomaly identification.

Description

Power Internet of things data anomaly detection method and system
Technical Field
The invention relates to the technical field of power data anomaly detection, in particular to a power internet of things data anomaly detection method and a system.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
In power systems, the volume of cross-domain, cross-system power Internet of Things data keeps growing. These multi-source data are strongly time-dependent and intricately correlated. They provide important information for operating and managing the power system, but anomalies in them must be detected to keep the whole power Internet of Things system running stably. Time-series anomaly detection methods can be classified as supervised, semi-supervised or unsupervised according to whether labels are available. Because labels are expensive to obtain for supervised methods, most detection work focuses on unsupervised methods that need no labels.
From a methodological perspective, traditional anomaly detection methods struggle with the characteristics of cross-domain, cross-system power Internet of Things data. They tend to model and analyze data from a single system or a specific domain and ignore the correlations and temporal characteristics among the data, which leads to unsatisfactory detection results.
Disclosure of Invention
To solve the technical problems described in the background, the invention provides a method and system for detecting anomalies in power Internet of Things data that handle cross-domain, cross-system data better, mine temporal features and complex correlations, identify and analyze abnormal data efficiently, and safeguard the reliability and stability of the power system.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
A first aspect of the invention provides a method for detecting anomalies in power Internet of Things data, comprising the following steps:
acquiring and preprocessing power time series data, and decomposing the data to obtain frequency components together with the corresponding number of channels, number of sequences and prediction time window length, which form a data set;
inputting the data set into a trained anomaly detection model, which separately captures the long-range temporal dependence of each time series and the feature relationships among multiple sequences, and reconstructs the time series data to obtain predicted values;
determining an anomaly score from the predicted values using a multivariate Gaussian distribution, with the Mahalanobis distance as the measure of the anomaly score, and treating the corresponding data point as an anomaly when its anomaly score exceeds a threshold.
Further, the preprocessing comprises acquiring N power time series and cleaning out part of the noisy data, the result serving as the training input.
Further, the frequency components are obtained by a stacked discrete wavelet transform and a data set of shape (C, T, N) is constructed, where C is the number of sequences produced by the discrete wavelet decomposition (corresponding to the number of channels), T is the prediction time window length, and N is the number of time series.
The decomposition is defined in terms of the original multi-sequence input x, the discrete wavelet basis function w, the coefficients obtained by the stacked discrete wavelet transform, and the value of the high-pass filter at each time index.
Further, the anomaly detection model comprises a feature processing layer, spatio-temporal network modules and a prediction layer. The feature processing layer comprises a preprocessing layer and a convolutional network. There are several groups of spatio-temporal network modules, each formed by a two-layer graph convolution network connected in parallel with a Transformer model. The prediction layer performs reconstruction through a convolution layer and a variational autoencoder: the convolution layer and a linear layer each reduce the feature dimension, the variational autoencoder reconstructs the sequence and time dimensions, and the prediction result is output.
Further, the spatio-temporal network module comprises a spatio-temporal position embedding layer, a graph convolution layer and a Transformer, and reconstructs the time series data by encoding both the temporal dependence within each time series and the correlations between pairs of different time series.
Further, the spatio-temporal position embedding layer injects the temporal length into the input sequence, and a position vector is constructed from a set function.
Further, the graph convolution layer takes the adjacency matrix and the sequence input and constructs a filter in the Fourier domain; the filter captures the spatial features between nodes through their first-order neighborhoods, and a graph convolution model is built by stacking several convolution layers.
Further, multi-head attention is applied separately to the multivariate time series data modeled along the time dimension and along the feature dimension, yielding the relationships among the hidden temporal information.
Further, the Mahalanobis distance is used as the measure of the anomaly score, as shown in the following formula:
$$D_M(x) = \sqrt{(x - \mu)^{\top}\,\Sigma^{-1}\,(x - \mu)}$$
where $x$ is the feature vector of a data point, $\mu$ is the mean vector over the nodes, $\Sigma$ is the covariance matrix of the node vectors, and $D_M(x)$ is the Mahalanobis distance from the data point to the mean vector $\mu$.
A second aspect of the invention provides a system for detecting anomalies in power Internet of Things data, comprising:
a data acquisition module configured to: acquire and preprocess power time series data, and decompose the data to obtain frequency components together with the corresponding number of channels, number of sequences and prediction time window length, which form a data set;
a feature mining module configured to: input the data set into a trained anomaly detection model, which separately captures the long-range temporal dependence of each time series and the feature relationships among multiple sequences, and reconstructs the time series data to obtain predicted values;
an anomaly detection module configured to: determine an anomaly score from the predicted values using a multivariate Gaussian distribution, with the Mahalanobis distance as the measure of the anomaly score, and treat the corresponding data point as an anomaly when its anomaly score exceeds a threshold.
A third aspect of the invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above method for detecting anomalies in power Internet of Things data.
A fourth aspect of the invention provides a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the above method for detecting anomalies in power Internet of Things data when executing the program.
Compared with the prior art, the above technical scheme has the following beneficial effects:
The method works with an anomaly detection model built on a spatio-temporal graph network to capture anomaly patterns across multiple sequences, and it fuses the prediction task with the reconstruction task to represent time series data better. This lowers the false alarm rate and the safety risk, handles cross-domain and cross-system data better, identifies and analyzes abnormal data efficiently by mining temporal features and complex correlations, and safeguards the reliability and stability of the power system.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
FIG. 1 is a schematic diagram of an anomaly detection flow provided by one or more embodiments of the present invention;
FIG. 2 is a schematic diagram of an anomaly detection model provided by one or more embodiments of the present invention;
FIG. 3 is a schematic representation of the original subsequence in the stacked discrete wavelet decomposition effect provided by one or more embodiments of the present invention;
FIG. 4 is a wavelet sub-sequence diagram of Level1 in a stacked discrete wavelet decomposition effect provided by one or more embodiments of the present invention;
FIG. 5 is a wavelet sub-sequence diagram of Level2 in a stacked discrete wavelet decomposition effect provided by one or more embodiments of the present invention;
FIG. 6 is a wavelet sub-sequence diagram of Level3 in a stacked discrete wavelet decomposition effect provided by one or more embodiments of the present invention;
FIG. 7 is a wavelet sub-sequence diagram of Level4 in a stacked discrete wavelet decomposition effect provided by one or more embodiments of the present invention;
FIG. 8 is a graph illustrating anomaly detection results provided by one or more embodiments of the present invention.
Detailed Description
The invention will be further described with reference to the drawings and examples.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
As described in the background, conventional anomaly detection methods often struggle with the characteristics of cross-domain, cross-system power Internet of Things data. They tend to model and analyze data from a single system or a specific domain and ignore the correlations and temporal characteristics among the data, so the detection results are unsatisfactory.
The following embodiments therefore provide a method and system for detecting anomalies in power Internet of Things data. Raw power data are decomposed with a stacked discrete wavelet transform and the decomposed data are fed into a spatio-temporal network model. The model consists of a feature processing layer, spatio-temporal network modules and a prediction layer. The feature processing layer uses different convolution layers to extract local and global feature information. The spatio-temporal network module combines a two-layer graph convolution network with a Transformer model and uses residual connections, so that temporal features and the complex correlations between sequences are mined at the same time. The prediction layer unifies the prediction task and the reconstruction task: it reduces the feature dimension through a convolution layer and then reconstructs the sequence and time dimensions through a variational autoencoder. During training, data slices are used as input to train the anomaly detection model. At detection time, the data to be checked are preprocessed and fed into the anomaly detection model, an anomaly score is computed from the model output and the real data, and a data point whose score exceeds a threshold is judged abnormal. Combining the discrete wavelet transform, the spatio-temporal network and the variational autoencoder gives a better representation of time series data and thereby improves the accuracy of anomaly identification.
Embodiment one:
As shown in Figs. 1 to 8, the method for detecting anomalies in power Internet of Things data comprises the following steps:
1) Data preprocessing: acquire N power time series and preprocess them by removing unimportant features and cleaning out part of the noisy data; the preprocessed result is the input for subsequent model training. The original sequences are decomposed by the stacked discrete wavelet method, each sequence producing C-1 detail coefficient sequences (higher frequencies) and 1 trend sequence, which yields a data set X_d of dimension C × N.
In this embodiment the power time series are distributed photovoltaic power data, which belong to the data collected by the Internet of Things in the power grid and cover the power information of N photovoltaic stations. Because anomalies in photovoltaic data can be affected by many factors, including but not limited to climate and equipment status, this embodiment selects this complex and diverse data source for anomaly detection. The aim is to show that good anomaly detection in the multi-source, complex setting of photovoltaic power data indicates that the method is applicable to a wider range of joint power detection scenarios.
2) Model design: an adjacency matrix is constructed from correlation coefficients and prior experience, and a multivariate time series anomaly detection model is designed on the basis of a variational autoencoder and a spatio-temporal graph network. The model comprises a feature processing layer, spatio-temporal network modules and a prediction layer. The feature processing layer comprises a preprocessing layer and a convolutional network. A single spatio-temporal network module is a two-layer GCN (graph convolutional network) connected in parallel with a Transformer model. The prediction layer performs reconstruction through a convolution layer and a variational autoencoder (VAE): the convolution layer and a linear layer each reduce the feature dimension, the VAE reconstructs the sequence and time dimensions, and the prediction result is output.
3) Model training: after preprocessing, the time series of the N sensors are cut into small samples using a prediction time window of length T1 and an output time dimension T2. With B samples per training batch, a single input has dimension (B, C, T1, N). The anomaly detection model is trained with the normalized original sequence as the target output, so the corresponding output has dimension (B, T2, N).
4) Anomaly scoring: after the model is trained, the data to be checked are preprocessed and fed into the anomaly detection model, which outputs the predictions for the samples; the output data and the real data are then fed into the anomaly scoring model to obtain anomaly scores. An anomaly detection threshold is set from the evaluation of the model on the training set. If the anomaly score exceeds the threshold, the point is judged abnormal; otherwise it is considered normal.
Fig. 1 shows the overall flow of the anomaly detection task. In the preprocessing stage, a discrete wavelet basis function w, with its high-pass filter and low-pass filter, is used to obtain frequency components through the stacked discrete wavelet transform (MODET), and three-dimensional data of shape (C, T, N) are constructed, where C is the number of sequences produced by the discrete wavelet decomposition (corresponding to the number of channels), T is the prediction time window length, and N is the number of time series.
The decomposition is defined in terms of the original multi-sequence input x, the discrete wavelet basis function w, the coefficients obtained by the stacked discrete wavelet transform, and the value of the high-pass filter at each time index.
In model design, the graph structure is a directed graph G in which each time series is a node. Depending on the prior information available, each sensor i can be given a set of candidate relations C_i, that is, the sensors it may depend on:
$$C_i \subseteq \{1, 2, \dots, N\} \setminus \{i\}$$
If there is no prior information, the candidate set of sensor i is all sensors except itself. To select the dependencies of sensor i among these candidate nodes, the similarity between the embedding vector of node i and the embeddings of its candidates j ∈ C_i is computed:
$$e_{ji} = \frac{v_i^{\top} v_j}{\lVert v_i \rVert \, \lVert v_j \rVert}, \quad j \in C_i, \qquad A_{ji} = \mathbb{1}\{\, j \in \mathrm{TopK}(\{ e_{ki} : k \in C_i \}) \,\}$$
That is, the normalized dot product between the embedding vector $v_i$ of sensor i and the embedding of each candidate j ∈ C_i is computed, and the k largest normalized dot products are selected. Here TopK denotes the indices of the k largest values among its inputs (the normalized dot products).
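The neighbor selection described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the embodiment's implementation: the function names `top_k_neighbors` and `adjacency_from_neighbors`, the toy embeddings, and the convention that `adj[j, i] = 1` marks candidate j as a selected dependency of sensor i are all chosen here for illustration. In the real model the embeddings would be learned.

```python
import numpy as np

def top_k_neighbors(embeddings, k):
    """For each sensor i, return the indices of its k most similar candidates.

    Assumes no prior information, so every sensor except i is a candidate.
    """
    n = embeddings.shape[0]
    # Normalize rows so that a dot product equals the normalized dot product.
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = unit @ unit.T
    # Exclude each sensor from its own candidate set.
    sim = np.where(np.eye(n, dtype=bool), -np.inf, sim)
    # argsort is ascending: keep the last k columns, then reverse them.
    return np.argsort(sim, axis=1)[:, -k:][:, ::-1]

def adjacency_from_neighbors(neighbors, n):
    """Directed adjacency: adj[j, i] = 1 when j is among the top-k of sensor i."""
    adj = np.zeros((n, n), dtype=int)
    targets = np.repeat(np.arange(n), neighbors.shape[1])
    adj[neighbors.ravel(), targets] = 1
    return adj

# Toy embeddings (hypothetical): sensors 0 and 1 are alike, as are 2 and 3.
emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
nbrs = top_k_neighbors(emb, k=1)
adj = adjacency_from_neighbors(nbrs, n=4)
```

With prior information, the mask that excludes only the diagonal would be replaced by a mask encoding each candidate set C_i.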
The anomaly detection model is shown in Fig. 2. The network consists of a feature extraction layer, a spatio-temporal network and a prediction layer. First, a convolutional layer (CNN) receives the training data and mines the feature dimension. Next, the spatio-temporal network module (ST-block), which combines a graph convolution layer (GCN) with a Transformer model, encodes both the temporal dependence within each time series and the correlations between pairs of different time series. Finally, a convolution layer and a VAE are attached as the prediction layer, giving a prediction-based anomaly detection model.
The overall prediction task can be expressed as a mapping F that takes the graph structure G and an input window of T time steps and outputs predictions for S time steps, where G is the graph structure, T is the dimension of the time window and S is the dimension of the prediction output.
Taking into account both the prediction quality and the quality of the VAE reconstruction, the training loss combines two terms: the squared error between the output value and the actual value, and the KL divergence measuring the difference between the learned latent variable distribution and the standard normal distribution. A weight balances the MSE loss against the KL divergence loss.
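A minimal sketch of such a combined loss follows. It assumes a diagonal Gaussian latent distribution parameterized by a mean and a log-variance, and a simple additive weighting `mse + beta * kl`; the name `beta`, its default value and the additive form are assumptions made here, since the source only states that a weight balances the two terms.

```python
import numpy as np

def mse_loss(pred, target):
    """Mean squared error between model output and actual values."""
    return float(np.mean((pred - target) ** 2))

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, exp(log_var)) || N(0, I) ), summed over latent dims, averaged over the batch."""
    kl = -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=-1)
    return float(np.mean(kl))

def training_loss(pred, target, mu, log_var, beta=0.1):
    # beta is a hypothetical balancing weight, not a value from the source.
    return mse_loss(pred, target) + beta * kl_to_standard_normal(mu, log_var)

pred = np.array([1.0, 2.0])
target = np.array([1.0, 4.0])
mu = np.array([[1.0, 0.0]])       # one sample, two latent dimensions
log_var = np.zeros((1, 2))
loss = training_loss(pred, target, mu, log_var, beta=0.1)
```

When the latent distribution already equals the standard normal (zero mean, zero log-variance), the KL term vanishes and only the prediction error remains.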
In the anomaly scoring stage, the N-dimensional sensor operating-state sequence of length T is fed into the anomaly detection model to obtain predictions, which are compared with the test set data. The score is computed from the squared error between the predicted value and the actual value. Once the error has been calculated, it must be decided whether the current point is abnormal. A 95% confidence level is used for the threshold, and anomalies are detected against the corresponding quantile of the mean reconstruction error on the training set, which supports high accuracy and reliability.
Specifically:
Step 1: Acquire and preprocess the raw data. To keep extreme values from distorting the model, stabilize training and speed up learning, the set of series, written X = {x_1, x_2, ..., x_n}, is normalized series by series to a common scale (stated in this embodiment as [0, 1]):
$$\hat{x}_i = \frac{x_i - \mu_i}{\sigma_i}$$
where $\hat{x}_i$ is the normalized result of time series $x_i$, $x_i$ is the i-th time series, $\mu_i$ is the mean of $x_i$ and $\sigma_i$ is the standard deviation of $x_i$.
The normalized data are passed through the stacked discrete wavelet transform (MODET) to obtain frequency components, and three-dimensional data of shape (C, T, N) are constructed, where C is the number of sequences produced by the discrete wavelet decomposition (corresponding to the number of channels), T is the prediction time window length, and N is the number of time series.
The decomposition is defined in terms of the original multi-sequence input x, the discrete wavelet basis function w, the coefficients obtained by the stacked discrete wavelet transform, and the value of the high-pass filter at each time index. The decomposition result is shown in Figs. 3 to 7, with time (seconds, s) on the abscissa and amplitude on the ordinate.
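The preprocessing in Step 1 can be sketched as follows. This is an illustration under assumptions, not the embodiment's code: it standardizes each series with its own mean and standard deviation, and it uses an undecimated Haar filter pair (differences and averages at lags 1, 2, 4, ...) as a stand-in for the stacked discrete wavelet transform, because the source does not name the wavelet basis. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def zscore(series):
    """Standardize each column (one sensor per column) with its own mean and std."""
    mean = series.mean(axis=0, keepdims=True)
    std = series.std(axis=0, keepdims=True)
    return (series - mean) / np.where(std == 0, 1.0, std)

def stacked_haar_decompose(series, levels):
    """Undecimated Haar decomposition of an (L, N) array.

    Returns an array of shape (levels + 1, L, N): the first `levels`
    channels are detail sequences, the last channel is the trend.
    Every channel keeps the full length L of the input.
    """
    approx = np.asarray(series, dtype=float)
    channels = []
    for j in range(levels):
        shifted = np.roll(approx, 2 ** j, axis=0)   # circular lag of 2**j steps
        channels.append((approx - shifted) / 2.0)   # high-pass output (detail)
        approx = (approx + shifted) / 2.0           # low-pass output (trend)
    channels.append(approx)
    return np.stack(channels)

rng = np.random.default_rng(0)
raw = rng.normal(size=(64, 3))                  # synthetic: 64 time steps, N = 3 series
x = zscore(raw)
coeffs = stacked_haar_decompose(x, levels=4)    # C = 5 channels
```

Because each level splits the current trend into a detail part and a smoother trend, summing all C channels recovers the normalized input exactly, which is a convenient sanity check for any replacement wavelet.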
Step 2: the anomaly detection model is shown in fig. 2, and the graph network model is processed by adopting two convolution CNN layers. The first layer uses a smaller convolution kernel (k1×k1) to extract local feature information and amplify the channel number, the second layer uses a larger convolution kernel (k2×k2) to capture more global feature information and enhance the channel number, and multi-layer feature dimension mining of input data is realized, wherein the form is expressed as:
the model structure diagram of the ST-block of the spatio-temporal network module is shown in fig. 2. After passing through the feature processing layer, the method enters a space-time network module. In this embodiment, the network module is composed of a spatio-temporal position embedding layer (locationEmbedding layer), a graph roll layer (GraphConvolutionalNetwork, GCN) and a transducer, the two layers of GCN networks are connected in parallel with the transducer model, and the long-distance time dependence of the time sequence and the characteristic relation between the multiple sequences are respectively captured, so that the time sequence data is remodelled.
Spatio-temporal position embedding layer (position embedding layer): because the Transformer uses a fully connected feed-forward structure, it cannot by itself capture the spatial and temporal order of the observations. Position embedding is therefore applied first to inject the temporal length into the input sequence. For each position t, a corresponding position vector p is constructed with trigonometric functions, and p is added to the input sequence to obtain an input sequence that carries position information.
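One common way to build such trigonometric position vectors is the sinusoidal encoding used in the original Transformer. The sketch below assumes that form, including the base 10000 and the sine and cosine interleaving; the source only states that trigonometric functions are used, so the exact formula in the embodiment may differ. The function name is hypothetical.

```python
import numpy as np

def sinusoidal_positions(length, dim):
    """Return a (length, dim) array of position vectors.

    Even feature indices use sine, odd indices use cosine, with
    wavelengths growing geometrically across the feature dimension.
    """
    positions = np.arange(length)[:, None]
    rates = 1.0 / (10000 ** (2 * (np.arange(dim)[None, :] // 2) / dim))
    angles = positions * rates
    pe = np.zeros((length, dim))
    pe[:, 0::2] = np.sin(angles[:, 0::2])
    pe[:, 1::2] = np.cos(angles[:, 1::2])
    return pe

window = np.zeros((6, 8))                       # T1 = 6 time steps, 8 features (toy values)
with_position = window + sinusoidal_positions(6, 8)
```

Adding the position vectors leaves the shape of the input unchanged, so the rest of the block needs no modification.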
Graph convolution layer (GCN layer): the layer takes the adjacency matrix A and the sequence input X, and the GCN model constructs a filter in the Fourier domain. The filter acts on the nodes of the graph and captures the spatial features between nodes through their first-order neighborhoods; a GCN model is then built by stacking several convolution layers, expressed as:
$$H^{(l+1)} = \sigma\!\left(\tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}\,H^{(l)}\,\theta^{(l)}\right)$$
where $\tilde{A} = A + I$ is the adjacency matrix with added self-connections, $I$ is the identity matrix, $\tilde{D}$ is the degree matrix with $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$, $H^{(l)}$ is the output of layer $l$, $\theta^{(l)}$ contains the parameters of that layer, and $\sigma(\cdot)$ is the sigmoid activation function of the nonlinear model. In general, a single-layer graph convolution network has difficulty capturing the dependencies between features adequately.
The model is therefore given a two-layer GCN for deeper mining, which can be expressed as:
$$f(X, A) = \sigma\!\left(\hat{A}\,\mathrm{ReLU}\!\left(\hat{A}\,X\,W_0\right) W_1\right), \qquad \hat{A} = \tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}$$
where the first layer uses the ReLU function as its activation to filter the input information, and $W_0$ and $W_1$ are the weight matrices of the two layers.
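The forward pass of a two-layer graph convolution can be sketched with NumPy as follows. The sketch assumes the symmetric normalization with self-connections that is standard for GCNs, ReLU after the first layer and sigmoid after the second; the weights are random placeholders rather than trained parameters, and the function names are hypothetical.

```python
import numpy as np

def normalized_adjacency(adj):
    """Add self-connections and apply D^{-1/2} (A + I) D^{-1/2}."""
    a_tilde = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_tilde.sum(axis=1))
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def two_layer_gcn(x, adj, w0, w1):
    """x: (nodes, features); adj: (nodes, nodes); returns (nodes, out_features)."""
    a_hat = normalized_adjacency(adj)
    hidden = np.maximum(a_hat @ x @ w0, 0.0)    # first layer with ReLU
    return sigmoid(a_hat @ hidden @ w1)         # second layer with sigmoid

rng = np.random.default_rng(0)
adj = np.array([[0.0, 1.0], [1.0, 0.0]])        # two connected nodes (toy graph)
x = rng.normal(size=(2, 3))
out = two_layer_gcn(x, adj, rng.normal(size=(3, 4)), rng.normal(size=(4, 2)))
```

Each layer mixes a node's features with those of its first-order neighbors, so two layers let information travel two hops across the graph.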
The GCN addresses the time series problem by mining spatial relationships through graph convolution, and it captures only smooth spatial dependencies. The Transformer model treats time as the main axis and handles the temporal dependence of the multivariate time series data. It consists of a multi-head attention mechanism and a feed-forward network; multi-head attention is applied separately to the multivariate time series data modeled along the time dimension and along the feature dimension, yielding the relationships among the hidden temporal information. The three matrices MQ, MK and MV are denoted the query matrix Q, the key matrix K and the value matrix V. The self-attention calculation is:
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V$$
where $\mathrm{softmax}(\cdot)$ is the Softmax function, which maps the weights into [0, 1] to normalize the spatial correlation, $d_k$ is the key dimension, and the scaling factor $\sqrt{d_k}$ keeps the Softmax function from saturating. Finally, with a residual connection, the attention output passes through a two-layer feed-forward network that uses ReLU as its activation function, giving the module output O.
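A single attention head can be sketched as below. The sketch assumes the standard scaled dot-product form and uses explicit projection matrices `wq`, `wk` and `wv` (hypothetical names) to produce the query, key and value matrices; multi-head attention repeats this with several projection sets and concatenates the results.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)     # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    """x: (steps, features). Returns the attended values and the attention weights."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])     # scale by sqrt of the key dimension
    weights = softmax(scores, axis=-1)          # each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))                     # 6 time steps, 4 features (toy values)
eye = np.eye(4)
out, weights = self_attention(x, np.zeros((4, 4)), eye, eye)
```

With an all-zero query projection every score is zero, so the weights are uniform and each output row is simply the mean of the value rows. This makes a quick check that the normalization behaves as expected.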
The outputs of the two parallel networks are fused through a gating mechanism: a single-layer linear network maps the branch outputs to a one-dimensional vector that serves as the gate, and the gated combination is the output value of the spatio-temporal block (ST-block).
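One plausible form of such a gate is sketched below. It assumes that the two branch outputs are concatenated, mapped by a single linear layer to one value per node, squashed with a sigmoid, and used as a convex mixing weight. The source does not give the exact formula, so the concatenation, the sigmoid and the mixing rule are assumptions, as are the names `gated_fusion`, `w_gate` and `b_gate`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_fusion(h_spatial, h_temporal, w_gate, b_gate=0.0):
    """Fuse GCN and Transformer branch outputs, each of shape (nodes, d).

    w_gate has shape (2 * d, 1), so the gate is one scalar per node.
    """
    joint = np.concatenate([h_spatial, h_temporal], axis=-1)
    gate = sigmoid(joint @ w_gate + b_gate)                  # (nodes, 1)
    return gate * h_spatial + (1.0 - gate) * h_temporal      # broadcast over d

h_s = np.ones((3, 4))                                        # stand-in for the GCN branch
h_t = np.zeros((3, 4))                                       # stand-in for the Transformer branch
fused = gated_fusion(h_s, h_t, w_gate=np.zeros((8, 1)))
```

With zero gate weights the gate is 0.5 and the two branches are averaged; training would move the gate toward whichever branch is more informative for each node.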
Finally, the output of the ST-block is fed into the prediction layer for dimension reduction and reconstruction, and the result is output. The prediction layer consists of a convolution layer and a VAE model: the convolution layer reduces the number of channels to a single channel, the VAE reconstructs the time dimension, and output data of shape (B, N, T2) are produced.
Step 3: Data acquisition. The data set is a time series synthesized from distributed sensor power data covering 19 electric fields. With one sampling point every 30 minutes, it forms a multivariate time series of 9125 time steps. The series contains 5 preset anomalous intervals, each lasting 30, 60 or 90 time steps.
Step 4: Model training. The preprocessed training data are divided according to the input time dimension T1 and the output time dimension T2 to construct training samples, using a sliding window of size T1 with a stride of 1. By default T1 is set to 6 and T2 is set to 1. To support batch processing in neural network training, a single training input is constructed with dimensions (B, C, T1, N), where B is the batch size, C corresponds to the number of approximation sequences J from the wavelet decomposition, T1 is the input time dimension, and N is the number of sequences. Similarly, the output constructed from the normalized original sequence has dimensions (B, T2, N). The number of training iterations in a single epoch depends on L, the length of the sample sequence. Finally, the constructed sample sequences are fed into the model to train the network parameters. Training uses the mean squared error (MSE) between the predicted value and the true sample value as the loss function, together with a term based on the mean and variance sampled from the latent vector.
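The sliding-window construction in Step 4 can be sketched as follows. The function names and the sample count `L - T1 - T2 + 1` are assumptions made for this illustration (the count follows from a stride of 1 when each input window must be followed by a complete output window); the source does not reproduce its own formula.

```python
import numpy as np

def make_windows(coeffs, target, t1=6, t2=1):
    """Cut wavelet channels and the normalized series into training samples.

    coeffs: (C, L, N) decomposed channels; target: (L, N) normalized raw series.
    Returns inputs of shape (S, C, T1, N) and outputs of shape (S, T2, N).
    """
    length = target.shape[0]
    n_samples = length - t1 - t2 + 1            # stride 1, assumed sample count
    inputs = np.stack([coeffs[:, s:s + t1, :] for s in range(n_samples)])
    outputs = np.stack([target[s + t1:s + t1 + t2, :] for s in range(n_samples)])
    return inputs, outputs

def batches(inputs, outputs, batch_size):
    """Yield (B, C, T1, N) inputs with matching (B, T2, N) outputs."""
    for start in range(0, len(inputs), batch_size):
        yield inputs[start:start + batch_size], outputs[start:start + batch_size]

rng = np.random.default_rng(0)
target = rng.normal(size=(20, 3))               # L = 20, N = 3 (toy sizes)
coeffs = rng.normal(size=(5, 20, 3))            # C = 5 channels
inputs, outputs = make_windows(coeffs, target)
first_in, first_out = next(batches(inputs, outputs, batch_size=4))
```

Each output window starts immediately after its input window, so the model is always asked to predict the T2 steps that follow the T1 steps it has seen.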
Step 5: Anomaly detection. After model training is complete, the test set data are fed into the anomaly detection model to obtain the model's predicted values. The anomaly score is computed on the basis of a multivariate Gaussian distribution, with the Mahalanobis distance as the measure of the anomaly score, which takes into account both the covariance among the features and the distance from the data point to the feature mean:
$$D_M(x) = \sqrt{(x - \mu)^{\top}\,\Sigma^{-1}\,(x - \mu)}$$
where $x$ is the feature vector of a data point, $\mu$ is the mean vector over the nodes, $\Sigma$ is the covariance matrix of the node vectors, and $D_M(x)$ is the Mahalanobis distance from the data point to the mean vector $\mu$.
To determine the threshold of the anomaly score, this embodiment relies on the anomaly scores of the training data and, based on chi-square distribution theory, sets a critical value associated with the number of features at the 95% confidence level. This threshold is the value the anomaly score must exceed at the given confidence level. Data points whose anomaly scores are above the threshold are identified as anomalous, as shown in Fig. 8.
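The scoring and thresholding in Step 5 can be sketched as follows. This illustration fits the mean and covariance on training prediction errors and takes the empirical 95% quantile of the training scores as the threshold. That quantile is a swapped-in simplification: a chi-square critical value, as described in the text, would apply to the squared distance and would need a statistics library. The small ridge added to the covariance, the function names and the synthetic data are also assumptions.

```python
import numpy as np

def fit_gaussian(errors):
    """errors: (points, N) prediction errors on training data."""
    mean = errors.mean(axis=0)
    # A small ridge keeps the covariance invertible (not from the source).
    cov = np.cov(errors, rowvar=False) + 1e-6 * np.eye(errors.shape[1])
    return mean, np.linalg.inv(cov)

def mahalanobis(errors, mean, cov_inv):
    """Mahalanobis distance of each row of `errors` to `mean`."""
    diff = errors - mean
    squared = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(squared, 0.0))

rng = np.random.default_rng(1)
train_err = rng.normal(size=(500, 3))           # synthetic training errors, N = 3
mean, cov_inv = fit_gaussian(train_err)
train_scores = mahalanobis(train_err, mean, cov_inv)
threshold = np.quantile(train_scores, 0.95)     # empirical 95% level

test_err = np.vstack([np.zeros(3), np.full(3, 10.0)])
is_anomaly = mahalanobis(test_err, mean, cov_inv) > threshold
```

Because the distance accounts for the covariance between sensors, a jointly unusual error pattern can be flagged even when each individual sensor error looks moderate on its own.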
Table 1 compares the performance of this embodiment with three other methods on the same multivariate time-series dataset, using three metrics, precision, recall and F1 score, to evaluate the anomaly detection performance of each method. The same threshold of 27.2036 was used for all methods, the number of training epochs was set to 20, each experiment was repeated 10 times on the same dataset, and the averaged results are reported for comparison.
Table 1: comparison results of this model with other reference models
The MSE results indicate that the model of this embodiment has significantly better prediction performance than the other three methods, and the F1 score indicates that this embodiment is more competitive at correctly identifying anomalies.
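For reference, the three evaluation metrics named above can be computed from binary anomaly labels as in the following sketch. The label vectors are made-up illustration data, not results from Table 1, and the function name `precision_recall_f1` is hypothetical.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 score for binary labels (1 = anomaly)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative labels only: 3 true positives, 1 false positive, 1 false negative.
y_true = [0, 0, 1, 1, 1, 0, 0, 1]
y_pred = [0, 1, 1, 1, 0, 0, 0, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```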
The above process is a multivariate time-series power Internet of things data anomaly detection method based on discrete wavelet decomposition and a graph convolutional neural network, with anomaly detection performed in combination with a multivariate Gaussian distribution. Discrete wavelet decomposition helps mine information at different frequencies of the original sequence, capturing both global trends and local details. Multi-scale decomposition of the sequence gives a more complete view of the time-frequency characteristics of the data and provides a richer information basis for the subsequent network input.
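To make the idea of multi-scale decomposition concrete, the sketch below applies a stacked Haar wavelet transform to a short signal. The Haar basis is an assumption chosen for simplicity; the source does not say which wavelet basis is used. The function names are hypothetical, and the sketch does not address how the resulting components are aligned to a common length before being stacked as channels.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def stacked_haar(signal, levels):
    """Repeatedly decompose the approximation branch (low-frequency part)."""
    approximations, details = [], []
    current = list(signal)
    for _ in range(levels):
        current, d = haar_dwt(current)
        approximations.append(current)  # smoother trend at each level
        details.append(d)               # local fluctuations at each level
    return approximations, details


# Toy signal (assumed); its length is a power of two so every level pairs up.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approximations, details = stacked_haar(signal, levels=2)
print(approximations[1], details[1])
```

The approximation branch carries the global trend, while the detail branches carry the local, higher-frequency variations mentioned in the text.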
The anomaly identification part calculates anomaly scores based on a multivariate Gaussian distribution, which can comprehensively characterize the joint probability distribution of multivariate data, and determines the threshold of the anomaly score by setting a confidence level. Compared with traditional threshold-setting methods, this approach is more objective and more interpretable, and provides a more reliable basis for anomaly detection.
By providing a spatio-temporal network model based on graph convolution and a Transformer, the dependencies between features and the evolution patterns of time-series measurement data can be mined adaptively, enabling accurate anomaly detection for multivariate time-series power Internet of things data. The model design takes advantage of the characteristics of the graph convolutional network (GCN) and the Transformer model. The GCN, as the spatial network, is mainly responsible for mining spatial features, while the Transformer focuses on mining temporal trends through an attention mechanism; a gating mechanism then outputs adaptive weights to fuse the two, forming a powerful encoder. A variational autoencoder (VAE) model is further introduced, which adaptively learns the distribution of the sequence data and, compared with a feed-forward network, is more conducive to accurately identifying outliers. The VAE acts as the decoder, providing stronger support for anomaly detection.
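The gated fusion of the spatial (GCN) and temporal (Transformer) branches might look roughly like the sketch below. The source only states that a gating mechanism outputs adaptive weights; the specific gate form used here, a sigmoid over a weighted sum of the two branch outputs, and the scalar parameters `w_s`, `w_t` and `bias` are hypothetical choices for illustration.

```python
import math


def sigmoid(v):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def gated_fusion(spatial, temporal, w_s, w_t, bias):
    """Fuse two feature vectors element-wise with a learned-style gate.

    gate = sigmoid(w_s * s + w_t * t + bias)
    out  = gate * s + (1 - gate) * t
    """
    fused = []
    for s, t in zip(spatial, temporal):
        gate = sigmoid(w_s * s + w_t * t + bias)
        fused.append(gate * s + (1.0 - gate) * t)
    return fused


# Assumed branch outputs for one node.
spatial = [1.0, 0.0]
temporal = [0.0, 2.0]

# With zero weights and bias the gate is 0.5, so the branches are averaged.
fused_equal = gated_fusion(spatial, temporal, w_s=0.0, w_t=0.0, bias=0.0)
# A large positive bias pushes the gate towards 1, favouring the spatial branch.
fused_spatial = gated_fusion(spatial, temporal, w_s=0.0, w_t=0.0, bias=20.0)
print(fused_equal, fused_spatial)
```

In a trained model the gate parameters would be learned, so the balance between spatial and temporal features can differ per feature and per input.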
Embodiment two:
A power Internet of things data anomaly detection system, comprising:
a data acquisition module configured to: acquire and preprocess power time-series data, and decompose it to obtain frequency components together with the corresponding number of channels, number of sequences and prediction time window length, to form a data set;
a feature mining module configured to: based on a trained anomaly detection model, capture from the obtained data set the long-range temporal dependence of each time series and the feature relationships among the multiple sequences, and reconstruct the time-series data to obtain predicted values;
an anomaly detection module configured to: determine an anomaly score from the obtained predicted values through a multivariate Gaussian distribution, using the Mahalanobis distance as the measure of the anomaly score, and treat the corresponding data point as anomalous when its anomaly score exceeds a threshold.
The system performs anomaly detection in combination with a multivariate Gaussian distribution and can adaptively mine the dependencies among features and the evolution patterns of time-series measurement data, achieving accurate anomaly detection for multivariate time-series power Internet of things data. By introducing a decomposition technique, it extracts multi-scale time-frequency features of the power data, effectively capturing both local details and global trends and providing richer multivariate time-series information.
Embodiment three:
This embodiment provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the steps of the power Internet of things data anomaly detection method described in the above embodiment.
Embodiment four:
This embodiment provides a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor; when executing the program, the processor implements the steps of the power Internet of things data anomaly detection method described in the above embodiment.
The steps involved in embodiments two to four correspond to those of embodiment one; for details, refer to the relevant description of embodiment one. The term "computer-readable storage medium" should be understood to include a single medium or multiple media containing one or more sets of instructions; it should also be understood to include any medium capable of storing, encoding or carrying a set of instructions for execution by a processor and that causes the processor to perform any one of the methods of the present invention.
The above is only a preferred embodiment of the present invention and is not intended to limit it; those skilled in the art may make various modifications and variations to the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A power Internet of things data anomaly detection method, characterized by comprising the following steps:
acquiring and preprocessing power time-series data, and decomposing it to obtain frequency components together with the corresponding number of channels, number of sequences and prediction time window length, to form a data set;
based on a trained anomaly detection model, capturing from the obtained data set the long-range temporal dependence of each time series and the feature relationships among the multiple sequences, and reconstructing the time-series data to obtain predicted values;
determining an anomaly score from the obtained predicted values through a multivariate Gaussian distribution, using the Mahalanobis distance as the measure of the anomaly score, and treating the corresponding data point as anomalous when its anomaly score exceeds a threshold.
2. The power Internet of things data anomaly detection method according to claim 1, wherein the preprocessing comprises acquiring N power time series and cleaning noisy data, the result serving as the input for training.
3. The power Internet of things data anomaly detection method according to claim 1, wherein the frequency components are obtained by a stacked discrete wavelet transform and a data set (C, T, N) is constructed, where C, the number of channels, is the number of sequences obtained after discrete wavelet decomposition, T is the length of the prediction time window, and N is the number of time series.
In the transform formula, x denotes the original multi-sequence input data, w denotes the discrete wavelet basis function, and the remaining terms are the coefficients obtained via the stacked discrete wavelet transform and the value of the high-pass filter at a given time.
4. The power Internet of things data anomaly detection method according to claim 1, wherein the anomaly detection model comprises a feature processing layer, spatio-temporal network modules and a prediction layer; the feature processing layer comprises a preprocessing layer and a convolution network; there are multiple groups of spatio-temporal network modules, each group comprising a two-layer graph convolution network and a Transformer model connected in parallel; and the prediction layer performs reconstruction through a convolution layer and a variational autoencoder, wherein the convolution layer and a linear layer each reduce the feature dimension, and the variational autoencoder reconstructs the sequence and time dimensions and outputs the prediction result.
5. The method of claim 4, wherein the spatio-temporal network module comprises a spatio-temporal position embedding layer, a graph convolution layer and a Transformer, and the time-series data is reconstructed by encoding the temporal dependence within each time series and the correlations between different pairs of time series.
6. The method of claim 4, wherein time-length information is injected into the input sequence through the spatio-temporal position embedding layer, and the position vector is constructed from a preset function.
7. The power Internet of things data anomaly detection method according to claim 4, wherein the graph convolution layer takes an adjacency matrix and a sequence input and constructs a filter in the Fourier domain; the filter captures spatial features between nodes through their first-order neighborhoods, and a graph convolution model is constructed by stacking multiple convolution layers.
8. The power Internet of things data anomaly detection method according to claim 4, wherein multi-head attention is used to model the multivariate time-series data along the time dimension and the feature dimension, respectively, to obtain the relationships among hidden temporal information.
9. The power Internet of things data anomaly detection method according to claim 1, wherein the Mahalanobis distance is used as the measure of the anomaly score, as shown in the following formula:
$D_M(x) = \sqrt{(x - \mu)^{\top} \Sigma^{-1} (x - \mu)}$, where $x$ is the feature vector of a data point, $\mu$ is the mean vector of the multiple nodes, $\Sigma$ is the covariance matrix of the node vectors, and $D_M(x)$ is the Mahalanobis distance from the data point to the mean vector $\mu$.
10. A power Internet of things data anomaly detection system, characterized by comprising:
a data acquisition module configured to: acquire and preprocess power time-series data, and decompose it to obtain frequency components together with the corresponding number of channels, number of sequences and prediction time window length, to form a data set;
a feature mining module configured to: based on a trained anomaly detection model, capture from the obtained data set the long-range temporal dependence of each time series and the feature relationships among the multiple sequences, and reconstruct the time-series data to obtain predicted values;
an anomaly detection module configured to: determine an anomaly score from the obtained predicted values through a multivariate Gaussian distribution, using the Mahalanobis distance as the measure of the anomaly score, and treat the corresponding data point as anomalous when its anomaly score exceeds a threshold.
CN202410256677.5A 2024-03-07 2024-03-07 Power Internet of things data anomaly detection method and system Pending CN117851920A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410256677.5A CN117851920A (en) 2024-03-07 2024-03-07 Power Internet of things data anomaly detection method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410256677.5A CN117851920A (en) 2024-03-07 2024-03-07 Power Internet of things data anomaly detection method and system

Publications (1)

Publication Number Publication Date
CN117851920A true CN117851920A (en) 2024-04-09

Family

ID=90529424

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410256677.5A Pending CN117851920A (en) 2024-03-07 2024-03-07 Power Internet of things data anomaly detection method and system

Country Status (1)

Country Link
CN (1) CN117851920A (en)

Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113222145A (en) * 2021-06-04 2021-08-06 西安邮电大学 MODWT-EMD-based time sequence hybrid prediction method
CN114708665A (en) * 2022-05-10 2022-07-05 西安交通大学 Skeleton map human behavior identification method and system based on multi-stream fusion
CN115018021A (en) * 2022-08-08 2022-09-06 广东电网有限责任公司肇庆供电局 Machine room abnormity detection method and device based on graph structure and abnormity attention mechanism
CN115293205A (en) * 2022-08-08 2022-11-04 杭州海康威视数字技术股份有限公司 Anomaly detection method, self-encoder model training method and electronic equipment
US20220397874A1 (en) * 2021-06-07 2022-12-15 Zhejiang University Non-Intrusive Load Decomposition Method Based on Informer Model Coding Structure
CN115510975A (en) * 2022-09-28 2022-12-23 山东省计算中心(国家超级计算济南中心) Multivariable time sequence abnormality detection method and system based on parallel Transomer-GRU
CN115618196A (en) * 2022-10-18 2023-01-17 湖北工业大学 Transformer anomaly detection method based on space-time characteristics
CN115688035A (en) * 2022-10-19 2023-02-03 江苏电力信息技术有限公司 Time sequence power data anomaly detection method based on self-supervision learning
CN115809747A (en) * 2023-02-06 2023-03-17 东南大学 Pyramid cause-and-effect network-based coupling information flow long-term prediction method
CN116186633A (en) * 2023-03-06 2023-05-30 国网江苏省电力有限公司营销服务中心 Power consumption abnormality diagnosis method and system based on small sample learning
CN116663613A (en) * 2023-04-03 2023-08-29 山东大学 Multi-element time sequence anomaly detection method for intelligent Internet of things system
CN116680105A (en) * 2023-05-31 2023-09-01 南京大学 Time sequence abnormality detection method based on neighborhood information fusion attention mechanism
CN117131452A (en) * 2023-08-29 2023-11-28 国网山东省电力公司信息通信公司 Abnormality detection method and system based on normalized flow and Bayesian network
CN117290800A (en) * 2023-11-24 2023-12-26 华东交通大学 Timing sequence anomaly detection method and system based on hypergraph attention network
CN117313015A (en) * 2023-11-28 2023-12-29 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Time sequence abnormality detection method and system based on time sequence and multiple variables
CN117574275A (en) * 2023-11-28 2024-02-20 国网山东省电力公司电力科学研究院 Abnormality detection and classification method for power system
CN117596191A (en) * 2023-12-05 2024-02-23 广东电网有限责任公司 Power Internet of things abnormality detection method, device and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
戚琦;申润业;王敬宇;: "GAD:基于拓扑感知的时间序列异常检测", 通信学报, no. 06, 24 June 2020 (2020-06-24) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination