CN111507374A - Power grid mass data anomaly detection method based on random matrix theory - Google Patents

Power grid mass data anomaly detection method based on random matrix theory

Info

Publication number
CN111507374A
Authority
CN
China
Prior art keywords
data
matrix
power grid
abnormal
window
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010090430.2A
Other languages
Chinese (zh)
Inventor
孙宬
刘文颖
王维洲
王方雨
张柏林
陈鑫鑫
邵冲
夏鹏
刘福潮
张雨薇
何欣
张尧翔
王耿
胡阳
史玉杰
朱丽萍
李潇
郇悦
张雯程
刘紫东
曾贇
杨美颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
North China Electric Power University
State Grid Gansu Electric Power Co Ltd
Electric Power Research Institute of State Grid Gansu Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
North China Electric Power University
State Grid Gansu Electric Power Co Ltd
Electric Power Research Institute of State Grid Gansu Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-02-13
Filing date
2020-02-13
Publication date
2020-08-07
Application filed by State Grid Corp of China SGCC, North China Electric Power University, State Grid Gansu Electric Power Co Ltd, Electric Power Research Institute of State Grid Gansu Electric Power Co Ltd
Priority to CN202010090430.2A
Publication of CN111507374A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E40/00 Technologies for an efficient electrical power generation, transmission or distribution
    • Y02E40/70 Smart grids as climate change mitigation technology in the energy generation sector
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Water Supply & Treatment (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Public Health (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The invention discloses a power grid mass data anomaly detection method based on random matrix theory. The method can detect the large amounts of incomplete and inconsistent dirty data caused by equipment faults, data stream acquisition errors, external noise disturbance and other factors. Three-phase voltage and current are used as analysis indexes, and random matrix theory (RMT) serves as the basis. After the source data are processed, the linear eigenvalue statistic (LES) is used as the statistical index, and the characteristics of the elements in the data matrix are analyzed continuously. Because the LES reflects certain statistical regularities of the matrix, the abnormal data content of the space-time source data matrix can be characterized by how strongly the LES fluctuates over a period of time.

Description

Power grid mass data anomaly detection method based on random matrix theory
Technical Field
The invention relates to anomaly detection in power distribution network operation data, and in particular to a power grid mass data anomaly detection method based on random matrix theory. It belongs to the technical field of power grid management and control.
Background
As the power distribution network in China continues to expand, provincial, regional and national interconnection has become an inevitable trend. At the same time, the number of grid lines keeps growing, the network structure is increasingly complex, and the volume of real-time operation data acquired by equipment is huge. Because the smart grid is affected by equipment faults, data stream acquisition errors, external noise disturbance and the like, the initially acquired mass operation data contain a large amount of incomplete and inconsistent dirty data, so data mining either cannot be performed directly or gives poor results. To improve the quality of data mining, the data must first be screened. Meanwhile, electric power big data theory is maturing, and the online monitoring system for the grid operation state can collect grid operation parameters promptly and centrally. These data exhibit the five characteristics of big data: large volume (Volume), high processing speed (Velocity), many data types (Variety), high value (Value) and high accuracy (Veracity), and they contain a large amount of valuable information about the operation of the power distribution network.
Theoretical study of abnormal data mining formally began in 1887 with a paper by the British statistician Francis Ysidro Edgeworth. As research deepened, many anomaly detection techniques appeared and were widely applied in practical engineering. However, as big data theory developed, the detection of abnormal data developed relatively late compared with data mining. The earliest methods applied to anomaly diagnosis were statistical methods, which rely on the data stream following some standard distribution, i.e. they are defined through a probability distribution. For example, Yamanishi et al. describe normal behavior with a Gaussian mixture model and diagnose anomalies by calculating the degree of deviation between the target data and the standard state of the model. Such methods have significant limitations, because it is not known in advance which standard distribution the data set under study follows. Towel et al. proposed an abnormal data mining algorithm based on a neural network, but the poor generalization ability of neural networks, the need for expert experience when constructing the network, and similar issues hinder the application of the model. Kernel-based classification is an approach developed in recent years for abnormal data mining. Its main idea is to map the target source data into a high-dimensional feature space through a functional relation and to build a classification model from a separating hyperplane in that space, so as to distinguish abnormal data.
Research on abnormal data mining started relatively late in China, but many important results have been obtained in recent years. Automatic monitoring of abnormal states in the mass data flow of the power grid is mainly realized by setting thresholds or by theories such as wavelet analysis, artificial neural networks and support vector machines.
In summary, many abnormal data mining algorithms have been proposed at home and abroad, but they suffer from low detection speed, low accuracy and complicated model building. To address these problems, a power grid mass data anomaly detection method based on random matrix theory is proposed. It has engineering value and research interest, and offers guidance to management and decision-making departments.
Disclosure of Invention
The invention mainly addresses the technical problem of abnormal data in the mass voltage and current operation data of the power grid. It provides a power grid mass data anomaly detection method based on random matrix theory that can identify anomalies in mass voltage and current data accurately and quickly, avoids the problems of complex or inaccurate model design, and offers high real-time performance and accuracy.
To solve the above technical problem, the invention adopts the following technical scheme:
Step 1: Determine the area to be detected and obtain, in real time through a supervisory control and data acquisition (SCADA) system, the mass operation data of the feeder and its branch lines in that area. Select three-phase voltage and three-phase current from the operation data as analysis samples; the time scales of the samples must be consistent. Taking the three-phase voltages and three-phase currents of a 380 V feeder and its branch lines in a certain area as the analysis samples, form the space-time source data matrix D, whose form is shown in Table 1.
TABLE 1 spatio-temporal source data matrix D form
Figure RE-RE-GDA0002562654890000031
Written in matrix form as
Figure RE-RE-GDA0002562654890000041
Here D_ij (i = 1, 2, …, p; j = 1, 2, …, n) is the three-phase voltage or three-phase current value of the i-th distribution transformer at time j.
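As an illustration of step 1, the following is a minimal sketch of how such a space-time matrix could be assembled with NumPy. The input layout (arrays of shape (p, 3, n) for p transformers, 3 phases and n sampling times) and the function name build_source_matrix are assumptions made for this example; the patent does not specify a data layout.

```python
import numpy as np

def build_source_matrix(voltages, currents):
    """Stack per-transformer three-phase voltages and currents into D.

    voltages, currents: arrays of shape (p, 3, n), i.e. p transformers,
    3 phases, n sampling times (hypothetical input layout).
    Returns D with shape (6p, n): each transformer contributes 6 rows.
    """
    voltages = np.asarray(voltages, dtype=float)
    currents = np.asarray(currents, dtype=float)
    if (voltages.shape != currents.shape or voltages.ndim != 3
            or voltages.shape[1] != 3):
        raise ValueError("expected two arrays of shape (p, 3, n)")
    p, _, n = voltages.shape
    # Join along the phase axis to get (p, 6, n), then flatten to (6p, n).
    stacked = np.concatenate([voltages, currents], axis=1)
    return stacked.reshape(6 * p, n)
```

With this layout, rows 0 to 2 hold the phase voltages of the first transformer and rows 3 to 5 its phase currents, followed by the next transformer.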
Step 2: The matrix obtained in step 1 is
Figure RE-RE-GDA0002562654890000042
It can be seen that D is a real matrix of order 6p × n. Let D_w ∈ R^(p×n) denote the window data matrix. D_w is standardized according to
Figure RE-RE-GDA0002562654890000043
and the standardized matrix is denoted by
Figure RE-RE-GDA0002562654890000044
To better satisfy the RMT analysis conditions, a moderate amount of Gaussian white noise is usually added, because of the weak correlation between matrix rows and of analysis errors; that is,
Figure RE-RE-GDA0002562654890000045
is the noise-added matrix, where W ∈ R^(p×n) is the noise matrix and the signal-to-noise ratio is usually taken to be large.
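The standardization and noise formulas of step 2 are given as images in the source and are not reproduced here. The sketch below shows one common way to carry out both operations; it is an assumption, not the patent's exact formula. Rows are standardized to zero mean and unit standard deviation, and the signal-to-noise ratio is interpreted as a linear power ratio. The function names are hypothetical.

```python
import numpy as np

def standardize_rows(Dw):
    """Scale every row of Dw to zero mean and unit standard deviation."""
    Dw = np.asarray(Dw, dtype=float)
    mean = Dw.mean(axis=1, keepdims=True)
    std = Dw.std(axis=1, keepdims=True)
    std = np.where(std == 0, 1.0, std)  # constant rows map to all zeros
    return (Dw - mean) / std

def add_white_noise(X, snr, rng=None):
    """Add Gaussian white noise W with signal power / noise power = snr.

    Treating SNR as a linear power ratio is an assumption of this sketch.
    """
    rng = np.random.default_rng() if rng is None else rng
    signal_power = np.mean(X ** 2)
    noise_std = np.sqrt(signal_power / snr)
    W = rng.normal(0.0, noise_std, size=X.shape)
    return X + W
```

A large snr keeps the noise small relative to the data, in line with the remark that the signal-to-noise ratio is usually taken to be large.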
Step 3: Set the parameters of the moving window method: the sampling time t_j, a data sampling interval of 15 min (96 points sampled per day), a moving step of 1, and the window size. To meet the application conditions of RMT, the window data taken from D to form the data matrix should have a length greater than its width, that is, more sampling times (columns) than measured quantities (rows), so that c = p/n lies in (0, 1).
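The moving window of step 3 can be sketched as a simple generator. This is an illustrative sketch; the function name moving_windows and the choice to label each window by the index just past its last column are assumptions.

```python
import numpy as np

def moving_windows(D, width, step=1):
    """Yield (end_index, window) pairs; each window has shape (rows, width).

    end_index is the column index just past the last column of the window.
    """
    D = np.asarray(D)
    rows, n = D.shape
    if width <= rows:
        # RMT condition from the text: more columns than rows in each window.
        raise ValueError("window width should exceed the number of rows")
    for end in range(width, n + 1, step):
        yield end, D[:, end - width:end]
```

With a step of 1, a matrix with n columns yields n - width + 1 windows.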
Step 4: Compute the covariance matrix
Figure RE-RE-GDA0002562654890000046
and calculate its eigenvalues. For convenience of analysis, the eigenvalues are normalized to lie in (0, 1), that is,
Figure RE-RE-GDA0002562654890000047
where λ(i) is the computed eigenvalue and p is the dimension of the covariance matrix.
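A possible implementation of step 4 is sketched below. Because the covariance and normalization formulas are images in the source, two assumptions are made and should be checked against the original figures: the sample covariance is taken as S = X Xᵀ / n for a standardized p × n window X, and the eigenvalues are normalized by dividing by their sum so that they lie in [0, 1].

```python
import numpy as np

def normalized_eigenvalues(X):
    """Eigenvalues of the sample covariance of X, scaled to sum to 1.

    X: standardized window of shape (p, n). Both the covariance form
    and the sum-normalization are assumptions of this sketch.
    """
    X = np.asarray(X, dtype=float)
    p, n = X.shape
    S = X @ X.T / n                    # p x p sample covariance matrix
    lam = np.linalg.eigvalsh(S)        # real eigenvalues, ascending order
    lam = np.clip(lam, 0.0, None)      # remove tiny negative round-off
    return lam / lam.sum()
```

Dividing by the sum has the convenient side effect that the normalized eigenvalues can be read as a discrete probability distribution, which suits an entropy-type test function.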
Step 5: When p, n → ∞ and c = p/n ∈ (0, 1), the empirical spectral distribution (ESD) of the covariance matrix S converges, according to the Marchenko-Pastur (M-P) law, to the following probability density function (PDF):
Figure RE-RE-GDA0002562654890000051
The analysis shows that the ESD of S does not obey the M-P law when abnormal data exist.
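For reference, the standard Marchenko-Pastur density for a ratio c in (0, 1) and entry variance σ² is f(x) = sqrt((b − x)(x − a)) / (2π c σ² x) on a ≤ x ≤ b, with a = σ²(1 − √c)² and b = σ²(1 + √c)², and zero elsewhere. The sketch below evaluates this textbook form; the patent's own figure may use different notation, and the names mp_density and sigma2 are assumptions.

```python
import numpy as np

def mp_density(x, c, sigma2=1.0):
    """Standard Marchenko-Pastur density for ratio c in (0, 1)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    a = sigma2 * (1 - np.sqrt(c)) ** 2   # lower edge of the support
    b = sigma2 * (1 + np.sqrt(c)) ** 2   # upper edge of the support
    inside = (x > a) & (x < b)
    out = np.zeros_like(x)
    xi = x[inside]
    out[inside] = np.sqrt((b - xi) * (xi - a)) / (2 * np.pi * c * sigma2 * xi)
    return out
```

Comparing a histogram of the eigenvalues of S with this curve is one way to see whether a window departs from the M-P law.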
Step 6: Calculate the linear eigenvalue statistic (LES). From the eigenvalues obtained in step 4 and the conclusion of step 5, the ESD of data containing anomalies differs from the ESD under normal conditions, so it reflects to some extent the behavior (normal or abnormal) of the data in the matrix. The LES is defined as
Figure RE-RE-GDA0002562654890000052
where λ_i (i = 1, 2, …, n) are the n eigenvalues of the covariance matrix S and
Figure RE-RE-GDA0002562654890000053
is a test function; typical choices are the Chebyshev polynomials, Shannon entropy, the Wasserstein distance, etc. The invention uses Shannon entropy, whose calculation formula is
Figure RE-RE-GDA0002562654890000054
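A linear eigenvalue statistic sums a test function over the eigenvalues. The sketch below assumes the usual Shannon entropy form, φ(λ) = −λ ln λ, applied to normalized eigenvalues; the exact formula in the source is an image, so this form and the name shannon_les are assumptions.

```python
import numpy as np

def shannon_les(norm_eigs):
    """LES with the test function phi(x) = -x * ln(x) (assumed form)."""
    lam = np.asarray(norm_eigs, dtype=float)
    lam = lam[lam > 0]          # 0 * ln(0) is taken as 0
    return float(-np.sum(lam * np.log(lam)))
```

For eigenvalues normalized to sum to 1, this value is largest when all eigenvalues are equal and drops toward 0 as a single eigenvalue dominates.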
Step 7: Taking the 380 V feeder L1 as the analysis object, calculate the LES value at each time in turn over a period, draw the LES-t curve, analyze and explain from the curve the characteristics of data containing anomalies, and verify them with an actual engineering example.
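Steps 2 to 7 can be chained into a single loop that produces the LES-t curve. The following is a compact sketch under the same assumptions as the individual steps: row standardization, noise with variance 1/snr, covariance X Xᵀ / width, eigenvalues normalized by their sum, and a Shannon entropy test function. The function name les_curve and its default parameters are hypothetical.

```python
import numpy as np

def les_curve(D, width, snr=1e4, step=1, seed=0):
    """Return (times, les) for a moving window over the columns of D."""
    rng = np.random.default_rng(seed)
    D = np.asarray(D, dtype=float)
    rows, n = D.shape
    times, values = [], []
    for end in range(width, n + 1, step):
        W = D[:, end - width:end]
        # Standardize each row of the window.
        std = W.std(axis=1, keepdims=True)
        X = (W - W.mean(axis=1, keepdims=True)) / np.where(std == 0, 1.0, std)
        # Add a small amount of Gaussian white noise (unit signal power).
        X = X + rng.normal(0.0, np.sqrt(1.0 / snr), size=X.shape)
        # Eigenvalues of the sample covariance, normalized to sum to 1.
        lam = np.clip(np.linalg.eigvalsh(X @ X.T / width), 0.0, None)
        lam = lam / lam.sum()
        lam = lam[lam > 0]
        times.append(end)
        values.append(float(-np.sum(lam * np.log(lam))))
    return np.array(times), np.array(values)
```

Plotting values against times, for example with matplotlib, gives the LES-t curve whose fluctuation the method inspects.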
Drawings
FIG. 1 is a specific flowchart of a method for identifying abnormal operation data of a power grid according to the present invention;
FIG. 2 shows the proportion of abnormal data contained in the case used in the present invention;
FIG. 3 shows the LES-t simulation curves obtained for different proportions of abnormal data in the present invention;
FIG. 4 shows the LES-t curve recovered by the present invention after part of the abnormal data has been corrected;
Detailed Description
The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings, so that the advantages and features of the invention can be more readily understood by those skilled in the art and the scope of protection of the invention is defined more clearly.
The invention provides a power grid mass data anomaly detection method based on random matrix theory. Mass real-time voltage and current operation data measured by all distribution transformers under a 380 V feeder and its branch lines in a certain area are selected for simulation verification. The specific implementation is as follows:
1. A 380 V feeder in the area is selected. In this example there are 10 distribution transformers under the selected feeder, and one week of operation data is used. The equipment samples 96 data points per day, so the space-time source data matrix D generated with three-phase voltage and three-phase current as analysis indexes is a matrix of order 60 × 672.
2. Following steps 2 and 3 of the technical scheme, the window parameters are set first: the starting point is defined as time zero, the data sampling interval is 15 min (96 points per day), the moving step is 1, and the window size is 60 × 192. That is, window data are taken from D to form a data matrix of order 60 × 192.
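The dimensions quoted in items 1 and 2 follow from simple arithmetic, which the short check below reproduces. The number of window positions and the ratio c are derived here from the stated values and are not given explicitly in the source.

```python
# Dimensions used in the embodiment (input values taken from the text).
transformers = 10
quantities_per_transformer = 6        # 3 phase voltages + 3 phase currents
samples_per_day = 24 * 60 // 15       # 15 min interval gives 96 samples
days = 7

rows = transformers * quantities_per_transformer   # 60
cols = samples_per_day * days                      # 672

window_cols = 192                                  # two days of samples
step = 1
num_windows = (cols - window_cols) // step + 1     # 481 window positions
c = rows / window_cols                             # 0.3125, inside (0, 1)

print(rows, cols, num_windows, c)
```

The ratio c = 60/192 = 0.3125 satisfies the condition c ∈ (0, 1) required for the M-P law.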
3. According to step 4 of the technical scheme, a covariance matrix S(t_j) is calculated for each sampling time; S(t_j) is a matrix of order 60 × 60. The eigenvalues of the covariance matrix at a certain time and the results after normalization are shown in the following table.
TABLE 2
Figure RE-RE-GDA0002562654890000061
4. Shannon entropy is selected as the test function, and the LES is calculated according to the formula
Figure RE-RE-GDA0002562654890000062
As the power distribution network operates, the operation data collected by the supervisory control and data acquisition system keep growing with operating time, and so does the abnormal data they contain. The degree of order of the data distribution decreases with operating time, and it becomes increasingly difficult to detect abnormal data in large volumes of voltage and current data. The proportions of abnormal data contained are shown in FIG. 2.
The LES-t simulation curves obtained for the three cases of abnormal data content are shown in FIG. 3.
It can be seen that when no abnormal data are contained, the correlation between the operation data remains essentially unchanged; the LES reflects this property to a certain extent, and the LES curve in the figure changes gently. The higher the abnormal data content, the larger the fluctuation amplitude of the LES-t curve. In graph (b), the curve changes abruptly at sampling time t_i = 60. From the parameters set for the moving window method, it can be roughly determined that the abnormal data lie in the 60th column of the space-time source data matrix. The operation parameters near the 60th column are then corrected, and the recomputed LES-t curve is shown in FIG. 4.
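The source locates the abrupt change by inspecting the curve. As one possible way to automate that inspection, the sketch below flags points where the step between consecutive LES values is far larger than the typical step, using the median absolute deviation as a robust scale. The rule, the threshold k and the name find_jumps are assumptions for illustration and are not part of the patented method.

```python
import numpy as np

def find_jumps(les, k=6.0):
    """Return indices where the LES curve changes abruptly.

    A point is flagged when its first difference deviates from the median
    difference by more than k times the median absolute deviation (MAD).
    """
    les = np.asarray(les, dtype=float)
    diff = np.diff(les)
    med = np.median(diff)
    mad = np.median(np.abs(diff - med))
    scale = mad if mad > 0 else np.finfo(float).eps
    # diff[i] compares les[i + 1] with les[i], so shift indices by one.
    return np.flatnonzero(np.abs(diff - med) > k * scale) + 1
```

A flagged index still has to be mapped back to columns of the source matrix through the window parameters, as the text does for the 60th column.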
From FIG. 4 it can be seen that the fluctuation amplitude of the LES-t curve is now significantly smaller than before the correction, which also confirms that the fluctuation amplitude of the LES is related to the proportion of abnormal data contained.
6. The above description is only an embodiment of the present invention, and is not intended to limit the scope of the present invention, and all equivalent structures or equivalent flow transformations made by using the contents of the description and drawings of the present invention, or directly or indirectly applied to other related technical fields, are similarly included in the patent protection of the present invention.

Claims (4)

1. A power grid mass data anomaly detection method based on random matrix theory, characterized in that abnormal data detection is carried out on the mass voltage and current operation data of the power grid by a fully data-driven method, comprising the following steps:
Step 1: Select the mass voltage and current operation data, over a period of time t, of all distribution transformers under a power distribution network feeder and its branch lines; select three-phase voltage and three-phase current as analysis indexes; and form the space-time source data matrix D.
Step 2: Set the parameters of the moving window method: sampling time t_j, step size, window size and signal-to-noise ratio (SNR). Obtain window data from D to form the data matrix D_w(t_j).
Step 3: The elements of the data matrix D_w(t_j) are standardized according to
$$\tilde{D}_{ij} = \frac{D_{ij} - \bar{D}_i}{\sigma_i}$$
to obtain the standardized matrix $\tilde{D}_w(t_j)$, where $D_{ij}$ is the value of the element in row i and column j of the matrix, $\bar{D}_i$ and $\sigma_i$ are the mean and standard deviation of the elements of row i, and $\tilde{D}_{ij}$ is the standardized value of the corresponding element.
Step 4: Compute the covariance matrix
Figure RE-FDA0002536969440000015
Step 5: Calculate the eigenvalues of S(t_j). In preparation for calculating the LES in the next step, normalize the eigenvalues to lie in (0, 1).
Step 6: The test function is Shannon entropy, and the LES is calculated as
Figure RE-FDA0002536969440000016
Step 7: Draw the LES-t curves through simulation, and analyze and compare the curve characteristics in the two cases, with and without abnormal data.
2. The power grid mass data anomaly detection method based on random matrix theory as claimed in claim 1, wherein the space-time source data matrix D in step 1 is obtained from the mass voltage and current operation data of the power grid; the time series data are analyzed in step 2 by the moving window method, with a window of the same dimension as the data matrix selected to meet the RMT application conditions; and, according to the sampling time t_j, the linear data index of the moving-window data is calculated in turn to indicate the behavior (abnormal or not) of the mass voltage and current operation data of the power grid.
3. The power grid mass data anomaly detection method based on random matrix theory as claimed in claim 1, wherein, for the standardized matrix
Figure RE-FDA0002536969440000021
obtained in step 3, a signal-to-noise ratio (SNR) is introduced to eliminate the correlation between rows, so as to better satisfy the analysis conditions of RMT.
4. The power grid mass data anomaly detection method based on random matrix theory as claimed in claim 1, wherein the detection of abnormal data by the moving window method in step 5 is a process of moving the window according to the set parameters: the LES of the data matrix obtained from each sampling window is calculated in turn, the LES-t curve is drawn, and the fluctuation range of the LES over a period of time is compared to judge whether abnormal data exist; and wherein step 7 obtains, through simulation, the LES-t curve for an engineering data example and judges the proportion of abnormal data contained in the space-time source data matrix from the smoothness of the curve.
CN202010090430.2A 2020-02-13 2020-02-13 Power grid mass data anomaly detection method based on random matrix theory Pending CN111507374A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010090430.2A CN111507374A (en) 2020-02-13 2020-02-13 Power grid mass data anomaly detection method based on random matrix theory

Publications (1)

Publication Number Publication Date
CN111507374A true CN111507374A (en) 2020-08-07

Family

ID=71863925

Country Status (1)

Country Link
CN (1) CN111507374A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112348644A (en) * 2020-11-16 2021-02-09 上海品见智能科技有限公司 Abnormal logistics order detection method by establishing monotonous positive correlation filter screen
CN112348644B (en) * 2020-11-16 2024-04-02 上海品见智能科技有限公司 Abnormal logistics order detection method by establishing monotonic positive correlation filter screen
WO2023241326A1 (en) * 2022-06-14 2023-12-21 无锡隆玛科技股份有限公司 Power grid anomaly detection method based on maximum eigenvalue rate of sample covariance matrix

Similar Documents

Publication Publication Date Title
CN105425779B (en) ICA-PCA multi-state method for diagnosing faults based on local neighborhood standardization and Bayesian inference
CN110336534B (en) Fault diagnosis method based on photovoltaic array electrical parameter time series feature extraction
CN105700518B (en) A kind of industrial process method for diagnosing faults
CN109193650B (en) Power grid weak point evaluation method based on high-dimensional random matrix theory
CN109460574A (en) A kind of prediction technique of aero-engine remaining life
CN109816031B (en) Transformer state evaluation clustering analysis method based on data imbalance measurement
CN110458230A (en) A kind of distribution transforming based on the fusion of more criterions is with adopting data exception discriminating method
CN109389325B (en) Method for evaluating state of electronic transformer of transformer substation based on wavelet neural network
CN117290802B (en) Host power supply operation monitoring method based on data processing
CN115409131B (en) Production line abnormity detection method based on SPC process control system
WO2023241326A1 (en) Power grid anomaly detection method based on maximum eigenvalue rate of sample covariance matrix
CN114062850B (en) Double-threshold power grid early fault detection method
CN116610998A (en) Switch cabinet fault diagnosis method and system based on multi-mode data fusion
CN111507374A (en) Power grid mass data anomaly detection method based on random matrix theory
CN111797533B (en) Nuclear power device operation parameter abnormity detection method and system
CN110751217B (en) Equipment energy consumption duty ratio early warning analysis method based on principal component analysis
CN110632455A (en) Fault detection and positioning method based on distribution network synchronous measurement big data
CN105516206A (en) Network intrusion detection method and system based on partial least squares
CN112947649B (en) Multivariate process monitoring method based on mutual information matrix projection
CN114597886A (en) Power distribution network operation state evaluation method based on interval type two fuzzy clustering analysis
CN107274025B (en) System and method for realizing intelligent identification and management of power consumption mode
CN117491813A (en) Insulation abnormality detection method for power battery system of new energy automobile
CN116990633A (en) Fault studying and judging method based on multiple characteristic quantities
CN106444706A (en) Industrial process fault detection method based on data neighborhood feature preservation
CN115828114A (en) Energy consumption abnormity detection method for aluminum profile extruder

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200807