CN115357462A - Power dispatching abnormal data rapid detection method based on deep learning - Google Patents

Power dispatching abnormal data rapid detection method based on deep learning

Info

Publication number
CN115357462A
Authority
CN
China
Prior art keywords
data
model
deep learning
telemetering
abnormal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210972874.8A
Other languages
Chinese (zh)
Inventor
袁伟
杜凡
高道春
莫熙
张馨介
朱余启
叶小虎
赵玉凯
王刚
王婧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yuxi Power Supply Bureau of Yunnan Power Grid Co Ltd
Original Assignee
Yuxi Power Supply Bureau of Yunnan Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yuxi Power Supply Bureau of Yunnan Power Grid Co Ltd
Priority to CN202210972874.8A
Publication of CN115357462A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/34Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3447Performance evaluation by modeling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Biology (AREA)
  • Quality & Reliability (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Remote Monitoring And Control Of Power-Distribution Networks (AREA)

Abstract

The invention relates to the technical field of abnormal data detection, and in particular to a deep-learning-based method for rapidly detecting abnormal data in power dispatching. The method comprises the following steps: applying a threshold model after data is acquired; detecting with a deep-learning-based measurement data detection method, which comprises establishing a model and detecting data; constructing a long short-term memory (LSTM) model to detect abnormal data; and checking data rationality based on a CIM model. By adopting a deep learning algorithm and taking characteristics such as power grid operation data, equipment parameters and operation mode as constraint conditions, the design can rapidly detect abnormal power dispatching data and push the detection result to automation operators as an alarm. Rapid detection of abnormal power dispatching data greatly improves the ability to perceive and quickly process abnormal data, allows dispatchers to grasp power grid operation data intuitively and comprehensively, and supports efficient problem discovery and analysis in large-scale online intelligent inspection of substations.

Description

Power dispatching abnormal data rapid detection method based on deep learning
Technical Field
The invention relates to the technical field of abnormal data detection, in particular to a method for quickly detecting abnormal data in power dispatching based on deep learning.
Background
Rapid detection of abnormal power dispatching data supports efficient problem discovery and analysis in large-scale online intelligent inspection of substations.
Conventional processing mainly monitors and raises alarms on jump (sudden-change) data, and abnormal data is usually traced and corrected only after it has been found during operation analysis or index calculation. Traditional methods for detecting abnormal power grid operation data are mainly based on state estimation and include the residual search method, the non-quadratic criterion method, the zero-residual method, the estimation-identification method and the like. These methods have the following disadvantages: residual contamination and residual swamping may occur, causing missed or false detections and degrading identification. Because a nonlinear residual equation is used, multiple state estimations are needed during identification, so the computational cost is very high; in addition, when multiple bad data points are present, these methods often misidentify them.
After the system acquires data, the first step is threshold model processing. The threshold model is built from prescribed specifications combined with experience accumulated in production work. For example, for buses rated 35 kV and above the permitted bus-voltage deviation is [+7%, -3%]; a 35 kV bus measuring 38 kV exceeds the limit, so the data is judged to be abnormal. When abnormal data is detected, the user is notified promptly. In production, however, some abnormal data cannot be judged by the threshold model. For example, telemetry values may appear normal while the nearby switches are open, in which case the reported telemetry data is abnormal. A deep learning model can raise abnormal data alarms for situations including but not limited to this one.
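For illustration only, the threshold check in the example above could be sketched as follows. The function name and parameters are hypothetical, and reading the stated band as -3% to +7% of the nominal voltage is an assumption.

```python
def voltage_within_threshold(measured_kv, nominal_kv, upper_pct=7.0, lower_pct=-3.0):
    """Return True if the measured bus voltage lies inside the allowed band.

    The band is assumed to be [lower_pct, upper_pct] percent of nominal.
    """
    upper = nominal_kv * (1 + upper_pct / 100.0)
    lower = nominal_kv * (1 + lower_pct / 100.0)
    return lower <= measured_kv <= upper


# A 35 kV bus reading 38 kV is above the upper limit of about 37.45 kV.
print(voltage_within_threshold(38.0, 35.0))  # prints False
print(voltage_within_threshold(36.0, 35.0))  # prints True
```

A check of this kind only looks at one value at a time, which is why it cannot catch the switch-open case described above.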
In view of this, we propose a power scheduling abnormal data rapid detection method based on deep learning.
Disclosure of Invention
The invention aims to provide a deep-learning-based method for rapidly detecting abnormal data in power dispatching, so as to solve the problems described in the Background.
To solve the above technical problem, a first object of the invention is to provide a deep-learning-based method for rapidly detecting abnormal data in power dispatching, comprising the following steps:
S1, after data is acquired, threshold model processing is carried out;
S2, for abnormal data including but not limited to data that the threshold model cannot judge, a deep-learning-based measurement data detection method is used for detection:
S2.1, establishing a model: constructing a supervised deep learning model, which comprises a telemetry data feature model, a telemetry data classification model, and a mutual verification process between the two;
S2.2, data detection: with the constructed deep learning model as the core, the telemetry data is examined to find abnormal measurement data;
S3, constructing a long short-term memory (LSTM) model to detect abnormal data, which addresses the vanishing gradient and exploding gradient problems in training on long time series;
S4, checking data rationality based on the CIM model: according to the characteristics of the power grid dispatching service and based on the CIM model, the method monitors and detects indexes including but not limited to the following seven: total station voltage loss, bus voltage abnormality, main transformer input/output imbalance, remote signaling signal abnormality, long-distance reactive power transmission, data not refreshing, and channel interruption.
As a further improvement of the technical solution, in S1 the acquired data mainly refers to telemetry data, the most important of the "four remote" data types in power dispatching automation. Telemetry data is real-time power grid operation data collected through the telemetry function of telecontrol devices in the power system and transmitted by those devices to the power dispatching master station, where it serves as the basic input of the power dispatching automation system for monitoring the operating state of the power system.
The "four remote" functions are remote signaling, telemetry, remote control and remote regulation, realized mainly through cooperation between the remote terminal unit (RTU) of a substation or power plant and the power dispatching master station; wherein:
remote signaling mainly collects and transmits relay protection action information, circuit breaker status information and the like of the power system; telemetry mainly collects and transmits real-time operating data of the power system; remote control mainly sends commands from the dispatching center to change the state of operating equipment; remote regulation mainly sends commands from the dispatching center to remotely adjust the operating parameters of a power plant or substation.
As a further improvement of the technical solution, in S2.1, the specific method for establishing the model includes the following steps:
S2.1.1, establishing a telemetry data feature model:
S2.1.1.1, acquiring historical telemetry data;
S2.1.1.2, preprocessing the historical telemetry data;
S2.1.1.3, extracting data features from the preprocessed data;
S2.1.1.4, constructing the telemetry data feature model from the extracted data features;
S2.1.2, establishing a telemetry data classification model:
S2.1.2.1, acquiring telemetry data after business processing;
S2.1.2.2, preprocessing the business-processed telemetry data;
S2.1.2.3, extracting data set features from the preprocessed data;
S2.1.2.4, constructing the telemetry data classification model from the extracted data set features;
S2.1.3, mutual verification of the telemetry data feature model and the telemetry data classification model:
S2.1.3.1, constructing a deep learning model;
S2.1.3.2, with the deep learning model as the core, performing data feature extraction, data set feature extraction, and business scenario training and verification operations respectively;
S2.1.3.3, the telemetry data feature model verifies the telemetry data classification model through the business scenario training and verification operations.
As a further improvement of the technical solution, in S2.2, the specific method for detecting data includes the following steps:
S2.2.1, acquiring real-time telemetry data;
S2.2.2, preprocessing the real-time telemetry data;
S2.2.3, importing the preprocessed real-time telemetry data into the telemetry data feature model;
S2.2.4, performing data feature analysis;
S2.2.5, performing classification feature analysis;
S2.2.6, importing the classification feature analysis result into the telemetry data classification model;
S2.2.7, outputting the abnormality detection result after analysis and judgment by the model.
As a further improvement of the technical solution, in S2.1.1.2, S2.1.2.2 and S2.2.2, the data preprocessing methods include, but are not limited to, data cleaning, data dimensionality reduction and data normalization; wherein:
data cleaning fills in missing values in the data set to be processed, eliminates noise, removes invalid data and resolves inconsistencies in the data set;
data dimensionality reduction reduces the dimensionality of the data set through methods such as data compression and data conversion, so as to speed up data mining, calculation and analysis;
data normalization scales the data so that it falls within a small specified interval.
As a further improvement of the technical solution, the data cleaning methods include, but are not limited to, missing value completion and noise value processing; wherein:
missing value completion fills missing values using a fixed-value substitution method, an interpolation method or a regression method;
the fixed-value substitution method mainly substitutes the mean, median or mode; the algorithm is simple and easy to implement;
interpolation methods include, but are not limited to, linear interpolation, polynomial interpolation and spline interpolation; the algorithms are simple but involve a certain amount of computation;
in noise value processing, identifying the noise values is the most important task; identification methods include, but are not limited to, special value identification, threshold judgment and identification based on a deep learning network.
As a further improvement of the technical solution, the formulas of the interpolation methods are as follows:
spline interpolation method:
second-order spline curve: $f(x) = ax^2 + bx + c$;
third-order spline curve: $f(x) = ax^3 + bx^2 + cx + d$;
the polynomial interpolation method uses the Lagrange algorithm to calculate the missing value; for known points $(x_0, y_0), \dots, (x_n, y_n)$ the formula is:
$$L(x) = \sum_{i=0}^{n} y_i \prod_{j=0,\, j \neq i}^{n} \frac{x - x_j}{x_i - x_j}$$
after expansion, a polynomial is obtained:
$$L(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$$
the linear interpolation method mainly comprises step interpolation (nearest, zero) and linear interpolation (linear), with the formula: $f(x) = ax + b$.
As a further improvement of the technical solution, the data normalization methods include, but are not limited to, min-max normalization and Z-score normalization, calculated respectively as:
min-max normalization:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
Z-score normalization:
$$x' = \frac{x - \mu}{\sigma}$$
where $\mu$ is the mean and $\sigma$ is the standard deviation of the data:
$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}$$
As a further improvement of the technical solution, in S2.1.1.4 and S2.1.2.4, directly feeding the data features and data set features extracted in S2.1.1.3 and S2.1.2.3 to the deep learning algorithm for model training or model verification could mislead the training result. To avoid this, the boundaries before and after a change of the power grid operation mode need to be identified by analyzing the data features; analysis methods for boundary identification include, but are not limited to, cluster analysis and similarity analysis; wherein:
cluster analysis uses an algorithm, such as K-means clustering, to group data with similar features in a large volume of raw data into one class;
similarity analysis compares sample features by calculating the distance between the features of the samples to be evaluated, for example with the Euclidean distance.
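As a minimal sketch of how clustering with a Euclidean distance could locate an operation-mode boundary, the following pure-Python example runs a plain K-means on per-day feature vectors and reports the indices where the cluster label changes. The feature vectors, the choice of two clusters and the caller-supplied initial centers are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, centers, iterations=10):
    """Plain K-means with caller-supplied initial centers; returns (labels, centers)."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(len(centers)), key=lambda k: euclidean(p, centers[k]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center unchanged
        centers = new_centers
    return labels, centers


# Hypothetical features (mean load, peak load) for six consecutive days.
features = [(10.0, 15.0), (10.5, 15.5), (9.8, 14.9),
            (20.0, 30.0), (20.4, 30.2), (19.9, 29.7)]
labels, _ = kmeans(features, centers=[features[0], features[-1]])
boundaries = [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]
print(labels, boundaries)  # prints [0, 0, 0, 1, 1, 1] [3]
```

Samples on either side of a detected boundary could then be trained or verified separately, so that data from different operation modes is not mixed.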
As a further improvement of the technical solution, in S3 the specific method for detecting abnormal data with the long short-term memory model is as follows: the deep-learning-based measurement data detection method uses the long short-term memory model to predict future data, takes the predicted data as a reference, evaluates the deviation of the data acquired at the corresponding moment from that reference, and detects abnormal measurement data by combining the deviation with a threshold.
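The comparison step of S3 can be sketched as below. This is only an illustration of the deviation-plus-threshold idea: the forecast list stands in for the output of a trained long short-term memory model, and the function name and the 10% relative deviation limit are assumptions.

```python
def detect_deviations(observed, predicted, max_rel_dev=0.1):
    """Flag samples whose relative deviation from the predicted reference exceeds max_rel_dev."""
    flags = []
    for obs, ref in zip(observed, predicted):
        scale = max(abs(ref), 1e-9)  # guard against division by zero
        flags.append(abs(obs - ref) / scale > max_rel_dev)
    return flags


predicted = [100.0, 102.0, 104.0, 106.0]  # stand-in for model forecasts
observed = [101.0, 101.5, 150.0, 105.0]   # values acquired at the same moments
print(detect_deviations(observed, predicted))  # prints [False, False, True, False]
```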
The second objective of the present invention is to provide an operation platform device for a deep learning-based power scheduling abnormal data rapid detection method, which includes a processor, a memory, and a computer program stored in the memory and running on the processor, where the processor is configured to implement the steps of the deep learning-based power scheduling abnormal data rapid detection method when executing the computer program.
It is a further object of the present invention to provide a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the steps of the above-mentioned method for fast detecting power scheduling abnormal data based on deep learning.
Compared with the prior art, the invention has the beneficial effects that:
1. the deep-learning-based method for rapidly detecting abnormal data in power dispatching adopts a deep learning algorithm and takes characteristics such as power grid operation data, equipment parameters and operation mode as constraint conditions, so that abnormal power dispatching data can be detected rapidly and the detection result is pushed to automation operators as an alarm;
2. by rapidly detecting abnormal power dispatching data, the method greatly improves the ability to perceive and quickly process abnormal data, allows dispatchers to grasp power grid operation data intuitively and comprehensively, and supports efficient problem discovery and analysis in large-scale online intelligent inspection of substations.
Drawings
FIG. 1 is a flow diagram of an exemplary overall method of the present invention;
FIG. 2 is a functional block diagram of an exemplary method of the present invention;
FIG. 3 is a functional block diagram of an exemplary deep learning based measurement data detection method of the present invention;
FIG. 4 is a graph comparing RNN and LSTM models for long sequence prediction in accordance with an example of the present invention;
FIG. 5 is a block diagram of an exemplary electronic computer platform assembly in accordance with the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
As shown in FIG. 1 to FIG. 5, the present embodiment provides a deep-learning-based method for rapidly detecting abnormal data in power dispatching, comprising the following steps:
S1, after data is acquired, threshold model processing is carried out;
S2, for abnormal data including but not limited to data that the threshold model cannot judge, a deep-learning-based measurement data detection method is used for detection:
S2.1, establishing a model: constructing a supervised deep learning model, which comprises a telemetry data feature model, a telemetry data classification model, and a mutual verification process between the two;
S2.2, data detection: with the constructed deep learning model as the core, the telemetry data is examined to find the abnormal measurement data in it;
S3, constructing a long short-term memory (LSTM) model to detect abnormal data, which addresses the vanishing gradient and exploding gradient problems in training on long time series;
S4, checking data rationality based on the CIM model: according to the characteristics of the power grid dispatching service and based on the CIM model, the method monitors and detects indexes including but not limited to the following seven: total station voltage loss, bus voltage abnormality, main transformer input/output imbalance, remote signaling signal abnormality, long-distance reactive power transmission, data not refreshing, and channel interruption.
In this embodiment, in S1 the acquired data mainly refers to telemetry data, the most important of the "four remote" data types in power dispatching automation. Telemetry data is real-time power grid operation data collected through the telemetry function of telecontrol devices in the power system and transmitted by those devices to the power dispatching master station, where it serves as the basic input of the power dispatching automation system for monitoring the operating state of the power system.
It should be noted that, because the power system is a continuous, uninterrupted system in which electricity is generated and consumed simultaneously, power grid operation data is continuous during power supply, and the telemetry data collected by the telecontrol device is time series data.
The "four remote" functions are remote signaling, telemetry, remote control and remote regulation, realized mainly through cooperation between the remote terminal unit (RTU) of a substation or power plant and the power dispatching master station; wherein:
remote signaling mainly collects and transmits relay protection action information, circuit breaker status information and the like of the power system; telemetry mainly collects and transmits real-time operating data of the power system; remote control mainly sends commands from the dispatching center to change the state of operating equipment; remote regulation mainly sends commands from the dispatching center to remotely adjust the operating parameters of a power plant or substation.
In this embodiment, as shown in FIG. 3, in S2.1, the specific method for establishing the model includes the following steps:
S2.1.1, establishing a telemetry data feature model:
S2.1.1.1, acquiring historical telemetry data;
S2.1.1.2, preprocessing the historical telemetry data;
S2.1.1.3, extracting data features from the preprocessed data;
S2.1.1.4, constructing the telemetry data feature model from the extracted data features;
S2.1.2, establishing a telemetry data classification model:
S2.1.2.1, acquiring telemetry data after business processing;
S2.1.2.2, preprocessing the business-processed telemetry data;
S2.1.2.3, extracting data set features from the preprocessed data;
S2.1.2.4, constructing the telemetry data classification model from the extracted data set features;
S2.1.3, mutual verification of the telemetry data feature model and the telemetry data classification model:
S2.1.3.1, constructing a deep learning model;
S2.1.3.2, with the deep learning model as the core, performing data feature extraction, data set feature extraction, and business scenario training and verification operations respectively;
S2.1.3.3, the telemetry data feature model verifies the telemetry data classification model through the business scenario training and verification operations.
Further, in S2.2, the specific method for detecting data includes the following steps:
S2.2.1, acquiring real-time telemetry data;
S2.2.2, preprocessing the real-time telemetry data;
S2.2.3, importing the preprocessed real-time telemetry data into the telemetry data feature model;
S2.2.4, performing data feature analysis;
S2.2.5, performing classification feature analysis;
S2.2.6, importing the classification feature analysis result into the telemetry data classification model;
S2.2.7, outputting the abnormality detection result after analysis and judgment by the model.
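The order of the S2.2 steps can be illustrated with the following skeleton. It only shows how the stages chain together; the three stand-in functions are toy assumptions and are not the trained telemetry data feature model or classification model of the embodiment.

```python
def detect_anomalies(raw, preprocess, feature_model, classification_model):
    """Run the S2.2 steps in order and return one flag per extracted feature."""
    cleaned = preprocess(raw)              # S2.2.2
    features = feature_model(cleaned)      # S2.2.3 to S2.2.5
    return classification_model(features)  # S2.2.6 and S2.2.7


# Toy stand-ins for the preprocessing step and the two models (assumptions).
def drop_missing(samples):
    return [s for s in samples if s is not None]


def first_differences(samples):
    return [b - a for a, b in zip(samples, samples[1:])]


def flag_large_steps(features, limit=5.0):
    return [abs(f) > limit for f in features]


readings = [10.0, None, 10.5, 30.0, 10.8]  # S2.2.1: real-time telemetry data
print(detect_anomalies(readings, drop_missing, first_differences, flag_large_steps))
# prints [False, True, True]
```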
In this embodiment, in S2.1.1.2, S2.1.2.2 and S2.2.2, the data preprocessing methods include, but are not limited to, data cleaning, data dimensionality reduction and data normalization; wherein:
data cleaning fills in missing values in the data set to be processed, eliminates noise, removes invalid data and resolves inconsistencies in the data set;
data dimensionality reduction reduces the dimensionality of the data set through methods such as data compression and data conversion, so as to speed up data mining, calculation and analysis;
data normalization scales the data so that it falls within a small specified interval.
Data preprocessing is needed for the following reason: owing to the stability of the measuring equipment and systems, human misoperation, flaws in the data acquisition process and similar causes, the collected time series data often has many problems, such as missing, erroneous, duplicated and redundant data. If the data set is not preprocessed before data analysis, the analysis task may fail or yield results that do not match the actual situation. Therefore, before the data analysis task begins, priority should be given to detecting and correcting data quality problems. The main problems in the data are missing values and noise values, and data cleaning mainly deals with these two cases.
Further, the data cleaning methods include, but are not limited to, missing value completion and noise value processing; wherein:
missing value completion fills missing values using a fixed-value substitution method, an interpolation method or a regression method;
the fixed-value substitution method mainly substitutes the mean, median or mode; the algorithm is simple and easy to implement;
however, given the periodicity, continuity and other characteristics of the telemetry curve, the fixed-value substitution method cannot represent missing data with a reasonable fixed value, or is suitable only for a single missing data point; its application value is low, so it is abandoned;
interpolation methods include, but are not limited to, linear interpolation, polynomial interpolation and spline interpolation; the algorithms are simple but involve a certain amount of computation;
in noise value processing, identifying the noise values is the most important task; identification methods include, but are not limited to, special value identification, threshold judgment and identification based on a deep learning network.
Specifically, the formulas of the interpolation methods are as follows:
spline interpolation method:
second-order spline curve: $f(x) = ax^2 + bx + c$;
third-order spline curve: $f(x) = ax^3 + bx^2 + cx + d$;
the spline interpolation method is suitable for completing missing values when little data is missing; in application, the interpolated data fluctuates more strongly as the amount of missing data increases;
the polynomial interpolation method uses the Lagrange algorithm to calculate the missing value; for known points $(x_0, y_0), \dots, (x_n, y_n)$ the formula is:
$$L(x) = \sum_{i=0}^{n} y_i \prod_{j=0,\, j \neq i}^{n} \frac{x - x_j}{x_i - x_j}$$
after expansion, a polynomial is obtained:
$$L(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$$
similarly, the interpolated data fluctuates strongly as the amount of missing data increases; the Lagrange interpolation algorithm performs much worse than the spline interpolation method;
the linear interpolation method mainly comprises step interpolation (nearest, zero) and linear interpolation (linear), with the formula: $f(x) = ax + b$;
when processing missing values, linear interpolation fits mainly by slope and intercept, and does not produce large fluctuations even when much data is missing.
In addition, it should be noted that when data cleaning is carried out on actual data, the method can be chosen according to how much data is missing. When little data is missing, interpolation with the spline curve algorithm yields the data set with the smallest error. Because the trend of the power curve is not known in advance, linear interpolation is recommended when much data is missing;
however, when data is missing over a large span, interpolation cannot produce reasonable values; in this case the curve model needs to be fitted with methods such as regression analysis, Bayesian methods and neural networks to obtain the most likely values for filling.
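Two of the interpolation methods discussed above, linear interpolation and Lagrange polynomial interpolation, can be sketched in pure Python as follows. The function names are hypothetical, and representing a missing sample as None is an assumption of this example.

```python
def linear_fill(values):
    """Fill None gaps by linear interpolation between the nearest known neighbours."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled  # leading and trailing gaps are left as None


def lagrange(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


print(linear_fill([1.0, None, None, 4.0]))             # prints [1.0, 2.0, 3.0, 4.0]
print(lagrange([0.0, 1.0, 2.0], [0.0, 1.0, 4.0], 3.0))  # prints 9.0
```

The second call extrapolates the parabola through three points; with many points or long gaps a high-degree polynomial of this kind can oscillate strongly, which is consistent with the fluctuation noted above.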
Specifically, in noise value processing, the three noise value identification methods operate as follows:
special value identification: in telemetry data the most special value is "0", which may represent a genuine value of 0 or a value produced by an abnormal state of the equipment or telecontrol device. It can be judged in combination with other measuring points or remote signaling signals related to the measuring point, either by calculating a theoretical value with an electrical formula or by inferring whether the "0" value is reasonable from the open/closed state of the upstream and downstream remote signaling signals. For example, if the active power of a circuit breaker reads "0", the corresponding electrical formula can be used: P = I × U = U^2 ÷ R;
the value is verified against the current and voltage measurements of the circuit breaker;
it can also be verified by checking the states of the circuit breaker and the disconnecting switch.
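A minimal sketch of the "0" value check follows, using the simplified relation P = I × U from the text together with the breaker state. The function name, the units (A × kV = kW) and the 1 kW tolerance are assumptions for illustration.

```python
def zero_reading_plausible(current_a, voltage_kv, breaker_closed, tol_kw=1.0):
    """Return True if a reported active power of 0 agrees with the related measurements."""
    if not breaker_closed:
        return True  # an open breaker carries no load, so zero power is expected
    expected_kw = current_a * voltage_kv  # A x kV = kW, following P = I x U
    return abs(expected_kw) <= tol_kw


print(zero_reading_plausible(0.0, 35.0, breaker_closed=True))     # prints True
print(zero_reading_plausible(120.0, 35.0, breaker_closed=True))   # prints False
print(zero_reading_plausible(120.0, 35.0, breaker_closed=False))  # prints True
```

The sketch only judges the zero power reading; a non-zero current across an open breaker, as in the last call, would itself merit a separate check.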
threshold judgment: mainly used to check for excessively large and excessively small values; for example, a value that exceeds the historical maximum or falls below the historical minimum of the measuring point for the corresponding time period by a certain percentage can be regarded as suspected abnormal data.
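The historical-range check can be sketched as below. The function name and the 10% margin are assumptions; the text does not specify a percentage.

```python
def suspected_abnormal(value, history, margin_pct=10.0):
    """Flag a value lying outside the historical range widened by margin_pct percent."""
    hi, lo = max(history), min(history)
    upper = hi + abs(hi) * margin_pct / 100.0
    lower = lo - abs(lo) * margin_pct / 100.0
    return value > upper or value < lower


history = [50.0, 55.0, 60.0]  # hypothetical values for the same time period on earlier days
print(suspected_abnormal(70.0, history))  # prints True  (above 66.0)
print(suspected_abnormal(62.0, history))  # prints False
print(suspected_abnormal(40.0, history))  # prints True  (below 45.0)
```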
identification based on a deep learning network: an evaluation model can be built more precisely to check the values, so the judgment result is more accurate.
Further, data dimensionality reduction is needed for the following reason: with the arrival of the big data wave, data analysis often faces excessively large data sets, and massive data sets slow down data mining to some extent. Performing high-dimensional computational statistics on a large-scale data set is catastrophic for computing performance.
Data compression proceeds as follows: at the storage level, the data sets are compressed by minute. The compression is lossy: a small amount of information in the original data sets is lost, but the number of data sets is greatly reduced. At the computation level, the data of each measuring point are sorted by time. A single measuring point has multiple signal values, and these signal values are filtered. Then, by combining business knowledge with big data, most of the remote signaling data can be excluded from the correlation analysis, because the remote signaling data reflects the characteristics of the remote signaling data. At a later stage, the relationships among the switches between the buses and the main transformers, and the downstream relationships, are taken into account in the association analysis.
Further, the data normalization method includes, but is not limited to, min-max normalization and Z-score normalization, which are calculated by the following formulas:
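min-max normalization:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
Z-score normalization:
$$x' = \frac{x - \mu}{\sigma}$$
where $\mu$ is the mean and $\sigma$ is the standard deviation of the data:
$$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2}$$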
Because the quantities in power operation data are measured in different units, the values need to be normalized so that all features of the measured data are mapped to the same scale. The values are mapped to a given interval through a function transformation, which prevents some features from dominating merely because of their different dimensions.
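A minimal sketch of the two normalization methods named above, using only the Python standard library. The function names are illustrative, the population standard deviation is an assumption (the source does not say which estimator is used), and the handling of constant series is a design choice made here.

```python
from statistics import mean, pstdev


def min_max(values):
    """Map values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no scale information; return zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Centre values on the mean and scale by the population std. dev."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]
```

Min-max scaling keeps the result inside a fixed interval but is sensitive to extreme values; Z-score scaling is unbounded but less affected by a single outlier.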
Furthermore, the relevance between the power grid dispatching operation data and the power grid operation mode is very high; due to the change of the operation mode of the power grid, the operation state of the electrical equipment is changed, and therefore, the related measurement data is changed; furthermore, in S2.1.1.4 and S2.1.2.4, in order to avoid misleading to a model training result when directly providing data features and data set features extracted from S2.1.1.3 and S2.1.2.3 to a deep learning algorithm for model training or model verification, it is necessary to identify boundaries before and after changing a power grid operation mode by analyzing the data features, that is, to analyze data; analysis methods for boundary identification include, but are not limited to, cluster analysis and similarity analysis; wherein:
Clustering analysis: an algorithm groups data with similar characteristics, drawn from a large volume of raw data, into one class, for example K-means clustering;
the K-means clustering algorithm and its principle are as follows:
randomly select k centroid points $\mu_1, \mu_2, \ldots, \mu_k \in \mathbb{R}^n$;
for each sample $x^{(i)}$, compute the class it belongs to:
$$c^{(i)} := \arg\min_{j} \lVert x^{(i)} - \mu_j \rVert^2$$
for each class $j$, recompute its centroid:
$$\mu_j := \frac{\sum_{i=1}^{m} 1\{c^{(i)} = j\}\, x^{(i)}}{\sum_{i=1}^{m} 1\{c^{(i)} = j\}}$$
The above two steps are repeated until convergence, yielding the final centroid points.
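The K-means procedure described above can be sketched as follows. This is a plain standard-library sketch under stated assumptions: samples are tuples of floats, the initial centroids are drawn from the samples themselves, and the `iters` and `seed` parameters are illustrative.

```python
import math
import random


def k_means(points, k, iters=100, seed=0):
    """Return (centroids, labels) for a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, c in zip(points, labels) if c == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids, labels
```

Because the result depends on the random initial centroids, a production implementation would usually run several restarts and keep the best partition.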
Similarity analysis: the characteristics of the samples to be evaluated are compared by calculating the distance between them. If the distance is small, the similarity is large; if the distance is large, the similarity is small.
The most common distance measures for calculating similarity are the Euclidean distance, the Manhattan distance, the cosine of the angle between vectors, and the Pearson correlation coefficient. Because real-time power operation data are dense and continuous, the Euclidean distance is a suitable calculation method; its algorithm and principle are as follows:
distance between two points $a(x_1, y_1)$ and $b(x_2, y_2)$ on the two-dimensional plane:
$$d_{12} = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$
distance between two points $a(x_1, y_1, z_1)$ and $b(x_2, y_2, z_2)$ in three-dimensional space:
$$d_{12} = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}$$
distance between two points $a(x_{11}, x_{12}, \ldots, x_{1n})$ and $b(x_{21}, x_{22}, \ldots, x_{2n})$ in n-dimensional space:
$$d_{12} = \sqrt{\sum_{k=1}^{n} (x_{1k} - x_{2k})^2}$$
or, equivalently, in vector form:
$$d_{12} = \sqrt{(a - b)(a - b)^{T}}$$
through comparative analysis, the following results can be obtained: clustering analysis and similarity analysis essentially calculate the distance between two vectors. For the real-time power operation data, the data change generated after the operation mode is converted can reflect the distance between two vectors of the data characteristics before and after the operation mode is converted, and the boundary is identified through the distance.
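The boundary identification idea above can be illustrated with a short sketch: consecutive feature vectors are compared by Euclidean distance, and a jump larger than a threshold marks a candidate change of operation mode. The function name, the per-window feature layout, and the threshold value are assumptions made for illustration.

```python
import math


def find_boundaries(feature_vectors, threshold):
    """Return indices i where window i differs strongly from window i-1."""
    return [
        i
        for i in range(1, len(feature_vectors))
        if math.dist(feature_vectors[i - 1], feature_vectors[i]) > threshold
    ]


# Hypothetical per-window features, e.g. (mean, standard deviation).
features = [(1.0, 0.1), (1.1, 0.1), (5.0, 0.4), (5.1, 0.4)]
boundaries = find_boundaries(features, threshold=1.0)
```

Here the only large jump is between the second and third windows, so index 2 is reported as a boundary.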
Furthermore, after the power grid operation data are received, calculation can be performed according to indexes in three aspects of concentration trend, dispersion degree and distribution form respectively, statistical characteristics of the data are formed, operation data dimensionality can be effectively reduced by using the statistical characteristics, and the operation speed of data analysis is improved. And clustering analysis and similarity analysis are carried out on the basis of the characteristic data, so that nodes with changed operation modes can be effectively identified. The nodes can be used as the boundary of the operation data, and the boundary data is avoided when the model is trained. And the data is cut according to the boundary, and the cut data is applied so as to obtain a more accurate data model.
For historical data, data lost because of acquisition, network or other problems is filled in by interpolation. Based on testing, the following is suggested: when a small amount of data is missing, use spline interpolation to fill it in; when more data are missing, use linear interpolation; when data are missing over a large area, filter out the data for that day.
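As an illustration of the linear interpolation case only, the following sketch fills interior gaps (marked `None`) in an evenly sampled series. The function name and the `None` convention are assumptions; spline interpolation, suggested above for small gaps, is not shown.

```python
def fill_linear(series):
    """Fill interior None gaps by linear interpolation between neighbours.

    Leading and trailing gaps have only one known neighbour and are left
    as None.
    """
    filled = list(series)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled
```

For example, `fill_linear([1.0, None, None, 4.0])` places the two missing samples on the straight line between 1.0 and 4.0.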
In this embodiment, in S3, the specific method for detecting abnormal data through the long short-term memory model is as follows: the deep learning-based measurement data detection method uses a long short-term memory model to predict future data, takes the predicted data as a reference, evaluates the deviation of the data acquired at the corresponding moment from that reference, and detects the measurement data in combination with a threshold. With this method, massive telemetry data, for example tens of thousands of telemetry points, can be detected in real time by computer, abnormal data can be found promptly, and the problem of untimely manual discovery is effectively solved.
The long short-term memory (LSTM) model is a special RNN. Briefly, LSTM introduces the concept of 'gates': by controlling the gates, it can selectively 'forget' certain inputs and 'mask' certain outputs so that they do not affect the weight update of the next level, and it performs better than an ordinary RNN on longer sequences. FIG. 4 compares the RNN and LSTM models in long-sequence prediction; in long-sequence model training and prediction, LSTM effectively solves the vanishing gradient problem of RNNs.
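For reference, the gate mechanism can be written out using the commonly used LSTM formulation. The source does not give these equations, so the symbols below (weights $W$, biases $b$, input $x_t$, hidden state $h_t$, cell state $c_t$) are the conventional ones and are an assumption about the variant intended.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{(input gate)} \\
\tilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) && \text{(candidate state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)} \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{(output gate)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```

The additive update of $c_t$ is what lets gradients pass through many time steps without vanishing as quickly as in a plain RNN.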
Furthermore, the preprocessed normal-operation-mode data obtained after data characteristic analysis are used as follows: 90% of the data is used for model training and 10% is used as verification data, so that the loss of the trained model, and hence the effect of the model, is verified. The training process is the same as for an RNN and is not described here.
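The 90%/10% split and the predict-then-compare check can be sketched as below. The LSTM itself is omitted: `predicted` stands for whatever the trained model outputs. The function names, the chronological (rather than random) split, and the absolute-deviation rule are assumptions.

```python
def chronological_split(samples, train_fraction=0.9):
    """Split a time-ordered list into training and verification parts."""
    cut = round(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


def detect_anomalies(predicted, measured, threshold):
    """Return indices where the measurement deviates from the prediction."""
    return [
        i
        for i, (p, m) in enumerate(zip(predicted, measured))
        if abs(m - p) > threshold
    ]
```

For example, with predictions `[10.0, 10.2, 10.1]`, measurements `[10.1, 14.0, 10.0]` and a threshold of 1.0, only the second sample is flagged.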
In this embodiment, in S4, besides abnormal measurements, the rationality of the power grid operation data also deserves attention. Data rationality detection based on the CIM model covers the following items:
(1) Total-station voltage loss: according to the characteristics of total-station voltage loss, a user-defined total-station voltage loss model can be established through the CIM, and the data are detected using the model. By definition, only the measuring points in the established model need to be checked; if these measuring points all read 0 at the same time, total-station voltage loss can be determined.
Total-station voltage loss is generally either passive or active. The passive case results from an accident or a fault of critical equipment in the station; the active case is a total-station power outage caused by scheduled equipment maintenance in the station. When maintenance is carried out in the station, i.e. in the active outage case, tags are placed in the system as required; after tagging, each measuring point is blocked and reads 0, so the two cases can be distinguished by detecting the tag signals;
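A minimal sketch of the total-station voltage loss check described above, assuming the measuring points of the user-defined model are available as a list of values and the tag state as a boolean. The function name, the return labels and the tolerance `eps` are illustrative.

```python
def station_voltage_loss(values, tagged, eps=1e-6):
    """Classify a station as 'normal', 'planned_outage' or 'voltage_loss'."""
    all_zero = bool(values) and all(abs(v) <= eps for v in values)
    if not all_zero:
        return "normal"
    # All points read 0: a tag signal indicates scheduled maintenance.
    return "planned_outage" if tagged else "voltage_loss"
```

Only the untagged all-zero case is reported as an abnormal (passive) voltage loss.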
(2) Abnormal bus voltage: this mainly means that the actual voltage of the bus deviates too far from the rated voltage. According to the dispatching regulations, the permitted deviation of the bus voltage from its rated voltage is -10% to +7%, so whether a measuring point exceeds the specified threshold can be detected in real time. Out-of-limit values are recorded and the voltage qualification rate is calculated periodically. The voltage qualification rate formula is as follows:
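$$\text{voltage qualification rate} = \left(1 - \frac{\text{cumulative time the voltage is out of limits}}{\text{total monitored time}}\right) \times 100\%$$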
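The bus voltage check can be sketched as follows, using the -10% to +7% band stated above. The function names are illustrative, and the qualification rate is computed here as the share of in-limit samples, which matches a time-based rate only under the assumption that samples are evenly spaced.

```python
def voltage_in_limits(measured_kv, rated_kv, lower=-0.10, upper=0.07):
    """Check whether the relative deviation from rated voltage is allowed."""
    deviation = (measured_kv - rated_kv) / rated_kv
    return lower <= deviation <= upper


def qualification_rate(samples_kv, rated_kv):
    """Percentage of evenly spaced samples that are within limits."""
    ok = sum(voltage_in_limits(v, rated_kv) for v in samples_kv)
    return 100.0 * ok / len(samples_kv)
```

For a 110 kV bus, readings of 120 kV and 98 kV fall outside the band, while 112 kV is accepted.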
(3) Main transformer input/output imbalance: this means that the power supply side and the load side of the main transformer are unbalanced, that is, power generation and power consumption are unbalanced. According to this characteristic, a user-defined detection model can be established through the CIM, and the data are detected using the model.
The main transformer generally has three windings, divided by voltage level into a high-voltage side, a medium-voltage side and a low-voltage side. The high-voltage side is generally the power supply side, i.e. the input side, and its active power is denoted $P_i$; the medium-voltage and low-voltage sides are the load side, i.e. the output side, and their active powers are denoted $P_{o1}$ and $P_{o2}$. In principle, the following formula holds: $P_i = P_{o1} + P_{o2}$
Since the current in the power system is directional, and this direction is represented by a sign (±) in the scheduling automation system, it follows that: $P_i + P_{o1} + P_{o2} = 0$.
Theoretically, the main transformer is considered normal if the signed active power values of all its sides sum to 0, but because of errors in the collectors or the transformer equipment, the sum of the input and output values is not exactly 0 in practice. Therefore, an allowable deviation value must be set in order to check the input/output data. The deviation value differs between main transformer types and cannot be determined empirically, so it is handled using big data analysis. For example, one month of data for each measuring point (about 50,000 samples per measuring point) is selected for observation and the error data are normalized; the interval in which the points are densest may then be set as the allowable deviation range. Alternatively, a model can be fitted to the data in the densest region, the allowable deviation range can be calculated with the model, and the result is used as the threshold.
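The balance check and a data-driven tolerance can be sketched as below. The source only says that the densest interval of historical errors is used; taking a quantile of the absolute residuals is a stand-in chosen here for illustration, and the function names and the `coverage` parameter are assumptions.

```python
import math


def transformer_balanced(p_in, p_out1, p_out2, tolerance):
    """Signed active powers of all sides should sum to roughly zero."""
    return abs(p_in + p_out1 + p_out2) <= tolerance


def tolerance_from_history(residuals, coverage=0.95):
    """Pick the absolute residual below which `coverage` of history lies."""
    if not residuals:
        raise ValueError("at least one historical residual is required")
    ordered = sorted(abs(r) for r in residuals)
    index = max(0, math.ceil(coverage * len(ordered)) - 1)
    return ordered[index]
```

A tolerance derived this way ignores the rare large residuals, so that they are reported as imbalances instead of widening the threshold.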
(4) Remote signaling signal abnormality: this mainly refers to the switch state in the OCS system being inconsistent with the actual state in the field. During stable operation of the power grid, once a remote signaling signal is abnormal, the monitoring system or the dispatching master station cannot correctly receive the remote signaling information of the station, which can affect the correct judgment of information such as circuit breaker, disconnecting switch and protection actions. Under normal conditions, if the received remote signaling state is open, the active power value of the measuring point associated with that signal is infinitely close to 0.
According to the principle, a user-defined detection model can be established through CIM, mainly aiming at circuit breaker modeling, nearest isolating switches in front of and behind the circuit breaker are searched through topology, and data are detected by utilizing the model.
After the model establishes the relationship between the remote signaling signal and the related measurement, the received operation data can be detected directly in real time through the model. Also because of the collector, when the switch is opened the corresponding telemetry value is not exactly 0 but infinitely close to it, which is very small compared with normal active power, so the threshold can be set from experience. As more data sets are collected, the threshold can also be determined from the data samples recorded with the switch open, using the method described above.
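A minimal sketch of the consistency rule above: when the switch is reported open, the associated active power should be close to 0. The function name and the threshold of 0.5 MW are illustrative assumptions; as noted above, the threshold would be set from experience or from data.

```python
def telesignal_consistent(switch_open, active_power_mw, zero_threshold=0.5):
    """Check a remote signaling state against its associated telemetry."""
    if switch_open:
        # An open switch should carry (almost) no active power.
        return abs(active_power_mw) <= zero_threshold
    # A closed switch may legitimately carry any power, including 0.
    return True
```

An open switch reported together with, say, 25 MW of active power is flagged as inconsistent.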
(5) Long-distance reactive power transmission: this means that the reactive power reading is large while the active power of the outgoing line switch at the end is 0. Long-distance reactive power transmission is not allowed in power transmission, because reactive power is merely an exchange of energy between electrical components, and a large amount of energy is lost when it is transmitted over too long a distance. Therefore, the power grid does not transmit reactive power over long distances but follows the principle of local compensation.
According to the definition of the long-distance reactive power transmission, active and reactive power measurement points corresponding to all outgoing switches can be obtained through the CIM, a detection model is established, and data are detected by the model.
After the model establishes the relation of the P, Q measuring points corresponding to each outlet switch, the received operation data can be detected in real time directly through the model.
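The P/Q relation for each outgoing switch can be checked with a rule like the following sketch. The function name and both thresholds (`p_zero`, `q_limit`) are illustrative assumptions; the source does not state numeric limits.

```python
def long_distance_reactive(p_mw, q_mvar, p_zero=0.5, q_limit=5.0):
    """Flag an outgoing switch with near-zero P but large Q."""
    return abs(p_mw) <= p_zero and abs(q_mvar) > q_limit
```

A line carrying both active and reactive power is not flagged; only the combination of near-zero active power and large reactive power is.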
(6) Data not refreshed: this mainly means that a measuring point does not change within a certain period of time. When a transformer substation operates normally, it has several data transmission channels. When one channel or a piece of measuring equipment becomes abnormal, the fault shows up directly at the measured-value level as a value that remains unchanged, so the measured value can be monitored over the period to determine that the measurement data is abnormal because it is not refreshing. According to business requirements, the period can be set to 10 minutes: whether each measuring point changes within 10 minutes is monitored automatically, and if the data of a measuring point remains unchanged for 10 minutes, a message alarm is raised for that point to remind the operator on duty to investigate the measuring equipment or the associated channel.
(7) Channel interruption: this mainly means that all data transmission channels in the transformer substation are interrupted, so that none of the data in the station is refreshed. This index can be built on the data-not-refreshed index: if the data of all measuring points in the station are not refreshed, channel interruption can be determined, and the operator on duty needs to be reminded to investigate the measuring equipment or the associated channels.
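Items (6) and (7) can be sketched together, since the channel interruption check is built on the not-refreshed check. The data layout (time-sorted `(timestamp_s, value)` pairs per measuring point), the function names and the requirement that the history span the full window are assumptions; the 600-second window corresponds to the 10-minute period mentioned above.

```python
def is_stale(samples, window_s=600):
    """True when the value has not changed during the last `window_s` s.

    `samples` is a list of (timestamp_s, value) pairs sorted by time.
    """
    if not samples:
        return False
    latest_t, latest_v = samples[-1]
    # Require that the history actually spans the whole window.
    covers_window = latest_t - samples[0][0] >= window_s
    in_window = [v for t, v in samples if latest_t - t <= window_s]
    return covers_window and all(v == latest_v for v in in_window)


def channel_interrupted(station_points):
    """True when every measuring point of the station is stale."""
    return bool(station_points) and all(
        is_stale(s) for s in station_points.values()
    )
```

A single stale point raises a data-not-refreshed alarm, while a station in which every point is stale is reported as a channel interruption.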
As shown in fig. 5, the embodiment further provides an operating platform device of a deep learning-based power scheduling abnormal data rapid detection method, where the operating platform device includes a processor, a memory, and a computer program stored in the memory and running on the processor.
The processor comprises one or more processing cores, the processor is connected with the memory through the bus, the memory is used for storing program instructions, and the steps of the power scheduling abnormal data rapid detection method based on deep learning are realized when the processor executes the program instructions in the memory.
Alternatively, the memory may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
In addition, the invention also provides a computer readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the steps of the above method for quickly detecting power scheduling abnormal data based on deep learning are realized.
Optionally, the present invention further provides a computer program product containing instructions, which when run on a computer, causes the computer to execute the steps of the above-mentioned aspects of the fast detection method for power scheduling abnormal data based on deep learning.
It will be understood by those skilled in the art that the processes for implementing all or part of the steps of the above embodiments may be implemented by hardware, or may be implemented by a program for instructing relevant hardware, and the program may be stored in a computer readable storage medium, where the above mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The foregoing shows and describes the general principles, principal features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, and the preferred embodiments of the present invention are described in the above embodiments and the description, and are not intended to limit the present invention. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (10)

1. The method for quickly detecting the abnormal data of the power dispatching based on the deep learning is characterized by comprising the following steps:
S1, after data are acquired, threshold model processing is carried out;
S2, for abnormal data including but not limited to data that cannot be judged by the threshold model, a measurement data detection method based on deep learning is adopted for detection:
S2.1, establishing a model: constructing a deep learning model with supervised learning, which comprises a telemetering data characteristic model, a telemetering data classification model and a mutual verification process between the telemetering data characteristic model and the telemetering data classification model;
S2.2, data detection: taking the built deep learning model as the core, mainly detecting the telemetering data to find abnormal measurement data;
S3, constructing a long short-term memory model to detect abnormal data, and solving the problems of gradient vanishing and gradient explosion in the long time-series training process;
S4, detecting data rationality based on the CIM model: according to the characteristics of the power grid dispatching service, based on the CIM model, attention is paid to and detection is performed on 7 indexes including but not limited to total station voltage loss, bus voltage abnormity, main transformer input and output unbalance, remote signaling signal abnormity, long-distance reactive power transmission, data non-refreshing and channel interruption.
2. The method for rapidly detecting the abnormal data of the power dispatching based on the deep learning as claimed in claim 1, wherein: in the step S1, the acquired data mainly refers to telemetry data in "four remote" data which is the most important data in power scheduling automation, and the telemetry data is real-time data of power grid operation acquired by using a telemetry function of a telemechanical device in a power system and is transmitted to a power scheduling master station through the telemechanical device, and is used as basic data input of a power scheduling automation system to monitor the operation state of the power system.
3. The method for rapidly detecting the abnormal data of the power dispatching based on the deep learning as claimed in claim 1, wherein: in S2.1, the specific method for model establishment includes the following steps:
S2.1.1, establishing a telemetering data characteristic model:
S2.1.1.1, acquiring historical telemetering data;
S2.1.1.2, preprocessing the historical telemetering data;
S2.1.1.3, extracting data characteristics of the preprocessed data;
S2.1.1.4, constructing a telemetering data characteristic model according to the extracted data characteristics;
S2.1.2, establishing a telemetering data classification model:
S2.1.2.1, obtaining telemetering data after service processing;
S2.1.2.2, preprocessing the telemetering data after the service processing;
S2.1.2.3, performing data set characteristic extraction on the preprocessed data;
S2.1.2.4, constructing a telemetering data classification model according to the extracted data set characteristics;
S2.1.3, the telemetering data characteristic model and the telemetering data classification model verify each other:
S2.1.3.1, constructing a deep learning model;
S2.1.3.2, taking the deep learning model as the core, and respectively performing data feature extraction, data set feature extraction, service scene training and verification operation;
S2.1.3.3, the telemetering data characteristic model verifies the telemetering data classification model through service scene training and verification operations.
4. The deep learning-based power scheduling abnormal data rapid detection method according to claim 3, characterized in that: in S2.2, the specific method for data detection includes the following steps:
S2.2.1, acquiring real-time telemetering data;
S2.2.2, preprocessing the real-time telemetering data;
S2.2.3, importing the preprocessed real-time telemetering data into the telemetering data characteristic model;
S2.2.4, performing data characteristic analysis;
S2.2.5, performing classification characteristic analysis;
S2.2.6, importing the classification characteristic analysis result into the telemetering data classification model;
S2.2.7, finally outputting an abnormal detection result through analysis and judgment of the model.
5. The deep learning-based power scheduling anomaly data rapid detection method according to claim 4, characterized in that: in S2.1.1.2, S2.1.2.2 and S2.2.2, the method for preprocessing data includes but is not limited to data cleaning, data dimension reduction and data normalization; wherein:
the data cleaning is to fill in missing values of data in a data set to be processed, eliminate noise in the data, remove invalid data and solve the problem of data inconsistency in the data set so as to achieve the purpose of data cleaning;
the data dimensionality reduction is to reduce the dimensionality of a data set by methods such as data compression, data conversion and the like so as to improve the speed of data mining, calculation and analysis;
data normalization scales the data to fall within a small specified interval.
6. The deep learning-based power scheduling abnormal data rapid detection method according to claim 5, characterized in that: the data cleaning method comprises but is not limited to missing value completion and noise value processing; wherein:
completing missing values, namely filling the missing values by using a fixed value replacement method, an interpolation method and a regression method;
in the fixed value replacement method, the mean value, the median and the mode are mainly adopted for replacement;
interpolation methods include, but are not limited to, linear interpolation, polynomial interpolation, spline interpolation;
in the noise value processing, the identification of the noise value is the most important, and the identification method includes, but is not limited to, special value identification, threshold value judgment and an identification method based on a deep learning network.
7. The deep learning-based power scheduling anomaly data rapid detection method according to claim 6, characterized in that: in the interpolation method, the algorithm formula of each interpolation method comprises the following steps:
spline interpolation method:
second-order spline curve: $f(x) = ax^2 + bx + c$;
third-order spline curve: $f(x) = ax^3 + bx^2 + cx + d$;
The polynomial interpolation method utilizes a Lagrange algorithm to calculate the missing value, and the algorithm formula is as follows:
Figure FDA0003797524440000031
after expansion, a polynomial is obtained:
Figure FDA0003797524440000032
the linear interpolation method mainly comprises step interpolation and linear interpolation, and the algorithm formula is as follows: $f(x) = ax + b$.
8. The deep learning-based power scheduling anomaly data rapid detection method according to claim 5, characterized in that: the data normalization method includes, but is not limited to, min-max normalization and Z-score normalization, which are respectively calculated by the following formula:
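min-max normalization:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
Z-score normalization:
$$x' = \frac{x - \mu}{\sigma}$$
where $\mu$ is the mean and $\sigma$ is the standard deviation of the data:
$$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2}$$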
9. The deep learning-based power scheduling anomaly data rapid detection method according to claim 3, characterized in that: in S2.1.1.4 and S2.1.2.4, in order to avoid misleading the model training result when the data features and data set features extracted in S2.1.1.3 and S2.1.2.3 are provided directly to a deep learning algorithm for model training or model verification, the boundary before and after a change of the power grid operation mode needs to be identified by analyzing the data features, that is, the data needs to be analyzed; analysis methods for boundary identification include, but are not limited to, cluster analysis and similarity analysis; wherein:
clustering analysis, namely clustering data with similar characteristics in a large pile of original data into a class through an algorithm;
similarity analysis is to calculate the distance between the characteristics of the sample to be evaluated by comparing the characteristics of the sample.
10. The method for rapidly detecting the abnormal data of the power dispatching based on the deep learning as claimed in claim 1, wherein: in S3, the specific method for detecting abnormal data through the long short-term memory model is as follows: the deep learning-based measurement data detection method uses a long short-term memory model to predict future data, takes the predicted data as a reference, evaluates the deviation of the data acquired at the corresponding moment from the reference data, and detects the measurement data in combination with a threshold.
CN202210972874.8A 2022-08-15 2022-08-15 Power dispatching abnormal data rapid detection method based on deep learning Pending CN115357462A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210972874.8A CN115357462A (en) 2022-08-15 2022-08-15 Power dispatching abnormal data rapid detection method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210972874.8A CN115357462A (en) 2022-08-15 2022-08-15 Power dispatching abnormal data rapid detection method based on deep learning

Publications (1)

Publication Number Publication Date
CN115357462A true CN115357462A (en) 2022-11-18

Family

ID=84033632

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210972874.8A Pending CN115357462A (en) 2022-08-15 2022-08-15 Power dispatching abnormal data rapid detection method based on deep learning

Country Status (1)

Country Link
CN (1) CN115357462A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination