CN114528755A - Power equipment fault detection model based on attention mechanism combined with GRU


Info

Publication number
CN114528755A
Authority
CN
China
Legal status: Pending
Application number
CN202210084475.8A
Other languages
Chinese (zh)
Inventor
张晓华
吕志瑞
武宇平
黄彬
孙云生
杨静宇
卢毅
马鑫晟
张连超
李世杰
Current Assignee
Beijing Kedong Electric Power Control System Co Ltd
State Grid Jibei Electric Power Co Ltd
Electric Power Research Institute of State Grid Jibei Electric Power Co Ltd
Original Assignee
Beijing Kedong Electric Power Control System Co Ltd
State Grid Jibei Electric Power Co Ltd
Electric Power Research Institute of State Grid Jibei Electric Power Co Ltd
Application filed by Beijing Kedong Electric Power Control System Co Ltd, State Grid Jibei Electric Power Co Ltd, and Electric Power Research Institute of State Grid Jibei Electric Power Co Ltd
Priority: CN202210084475.8A
Publication: CN114528755A

Classifications

    • G06F30/27: Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G06F18/23213: Non-hierarchical clustering techniques using statistics or function optimisation with a fixed number of clusters, e.g. K-means clustering
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N3/045: Combinations of networks
    • G06N3/048: Activation functions
    • G06N3/08: Learning methods
    • G06Q10/20: Administration of product repair or maintenance
    • G06Q50/06: Energy or water supply
    • G06T3/4007: Scaling of whole images or parts thereof based on interpolation, e.g. bilinear interpolation
    • G06F2113/04: Power grid distribution networks
    • Y04S10/50: Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications


Abstract

The invention discloses a power equipment fault detection model based on an attention mechanism combined with a GRU. The model comprises a classification neural network whose training data are derived from a preprocessing model. The preprocessing model converts unbalanced input power equipment data into balanced data, produces embedded representations, and outputs intermediate data: a historical state sequence of power equipment representations, an embedded representation of the label data, and an embedded representation of the power equipment portrait features. The GRU module extracts temporal and spatial features of the power equipment from the historical state sequence; the attention mechanism module extracts state sequence features from the GRU output; the graph attention mechanism module extracts environment information of the power equipment from the embedded representation of the portrait features. The state sequence features, the label data embedding, and the environment information are aligned and fused as the training input of the classification neural network.

Description

Power equipment fault detection model based on attention mechanism combined with GRU
Technical Field
The invention belongs to the technical field of power equipment fault detection, relates to a power grid fault detection method, and particularly relates to a power equipment fault detection model based on an attention mechanism combined with GRUs.
Background
In an era of rapid technological development and continuous optimization of the economic structure, the power sector faces significant challenges. As the number of power consumers and enterprises grows, especially in areas of large-scale industrial development, the demand for reliable power supply rises. When power supply equipment in these areas fails, industrial equipment may be forced out of operation for long periods, causing a series of serious consequences; automatic fault detection of power equipment therefore plays an important role in the power supply system. Existing fault detection methods for traditional power equipment are complex and inefficient, and existing models cannot make full use of unbalanced power fault data, which greatly reduces their practical value.
Therefore, how to provide a training model and method for power equipment fault detection that better performs optimization, recognition and classification tasks and improves detection accuracy is a technical problem urgently to be solved by those skilled in the art.
Disclosure of Invention
In view of the defects of the prior art, the invention aims to provide a power equipment fault detection model based on an attention mechanism combined with a GRU, which is reasonable in design, makes full use of the data, and achieves high detection accuracy.
In order to achieve the above object, the present invention provides a power equipment fault detection model based on an attention mechanism combined with a GRU. The fault detection model comprises a classification neural network model whose training data are derived from a preprocessing model; the preprocessing model comprises an up-sampling module, a word embedding representation learning module, a GRU module, an attention mechanism module and a graph attention mechanism module,
the up-sampling module is used for converting input unbalanced power equipment data into balanced data;
the word embedding representation learning module is used for embedding and representing the balanced data, outputting a historical state sequence of power equipment representations, an embedded representation of the label data, and an embedded representation of the power equipment portrait features;
the GRU module is used for extracting time and space characteristics of the electric power equipment from a history state sequence which is output by the word embedding representation learning module and is based on the electric power equipment representation;
the attention mechanism module is used for extracting state sequence features from the time and space features of the equipment;
the graph attention mechanism module is used for extracting environment information of the power equipment from the embedded representation of the power equipment portrait characteristics;
and performing alignment fusion on the state sequence features, the label data embedded representation and the environment information to serve as training data input of the classification neural network.
Further, the up-sampling algorithm adopted by the up-sampling module is an SC-SMOTE up-sampling algorithm.
Further, the SC-SMOTE upsampling algorithm specifically includes:
step 21: traversing input power equipment data set data, and determining a majority seed sample and a minority seed sample;
step 22: according to the seed sample information, simultaneously performing upsampling on the majority class and the minority class, and calculating the number of samples generated by each minority type of seed samples;
step 23: after the number of samples generated by each few types of seed samples is obtained, carrying out linear interpolation to obtain a final new sample, and combining the newly generated sample and the original seed sample together to generate a balanced sample data set;
step 24: and carrying out embedded representation on the data in the generated balance sample data set.
Further, step 21 includes: traversing the power equipment data set and determining the neighbor sample set D_n of each sample x using the KNN algorithm. Within the neighbor set D_n, the set of samples of the same class as sample x is denoted D_same, and the set of samples of a different class is denoted D_other. The sizes of D_same and D_other are compared according to the formula:

S(x) = 1 if |D_same| ≥ |D_other|, otherwise S(x) = 0

to judge whether sample x is a seed sample, and the seed sample label S is added to the original data set.
Further, the calculation of the number of generated samples in step 22 uses the following formulas:

label_diff_j = N_maj − N_j

Rs_j = |D_s_maj| / |D_s_j|

N_gj = Rs_j + label_diff_j / |D_s_j|

where label_diff_j is the difference between the number of majority-class samples and the number of samples of minority class C_j in the original data set; N_j is the number of samples belonging to class C_j; D_s_j is the set of seed samples belonging to class C_j; D_s_maj is the set of majority-class seed samples; Rs_j is the ratio of the majority-class seed samples to the class-C_j seed samples; N_gj is the number of new samples generated on average by each class-C_j seed sample; and label_diff_j / |D_s_j| is the number of samples each class-C_j seed needs to generate to balance the difference in raw data counts.
Further, step 23 includes: after the number of samples to be generated by each seed sample is obtained, a K-means algorithm is used, updating the cluster center coordinates at each iterative partition of the samples according to the Euclidean distance between the cluster centers and the samples. The hyper-parameter k_c of the K-means algorithm is the number of class clusters; its value depends on the ratio of the number of majority-class to minority-class samples in the data set, formulated as:

k_c = ⌈N_maj / N_min⌉

After clustering the data set with the K-means algorithm, each sample is marked with the cluster label C of its cluster, and the data set is updated as:

D = {(x, y, S, C)}

where x is a sample, y its class label, S its seed label and C its cluster label.
further, the data processing method for the newly generated sample in step 23 includes: screening out samples of the same category in each category cluster to form a sample set DcEach sample contains a feature set F ═ F1,f2,…,fpAnd then according to different feature types, executing: when the characteristics are discrete characteristics, field selection is carried out according to probability distribution of different fields in the data generation process; when the characteristic is a continuous characteristic, in the data generation process, the [ min, max ] of the characteristic value is taken]And randomly selecting data in the interval as a generated value, wherein max and min are respectively the maximum value and the minimum value of the characteristic value.
Further, the method for obtaining the final new sample by linear interpolation in step 23 is as follows:

For each seed sample x_i with class label y_i and cluster c_i, there is a number N_gi of new samples to be generated. Each time a new sample is generated, according to N_gi and the distribution FD[c_i][y_i] of each feature of the cluster in which x_i is located, an auxiliary sample x_temp is first generated, and linear interpolation is then carried out to obtain the final generated sample x_new.

The construction of the auxiliary sample x_temp must satisfy three rules:

the temporary sample x_temp and the sample x_i belong to the same class label y_i;

the temporary sample x_temp and the sample x_i belong to the same cluster c_i;

the temporary sample x_temp and the sample x_i have the same features, but the value of each feature is obtained by random sampling from the feature distribution FD[c_i][y_i] of class y_i in cluster c_i.

After obtaining the temporary sample x_temp, the new sample x_new is obtained by linear interpolation:

x_temp = [f_1, f_2, …, f_p], f_p = Random(FD[c_i][y_i][p])

x_new = x_i + Random(0,1) × (x_temp − x_i)

After a seed sample has cycled through N_gi sample-generation operations, a group of generated samples based on that seed sample is obtained, all belonging to the same class as the seed sample. After every seed sample has finished sample generation, the obtained generated sample set D_g is merged with the original data set D to obtain the final required balanced data set D_balance.
Further, the data output of the classification neural network model is expressed as:

y_pred = argmax( softmax( W_deep · [O_b ; e′_p ; e_a] + b_deep ) )

where y_pred ∈ {0, 1, 2}; O_b is the vector representation of the state sequence features; e′_p is the state feature vector representation of the power equipment; e_a is the vector representation of the prediction target; and W_deep, b_deep are output layer parameters.
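A minimal numpy sketch of this fused output layer. The dimensions, the concatenation-based alignment fusion, and the argmax readout are illustrative assumptions; the patent gives only the symbols O_b, e′_p, e_a, W_deep, b_deep:

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
O_b = rng.normal(size=8)     # state-sequence feature vector (assumed size)
e_p = rng.normal(size=8)     # device state (portrait) feature vector
e_a = rng.normal(size=8)     # prediction-target vector

fused = np.concatenate([O_b, e_p, e_a])    # alignment fusion of the three inputs
W_deep = rng.normal(size=(3, fused.size))  # 3 classes: y_pred in {0, 1, 2}
b_deep = np.zeros(3)

probs = softmax(W_deep @ fused + b_deep)
y_pred = int(np.argmax(probs))
```

The three-way concatenation keeps the sequence, portrait and target information in separate slots of one input vector, which is the simplest reading of "alignment fusion".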
Further, the vector representation O_b of the state sequence features is calculated by the GRU module and the attention mechanism module.

The data processing procedure and result of the GRU module are expressed as:

r_t = σ(W_r i_t + U_r h_{t−1} + b_r)

z_t = σ(W_z i_t + U_z h_{t−1} + b_z)

h̃_t = tanh(W_h i_t + U_h (r_t ⊙ h_{t−1}) + b_h)

h_t = (1 − z_t) ⊙ h_{t−1} + z_t ⊙ h̃_t

where σ denotes the Sigmoid function and ⊙ denotes the Hadamard product, with

W_r, W_z, W_h ∈ R^{n_hid×k}, U_r, U_z, U_h ∈ R^{n_hid×n_hid}, b_r, b_z, b_h ∈ R^{n_hid}

where n_hid is the GRU hidden layer size and k the embedded vector size; i_t is the input of the GRU, the t-th vector in the historical state sequence, i.e. i_t = e_b[t]. The output value h_t of the GRU module is the t-th hidden state, a latent expression of the past state of the power equipment.
the data processing procedure and the result of the attention mechanism module are expressed as follows:
Figure BDA0003486959580000045
Figure BDA0003486959580000046
wherein, atAn attention score representing an attention distribution calculation of the attention mechanism module; f [.]Representing the attention scoring function.
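A short sketch of this attention pooling over the GRU hidden states. The patent leaves the scoring function f unspecified; a learned dot-product score (f(h_t) = w · h_t with a weight vector w) is used here as a stand-in assumption:

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(2)
T, n_hid = 5, 6
hs = rng.normal(size=(T, n_hid))  # GRU hidden states h_1..h_T
w = rng.normal(size=n_hid)        # scoring vector: f(h_t) = w . h_t (assumed form)

a = softmax(hs @ w)               # attention scores a_t over the T steps
O_b = a @ hs                      # weighted sum of hidden states -> O_b
```

The weighted sum lets informative time steps dominate the state-sequence feature O_b instead of taking only the last hidden state.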
Further, the state feature vector representation e′_p of the power equipment is calculated by the graph attention mechanism module. The data processing result of the graph attention mechanism module is expressed as:

e′_p = [e_p, e′_p1, e′_p2, …, e′_pn, X]

where e_p is the embedded feature vector of the power equipment; e′_p1, e′_p2, …, e′_pn are the attention-weighted embedded vectors of the power equipment nodes related to the equipment whose embedded vector is e_p, computed as

e′_pi = α_i · e_pi, α_i = softmax_i( f(e_p, e_pi) )

and X = {α_1, α_2, …, α_n} is the set of attention coefficients.
The invention realizes the following beneficial effects:
1. The present invention uses an attention mechanism in conjunction with the timing-feature capture capability of the GRU. To address the problem that time-series feature information of the input data cannot be fully utilized, the invention uses a GRU combined with an attention mechanism to improve the extraction of temporal and spatial-structure features from the input samples, capturing more equipment state features; the obtained power equipment features are then fed into a neural network to predict the fault condition of the power equipment. This effectively solves the problem of inefficient utilization of the input power equipment sample information. Unlike traditional power equipment fault detection methods, the invention uses a GRU network to simultaneously extract temporal and spatial feature information from the input sample data, and an attention mechanism to perceive the state sequence features of the input samples. Multi-dimensional capture of input sample features provides high-quality feature input for the classification of the downstream neural network model, so that the model can better identify and detect the state of the input power grid equipment.
2. When extracting the features of a single power equipment node, the invention considers not only the node's own state-feature time-series information and spatial information but also the influence of the surrounding power equipment. The invention captures the information of related surrounding equipment with a graph attention mechanism, i.e., fuses in environment information, which raises the feature dimensionality of the single device, provides more feature information for the classification task of the downstream neural network, and improves the model's ability to detect faulty equipment.
3. The invention adopts an SC-SMOTE-based data up-sampling technique, which effectively alleviates the problems caused by unbalanced power equipment samples.
4. By adopting SC-SMOTE and an attention-based GRU network, and by multi-dimensional extraction of input power equipment sample features, the invention achieves automatic and accurate detection of power equipment faults in power grid supply scenarios.
Drawings
FIG. 1 is a schematic diagram of the SC-SMOTE based upsampling technique of the present invention to obtain balanced sample data;
FIG. 2 is a schematic flow chart of the process of feature acquisition in time and space for input samples based on an attention mechanism in combination with GRUs according to the present invention;
FIG. 3 is a diagram of the related-information capturing network framework for obtaining environmental auxiliary information of the current device node based on the graph attention mechanism according to the present invention;
fig. 4 is an overall frame diagram of the present invention.
Detailed Description
To further illustrate the various embodiments, the invention provides the accompanying drawings. The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the embodiments. Those skilled in the art will appreciate still other possible embodiments and advantages of the present invention with reference to these figures.
The invention will now be further described with reference to the accompanying drawings and detailed description.
The invention provides a power equipment fault detection model based on an attention mechanism combined with a GRU. The power equipment fault detection model comprises a classification neural network model, training data of the classification neural network model is derived from a preprocessing model, the preprocessing model comprises an up-sampling module, a word embedding representation learning module, a GRU module, an attention mechanism module and a graph attention mechanism module, and the up-sampling module is used for converting input unbalanced power equipment data into balanced data; the word embedding representation learning module is used for embedding and representing the balance data and outputting a historical state sequence based on the power equipment representation, a label data embedding representation, and a state (power equipment portrait characteristic) embedding representation of the power equipment and the corresponding related equipment; the GRU module is used for extracting time and space characteristics of the electric power equipment from a history state sequence which is output by the word embedding representation learning module and is based on the electric power equipment representation; the attention mechanism module is used for extracting state sequence features from the time and space features of the equipment; the graph attention mechanism module is used for extracting environment information of the power equipment from the embedded representation of the power equipment portrait characteristics; and performing alignment fusion on the state sequence features, the label data embedded representation and the environment information to serve as training data input of the classification neural network.
Specifically, the construction and training of the power equipment fault detection model based on the attention mechanism and the GRU comprise the following steps:
step 1, inputting a power equipment data set, wherein the power equipment data set comprises a power equipment entity state and a corresponding label; wherein, power equipment entity state includes the gaseous composition content condition in the transformer oil, the transformer partial discharge condition, equipment contact surface temperature condition, various state data such as the condition that the internal element wets, and power equipment's label is power equipment like transformer fault type, includes: insulation deterioration, abnormal vibration, and the like.
Step 2: convert the input unbalanced power equipment data into balanced data using the SC-SMOTE up-sampling algorithm, and then produce an embedded representation (Embedding).
the specific steps of the step 2 comprise:
taking the entity state and the label of the electric power equipment contained in the data set of the electric power equipment in the step 1 as the input of the SC-SMOTE up-sampling algorithm, taking the input as seed data, generating a new sample according to the corresponding characteristic distribution matrix, then obtaining a final new sample through linear interpolation, merging the newly generated sample and the original seed sample together to generate a balanced sample data set, and then embedding and representing the data:
(1) obtaining a power equipment data set, wherein the power equipment data set comprises equipment entity states and corresponding labels;
(2) The power equipment data set is traversed, and the neighbor sample set D_n of sample x is determined using the KNN (K-Nearest Neighbor) algorithm. In the neighbor set D_n there are samples of different classes: the set of samples of the same class as sample x is denoted D_same, and the set of samples of a different class is denoted D_other. The sizes of D_same and D_other are compared, whether sample x is a seed sample is judged according to formula 2.1, and the seed sample label S is added to the original data set.

S(x) = 1 if |D_same| ≥ |D_other|, otherwise S(x) = 0 (2.1)
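The seed-sample test can be sketched as follows. This is a minimal illustration on a toy 2-D data set, assuming Euclidean distance and k = 3 neighbors; the helper names (`knn_indices`, `is_seed`) are ours, not from the patent:

```python
import numpy as np

def knn_indices(X, i, k):
    # indices of the k nearest neighbors of sample i (excluding itself)
    d = np.linalg.norm(X - X[i], axis=1)
    d[i] = np.inf
    return np.argsort(d)[:k]

def is_seed(X, y, i, k=3):
    # sample i is a seed when at least as many neighbors share its
    # class (D_same) as belong to other classes (D_other)
    nn = knn_indices(X, i, k)
    same = int(np.sum(y[nn] == y[i]))
    return same >= (k - same)

# two clean clusters plus one noisy minority point inside the majority region
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [0.05, 0.05]])
y = np.array([0, 0, 0, 1, 1, 1])
seeds = [i for i in range(len(y)) if is_seed(X, y, i)]
```

The noisy point at (0.05, 0.05) is surrounded by the other class, so it fails the test: only samples in class-consistent neighborhoods become seeds.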
(3) According to the seed sample information, the majority class and the minority class are up-sampled simultaneously. For majority-class seed samples the sampling rate is 100%, i.e., each majority-class seed sample generates one new sample. The sampling rate of minority-class seed samples is determined by the sample ratio of the original data and the sample ratio of the seed samples.
To compensate for the sample difference in the original data set, the difference label_diff_j between the number of majority-class and minority-class samples in the original data set must be calculated:

label_diff_j = N_maj − N_j (2.2)

where N_maj is the number of majority-class samples and N_j is the number of samples belonging to class C_j.
In the seed sample set there are more majority-class seeds than minority-class seeds, and the sampling rate of the majority-class seeds is 100%. To compensate for the difference in seed counts, the ratio Rs_j of majority-class seeds to minority-class seeds must be calculated:

Rs_j = |D_s_maj| / |D_s_j| (2.3)

where D_s_maj is the set of majority-class seed samples and D_s_j is the set of seed samples belonging to class C_j.
The number N_gj of new samples generated on average by each seed sample of class C_j is calculated as:

N_gj = Rs_j + label_diff_j / |D_s_j| (2.4)

where Rs_j is the number of samples each seed sample needs to generate to balance the difference in seed counts, and label_diff_j / |D_s_j| is the number of samples each seed sample needs to generate to balance the difference in raw data counts.
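A worked example of the generation-count formulas 2.2 to 2.4. All the counts are hypothetical, and formula 2.4 is read here as the sum of the two per-seed quotas described in the text (seed-count balancing plus raw-count balancing):

```python
# hypothetical counts, not from the patent
N_maj = 100        # samples in the majority class
N_j = 20           # samples in minority class C_j
D_s_maj = 40       # |D_s_maj|: majority-class seed samples
D_s_j = 10         # |D_s_j|: seed samples of class C_j

label_diff_j = N_maj - N_j             # 2.2: raw-count difference -> 80
Rs_j = D_s_maj / D_s_j                 # 2.3: seed-count ratio -> 4.0
N_gj = Rs_j + label_diff_j / D_s_j     # 2.4: new samples per C_j seed -> 12.0
```

Each of the 10 class-C_j seeds thus generates 12 new samples: 4 to match the majority seeds and 8 to close the 80-sample gap in the raw data.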
(4) After the number of samples to be generated by each minority-class seed sample is obtained, a K-means clustering algorithm is used, updating the cluster center coordinates at each iterative partition of the samples according to the Euclidean distance between the cluster centers and the samples. The hyper-parameter k_c of the K-means algorithm is the number of class clusters; in the SC-SMOTE algorithm, its value depends on the ratio of majority-class to minority-class samples in the data set, expressed as:

k_c = ⌈N_maj / N_min⌉ (2.5)

The data set is clustered by the ordinary K-means algorithm, each sample is marked with the cluster label C of its cluster, and the data set is updated as:

D = {(x, y, S, C)}
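The k_c choice and the iterative center update can be sketched with a hand-rolled K-means on a toy data set. The counts and points are illustrative assumptions, and k = 2 is used for the tiny example rather than the computed k_c:

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    # minimal K-means: assign each sample to the nearest center by
    # Euclidean distance, then recompute centers as cluster means
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - centers[None, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# hyper-parameter k_c from hypothetical class counts (formula 2.5)
N_maj, N_min = 100, 20
k_c = int(np.ceil(N_maj / N_min))   # -> 5

# two well-separated toy groups; K-means with k=2 recovers them
X = np.vstack([np.zeros((5, 2)), np.ones((5, 2)) * 10.0])
labels = kmeans(X, 2)
```

After clustering, each sample's cluster label would be appended to the data set record, giving the (x, y, S, C) tuples used by the later generation steps.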
(5) The samples of the same category in each class cluster are screened out to form a sample set D_c; each sample contains the feature set F = {f_1, f_2, …, f_p}, which is processed according to feature type.

For discrete features, such as "abnormal sound" or "abnormal machine vibration", the value cannot be selected at random from all fields; it must be determined by the frequency of occurrence of the different field values, so that the feature distribution of the generated samples, and of the final balanced data set, is unchanged.

For continuous features, such as "temperature data of the device itself", values must be taken in the interval [min, max] during data generation, so the maximum and minimum of the feature value are computed and data are selected at random in [min, max] as generated values. For the p features of the L different classes in each of the k_c class clusters, the computed distribution has dimension k_c × L × p × 2.
(6) For each seed sample x_i of category y_i in class cluster c_i, there is a number N_gi of new samples to be generated. Each time a new sample is generated, according to N_gi and the distribution FD[c_i][y_i] of each feature of the cluster in which the seed lies, an auxiliary sample x_temp is first generated, and linear interpolation then yields the final generated sample x_new.
The SC-SMOTE algorithm first constructs the auxiliary sample x_temp according to the feature distribution. The auxiliary sample x_temp must satisfy three rules:
the temporary sample x_temp and the sample x_i belong to the same category label y_i;
the temporary sample x_temp and the sample x_i belong to the same cluster c_i;
the temporary sample x_temp has the same features as the sample x_i, but the value of each feature is obtained by random sampling from the feature distribution FD[c_i][y_i] of class cluster c_i;
After the temporary sample x_temp is obtained, the new sample x_new is obtained by linear interpolation:
x_temp = [f_1, f_2, …, f_p],  f_p = Random(FD[c_i][y_i][p])  (2.6)
x_new = x_i + Random(0, 1) × (x_temp − x_i)  (2.7)
After the seed sample has cycled through N_gj sample generation operations, a group of generated samples based on that seed sample is obtained; the generated samples and the seed sample belong to the same category. After every seed sample has completed sample generation, the resulting generated sample set D_g is merged with the original data set D to obtain the finally required balanced data set D_balance. In the balanced data set, the proportion between majority and minority classes is restored to normal, and the overall sample count is also expanded.
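The generation step of Eqs. 2.6–2.7 can be sketched as below. Representing FD[c_i][y_i] as a list of per-feature samplers and treating all features as continuous are simplifying assumptions for the sketch; discrete features would instead draw a field value from their frequency distribution.

```python
import random

def generate_sample(seed_sample, feature_samplers, rng=random):
    """SC-SMOTE generation step (Eqs. 2.6-2.7): first build the auxiliary
    sample x_temp by sampling every feature from the cluster/class
    distribution FD[c_i][y_i], then linearly interpolate between the seed
    sample and x_temp to obtain the new sample x_new."""
    # Eq. 2.6: f_p = Random(FD[c_i][y_i][p]) for each feature p
    x_temp = [sampler() for sampler in feature_samplers]
    # Eq. 2.7: x_new = x_i + Random(0, 1) * (x_temp - x_i)
    gap = rng.random()
    return [x + gap * (t - x) for x, t in zip(seed_sample, x_temp)]

# Illustrative continuous samplers over assumed [min, max] ranges
samplers = [lambda: random.uniform(60.0, 64.0), lambda: random.uniform(0.0, 1.0)]
seed = [61.0, 0.2]
new = generate_sample(seed, samplers)
```

Because x_new lies on the segment between the seed sample and x_temp, every generated feature stays inside the cluster's observed value range.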
(7) For the finally obtained samples, the data format is defined as M × N, where M is the number of samples, representing the descriptions of the different power devices, and N is the number of features, including device temperature, device image features, device parameter features and context features. In feature processing, the common practice is to discretize continuous features. After encoding, the discrete features make the data matrix extremely sparse; without effective processing, the parameter count of the subsequent modeling process would increase greatly. The main function of the data embedding layer is to compress and represent the sparse vectors produced by one-hot coding. The dimensionality of a data vector passing through the embedding layer is markedly reduced, and the feature information is represented mainly in numerical form. Suppose the feature vector after one-hot coding is represented as [x_1; x_2; …; x_n], where n is the number of feature fields and x_i is the one-hot code representation of feature field i. The size of the embedding layer matrix V is n × k, where k is the size of the embedding layer vector.
After passing through the embedding layer, the sparse vectors are encoded into dense vectors of equal length; the output of the embedding layer is denoted E, as shown in Equation 2.8.
E = [e_1, e_2, …, e_n] = [v_1 x_1, v_2 x_2, …, v_n x_n]  (2.8)
where e_i denotes a feature field vector. For single-valued features, each x_i has only one bit equal to 1, and the feature field vector is a single feature vector. For multi-valued features, e_i consists of a plurality of vectors. This completes the embedded representation of the data set.
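Equation 2.8 can be illustrated with a small NumPy sketch. Storing one embedding table per feature field, of shape k × |field vocabulary|, is an assumed layout of the embedding matrix V; the point is that each one-hot x_i collapses to the dense length-k column it selects.

```python
import numpy as np

def embed(one_hots, tables):
    """Eq. 2.8: compress each field's one-hot code x_i into a dense
    length-k vector e_i = V_i x_i; the embedding layer output is
    E = [e_1, ..., e_n]."""
    return np.stack([V @ x for V, x in zip(tables, one_hots)])

k = 4
rng = np.random.default_rng(0)
vocab_sizes = [3, 5]                      # two feature fields
tables = [rng.normal(size=(k, v)) for v in vocab_sizes]
x1 = np.eye(3)[1]                         # one-hot for field 1, value index 1
x2 = np.eye(5)[4]                         # one-hot for field 2, value index 4
E = embed([x1, x2], tables)
```

Multiplying a table by a one-hot vector is exactly a column lookup, which is why embedding layers are implemented as lookups rather than dense matrix products in practice.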
In step 3, on the basis of the sample embedded representation obtained in step 2, a preprocessing model based on an attention mechanism and a GRU network is defined. The preprocessing model comprises a module that captures the state sequence features of a single device in the two dimensions of space and time, and a module that acquires auxiliary information from the related devices surrounding a single device.
The specific steps of the step 3 comprise:
(1) A state trend capture layer (GRU module) is defined. After the behavior sequence data has been given an embedded representation, the sequential relation of the behavior sequence is modeled with a GRU network, as shown in Equation 2.18.
r_t = σ(W_r i_t + U_r h_{t−1} + b_r)
z_t = σ(W_z i_t + U_z h_{t−1} + b_z)
h̃_t = tanh(W_h i_t + U_h (r_t ⊙ h_{t−1}) + b_h)
h_t = (1 − z_t) ⊙ h_{t−1} + z_t ⊙ h̃_t  (2.18)
where σ denotes the Sigmoid function, ⊙ denotes the Hadamard product, and
W_r, W_z, W_h ∈ ℝ^{n_hid×k},  U_r, U_z, U_h ∈ ℝ^{n_hid×n_hid},  b_r, b_z, b_h ∈ ℝ^{n_hid}
where n_hid denotes the GRU network hidden layer size and k denotes the embedded vector size; i_t denotes the input of the GRU, i.e. the t-th vector representation in the behavior sequence, i_t = e_b[t]; the network output value h_t denotes the t-th hidden state and is a latent expression of the past state of the power equipment. The main role of the state trend capture layer is to provide the temporal characteristics of the state representation.
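A NumPy sketch of the Eq. 2.18 recurrence follows, written in the standard GRU form (reset gate r_t, update gate z_t) that matches the gates named in the claims; the tiny dimensions are illustrative only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(i_t, h_prev, P):
    """One GRU step (Eq. 2.18):
    r_t = sigmoid(W_r i_t + U_r h_{t-1} + b_r)    reset gate
    z_t = sigmoid(W_z i_t + U_z h_{t-1} + b_z)    update gate
    h~_t = tanh(W_h i_t + U_h (r_t * h_{t-1}) + b_h)
    h_t = (1 - z_t) * h_{t-1} + z_t * h~_t        (* = Hadamard product)"""
    r = sigmoid(P["Wr"] @ i_t + P["Ur"] @ h_prev + P["br"])
    z = sigmoid(P["Wz"] @ i_t + P["Uz"] @ h_prev + P["bz"])
    h_tilde = np.tanh(P["Wh"] @ i_t + P["Uh"] @ (r * h_prev) + P["bh"])
    return (1 - z) * h_prev + z * h_tilde

n_hid, k = 3, 2                            # hidden size n_hid, embedding size k
rng = np.random.default_rng(1)
P = {name: rng.normal(scale=0.1, size=(n_hid, k) if name.startswith("W")
                      else (n_hid, n_hid) if name.startswith("U")
                      else n_hid)
     for name in ["Wr", "Wz", "Wh", "Ur", "Uz", "Uh", "br", "bz", "bh"]}
h = np.zeros(n_hid)
sequence = rng.normal(size=(5, k))         # 5 embedded states e_b[t]
hidden_states = []
for e_t in sequence:
    h = gru_step(e_t, h, P)
    hidden_states.append(h)
```

Each h_t is a convex mixture of the previous state and a tanh candidate, so the hidden states stay bounded while accumulating the temporal trend of the sequence.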
(2) A key state perception layer (attention mechanism module) is defined. An attention mechanism is used to obtain the association between the current power equipment state and the power equipment states at different time points in the history sequence; the process is measured by similarity and can be regarded as a process of perception evolution, as shown in Equation 2.19.
a_t = softmax(F(h_t, e_a)),  O_b = Σ_t a_t h_t  (2.19)
where e_a is the target vector and F[·] denotes the attention scoring function; here the attention distribution is computed in a bilinear manner. Multiplying the hidden states of the power equipment at the different positions by the attention scores yields the state sequence feature representation O_b. The key role of the key state perception layer is to provide the local characteristics of the trend representation. This completes the capture of the state sequence features of a single device in the two dimensions of space and time.
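The attention pooling of Eq. 2.19 can be sketched as follows, with the bilinear scoring form F(h_t, e_a) = h_tᵀ W e_a that the text names; the exact parameterisation is otherwise assumed.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(H, e_a, W):
    """Eq. 2.19: score every hidden state against the target vector e_a
    with a bilinear form F(h_t, e_a) = h_t^T W e_a, normalise the scores
    with softmax into an attention distribution a_t, and return the
    state sequence feature O_b = sum_t a_t * h_t."""
    scores = H @ W @ e_a          # one bilinear score per time step
    a = softmax(scores)
    return a @ H, a               # O_b and the attention weights

rng = np.random.default_rng(2)
H = rng.normal(size=(5, 3))       # 5 hidden states h_t of size n_hid = 3
e_a = rng.normal(size=3)
W = rng.normal(size=(3, 3))
O_b, a = attention_pool(H, e_a, W)
```

The softmax guarantees the attention weights form a probability distribution over the history positions, so O_b is a convex combination of the hidden states.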
(3) A graph attention mechanism layer (graph attention mechanism module) is defined; while feature extraction, i.e. the modeling of the behavior sequence, is guaranteed, a modeling approach that makes full use of the environmental features is required. Let the embedded vector of the portrait features of a given power device be e_p, and the embedded vectors of its associated power device nodes be e'_p1, e'_p2, …, e'_pn; a new embedded vector e'_p is then generated for each power device node using a graph attention mechanism, as shown in Equation 2.20.
e'_p = [e_p, e'_p1, e'_p2, …, e'_pn, X]  (2.20)
where
Figure BDA0003486959580000102
X is the coefficient set. Applying graph attention to the feature vector of each power device yields the output vector e'_p, which completes the acquisition of auxiliary information from the surrounding related devices for a single device.
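The patent's coefficient formula is only available as an image, so the sketch below substitutes the standard GAT-style scoring (LeakyReLU of a learned linear score, softmax over the neighbours) as a stated assumption for how the coefficient set X could be obtained.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def graph_attention(e_p, neighbors, W, a_vec):
    """GAT-style aggregation (assumed form): score each neighbour j of the
    device node with LeakyReLU(a^T [W e_p || W e_j]), softmax the scores
    into coefficients (the coefficient set X), and append the
    coefficient-weighted neighbour information to the node's own features."""
    he = W @ e_p
    hn = [W @ e_j for e_j in neighbors]
    scores = np.array([np.concatenate([he, h]) @ a_vec for h in hn])
    scores = np.where(scores > 0, scores, 0.2 * scores)   # LeakyReLU
    coeff = softmax(scores)                               # coefficient set X
    agg = sum(c * h for c, h in zip(coeff, hn))           # auxiliary information
    return np.concatenate([he, agg]), coeff

rng = np.random.default_rng(3)
d = 4
e_p = rng.normal(size=d)
neighbors = [rng.normal(size=d) for _ in range(3)]
W = rng.normal(size=(d, d))
a_vec = rng.normal(size=2 * d)
e_p_new, coeff = graph_attention(e_p, neighbors, W, a_vec)
```

The output keeps the device's own transformed features while mixing in neighbour information weighted by the learned coefficients, which is the "auxiliary information of surrounding related equipment" the text describes.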
(4) The embedded representations of the features obtained in steps (2) and (3) and the embedded representation of the labels are fused as the input of the classification neural network of the power equipment fault detection model; the final result is shown in Equation 2.21.
y_pred = softmax(W_deep [O_b, e'_p, e_a] + b_deep)  (2.21)
where y_pred ∈ {0, 1, 2}, e_a is the vector representation of the prediction target, and W_deep, b_deep are the output layer parameters; by stacking multiple layers, the combination forms among high-order features can be better captured. The construction of the power equipment fault detection model is thus completed.
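The classification head of Eq. 2.21 can be sketched as below; fusing O_b, e'_p and e_a by plain concatenation and applying a softmax output layer are assumptions consistent with the surrounding text, not the patent's verified layout.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def classify(O_b, e_p_new, e_a, W_deep, b_deep):
    """Eq. 2.21 (assumed concatenation fusion): the state sequence feature
    O_b, the device state feature e'_p and the target embedding e_a are
    fused into one vector and mapped by the output layer parameters
    W_deep, b_deep to class probabilities over y_pred in {0, 1, 2}."""
    fused = np.concatenate([O_b, e_p_new, e_a])
    probs = softmax(W_deep @ fused + b_deep)
    return int(np.argmax(probs)), probs

rng = np.random.default_rng(4)
O_b, e_p_new, e_a = rng.normal(size=3), rng.normal(size=4), rng.normal(size=3)
W_deep = rng.normal(size=(3, 10))      # 3 fault classes, fused dimension 10
b_deep = rng.normal(size=3)
y_pred, probs = classify(O_b, e_p_new, e_a, W_deep, b_deep)
```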
In step 4, the embedded representation generated in step 2 is first used as the input of the preprocessing model obtained in step 3; the output of the preprocessing model is then used as the input of the power equipment fault detection model; finally, the power equipment fault detection model is trained and generated.
The specific steps of the step 4 comprise:
(1) The network architecture adopted in the invention is implemented on the basis of an attention mechanism and a GRU network. First, the corresponding embedded representation is generated for the constructed balanced samples by the embedding layer of step 2. Second, a GRU module based on an attention mechanism is defined, which combines the attention mechanism and the GRU to capture the temporal and spatial key features of the device nodes (defined as state sequence features), and combines the graph attention mechanism so that a single power equipment node obtains auxiliary information from related power equipment; the generated features are fused through vector concatenation. Finally, the attention mechanism defined in step 3 is used in combination with the GRU network: on the one hand, the capturing capability for the state sequence features is learned, and on the other hand, fault detection of the power equipment is realized through the learning of the features. The network architecture is shown in Fig. 4.
(2) The number of training iterations, epochs, is set, starting from epochs = 1.
(3) The data set sample embedded representation obtained in step 2 is fed batch by batch into the network combining the attention mechanism with the GRU, and a prediction for the input embedded representation is obtained.
(4) The loss function between the estimated value and the true label value is calculated and minimized.
(5) Steps (3) and (4) are repeated within the value range defined by epochs, finally training the power equipment fault detection model that combines the attention mechanism with the GRU.
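Steps (2)–(5) above amount to a standard epoch loop that minimises the loss between prediction and true label. The sketch below illustrates this with a toy softmax layer standing in for the full attention + GRU network, which is an illustrative simplification.

```python
import numpy as np

def softmax_rows(Z):
    e = np.exp(Z - Z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def train(X, y, n_classes, epochs=200, lr=0.5):
    """Epoch loop of steps (2)-(5): feed the embedded representations
    through the model, compute the cross-entropy loss against the true
    labels, and update the parameters to minimise it."""
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.01, size=(X.shape[1], n_classes))
    Y = np.eye(n_classes)[y]                  # one-hot labels
    losses = []
    for _ in range(epochs):
        P = softmax_rows(X @ W)
        losses.append(-np.mean(np.sum(Y * np.log(P + 1e-12), axis=1)))
        W -= lr * X.T @ (P - Y) / len(X)      # gradient of the cross-entropy
    return W, losses

# Toy, linearly separable 3-class data standing in for embedded samples
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0],
              [0.1, 0.9], [-1.0, -1.0], [-0.9, -1.1]])
y = np.array([0, 0, 1, 1, 2, 2])
W, losses = train(X, y, n_classes=3)
pred = softmax_rows(X @ W).argmax(axis=1)
```

On this toy data the loss decreases over the epochs and the trained layer classifies all samples correctly, mirroring how the full model is driven by loss minimisation against the labels.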
In application systems for classification and detection problems, the main concern is feature extraction capability: whether the features can be sufficiently mined and utilized. The innovation of this method rests mainly on the combination of the GRU with an attention mechanism: on the one hand, characteristics in time and space can be obtained; on the other hand, the attention mechanism can utilize and mine the state sequence features while eliminating feature noise to a certain extent. In addition, besides the state feature extraction of the power equipment itself, a graph attention mechanism is used to obtain the auxiliary information that related equipment can provide. Based on these two aspects, the characteristics of the power equipment can be fully extracted and utilized; the network model can learn the features better, and those features enable more accurate fault detection of the power equipment. To address these problems, many methods choose improvements in directions such as deepening the network or multi-modal fusion. The method proposed here differs from the prior art mainly in realizing multi-dimensional feature mining of the input data by combining an attention mechanism with GRU techniques, thereby obtaining more feature information and improving the fault detection capability of the network model.
The design of the method is based on an attention mechanism combined with a GRU network to fully mine the characteristics of the power equipment, which better serves the classification and detection task of the downstream neural network. In the attention mechanism combined with GRU module, a multi-head attention mechanism and a GRU network model are used: the GRU network first extracts the temporal and spatial features of the power equipment, and the attention mechanism then captures the state sequence features, completing the feature mining of the power equipment. In the graph attention mechanism model, a graph attention mechanism is used to fully mine the auxiliary information of the related equipment nodes, i.e. the environmental information. This information is combined with the embedded representation of the labels for attention-based aligned fusion, generating a vector of the same size as the state code input to the downstream neural network; the output serves as the input of the downstream neural network and is trained against the labels, finally generating the power equipment fault detection model.
Based on the above improvements, the power equipment fault detection model based on the attention mechanism combined with the GRU is realized, which can effectively improve the accuracy of power equipment fault detection.
The working principle of the invention is as follows:
First, SC-SMOTE up-sampling is performed on the power grid power equipment samples to generate balanced sample data. Next, the embedding layer produces an embedded representation of the input samples, and the generated embedded representation is used as the input of the attention mechanism combined with GRU network model. The attention mechanism combined with GRU module and the graph attention mechanism model first fully mine the input power equipment features; the features generated by each module are then fused on the basis of the attention mechanism and finally used as the input of the downstream classification neural network, whose output is trained against the defined labels. Model training is thus completed, generating a model that can accurately detect power equipment faults.
It should be emphasized that the embodiments described herein are illustrative and not restrictive, and thus the present invention includes, but is not limited to, the embodiments described in this detailed description, as well as other embodiments that can be derived by one skilled in the art from the teachings herein, and are within the scope of the present invention.

Claims (10)

1. A power equipment fault detection model based on an attention mechanism combined with a GRU, characterized by comprising a classification neural network model, wherein the training data of the classification neural network model is derived from a preprocessing model, and the preprocessing model comprises an up-sampling module, a word embedding representation learning module, a GRU module, an attention mechanism module and a graph attention mechanism module, wherein:
the up-sampling module is used for converting the input unbalanced power equipment data into balanced data;
the word embedding representation learning module is used for embedding the balanced data and outputting an embedded representation of the history state sequence of the power equipment, an embedded representation of the label data, and an embedded representation of the power equipment portrait features;
the GRU module is used for extracting the temporal and spatial features of the power equipment from the embedded representation of the history state sequence output by the word embedding representation learning module;
the attention mechanism module is used for extracting the state sequence features from the temporal and spatial features of the equipment;
the graph attention mechanism module is used for extracting environment information of the power equipment from the embedded representation of the power equipment portrait characteristics;
and the state sequence features, the embedded representation of the label data and the environment information are aligned and fused as the training data input of the classification neural network.
2. The power equipment fault detection model of claim 1, wherein the up-sampling module employs the SC-SMOTE up-sampling algorithm, which specifically comprises:
step 21: the input power equipment data set is traversed, and the majority-class and minority-class seed samples are determined;
step 22: according to the seed sample information, up-sampling is performed simultaneously on the majority class and the minority classes, and the number of samples to be generated by each minority-class seed sample is calculated;
step 23: after the number of samples to be generated by each minority-class seed sample is obtained, linear interpolation is performed to obtain the final new samples, and the newly generated samples and the original seed samples are combined to generate a balanced sample data set;
step 24: the data in the generated balanced sample data set is given an embedded representation.
3. The power equipment fault detection model of claim 2, wherein said step 21 comprises: traversing the power equipment data set, a neighbor sample set D_n of each sample x is determined using the KNN algorithm; within the neighbor set D_n, the set of samples of the same class as sample x is denoted D_same, and the set of samples of a different class from sample x is denoted D_other; the numbers of samples in D_same and D_other are compared according to the formula:
Figure FDA0003486959570000021
whether the sample x is a seed sample is judged, and a seed sample label S is added to the original data set.
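This seed determination step can be sketched as follows. The decision formula itself is only an image in the patent, so the |D_same| ≥ |D_other| rule below is an assumption consistent with the surrounding text.

```python
import numpy as np

def mark_seed_samples(X, y, k=3):
    """Step 21 (assumed decision rule): for each sample x, find its k
    nearest neighbours (KNN, Euclidean distance), split them into
    D_same (same class as x) and D_other (different class), and mark x
    as a seed sample (S = 1) when |D_same| >= |D_other|."""
    S = np.zeros(len(X), dtype=int)
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                         # exclude the sample itself
        nn = np.argsort(d)[:k]
        same = np.sum(y[nn] == y[i])          # |D_same|
        S[i] = int(same >= k - same)          # |D_same| >= |D_other|
    return S

X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2], [0.15]])
y = np.array([0, 0, 0, 1, 1, 1, 1])           # one class-1 point inside the class-0 region
S = mark_seed_samples(X, y, k=3)
```

Under this rule, the isolated class-1 point surrounded by class-0 neighbours is not marked as a seed, so noisy samples do not drive the subsequent sample generation.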
4. The power equipment fault detection model of claim 2, wherein the calculation of the number of generated samples in step 22 comprises the following equations:
label_diff_j = N_maj − N_j
Figure FDA0003486959570000022
Figure FDA0003486959570000023
wherein, label _ diffjRepresenting majority and minority classes C in the original datasetjThe sample number difference of (2); n is a radical ofjIndicates belonging to class CjThe number of samples of (a); ds_jIndicates belonging to class CjThe set of seed samples of (a); rsjRepresenting majority seed samples and class CjThe proportion of the seed sample of (a); n is a radical ofgjRepresents each of the categories CjThe number of new samples generated by averaging the seed samples; label _ diffj/|Ds_jI represents class CjThe number of samples that need to be generated in order to balance the difference in the number of raw data.
5. The power equipment fault detection model of claim 2, wherein said step 23 comprises the following steps: after the number of samples to be generated by the seed samples is obtained, the K-means algorithm is adopted, and the cluster center coordinates are updated at each iterative partition of the samples according to the Euclidean distance between the cluster center and the sampled samples; the hyper-parameter k_c of the K-means algorithm denotes the number of class clusters, and its value depends on the ratio of the majority-class sample count to the minority-class sample count in the data set, formulated as:
Figure FDA0003486959570000024
after the data set has been clustered with the K-means algorithm, each sample is marked with the cluster label C of its cluster, and the data set is updated as:
Figure FDA0003486959570000025
6. The power equipment fault detection model of claim 5, wherein the data processing method for the newly generated samples in step 23 is as follows: samples of the same category within each class cluster are screened out to form a sample set D_c, each sample containing a feature set F = {f_1, f_2, …, f_p}; then, according to the feature type, the following is executed: when the feature is a discrete feature, field selection during data generation is performed according to the probability distribution of the different fields; when the feature is a continuous feature, during data generation a value is randomly selected in the interval [min, max] of the feature value as the generated value, where max and min are respectively the maximum and minimum of the feature value.
7. The power equipment fault detection model of claim 5, wherein the linear interpolation in step 23 is to obtain the final new sample by:
for each seed sample x_i of category y_i in class cluster c_i, there is a number N_gi of new samples to be generated; each time a new sample is generated, according to N_gi and the distribution FD[c_i][y_i] of each feature of the cluster in which the seed lies, an auxiliary sample x_temp is first generated, and linear interpolation is then performed to obtain the final generated sample x_new;
wherein the construction of the auxiliary sample x_temp must satisfy three rules:
the temporary sample x_temp and the sample x_i belong to the same category label y_i;
the temporary sample x_temp and the sample x_i belong to the same cluster c_i;
the temporary sample x_temp has the same features as the sample x_i, but the value of each feature is obtained by random sampling from the feature distribution FD[c_i][y_i] of class cluster c_i;
after the temporary sample x_temp is obtained, the new sample x_new is obtained by means of linear interpolation:
x_temp = [f_1, f_2, …, f_p],  f_p = Random(FD[c_i][y_i][p])
x_new = x_i + Random(0, 1) × (x_temp − x_i)
after the seed sample has cycled through N_gj sample generation operations, a group of generated samples based on that seed sample is obtained, the generated samples and the seed sample belonging to the same category; after every seed sample has completed sample generation, the resulting generated sample set D_g is merged with the original data set D to obtain the finally required balanced data set D_balance.
8. The power equipment fault detection model of claim 1, wherein the data outcome of the classification neural network model is represented as:
y_pred = softmax(W_deep [O_b, e'_p, e_a] + b_deep)
where y_pred ∈ {0, 1, 2}; O_b is the vector representation of the state sequence features; e'_p is the state feature vector representation of the power equipment; e_a is the vector representation of the prediction target; and W_deep, b_deep are the output layer parameters.
9. The power equipment fault detection model of claim 8, wherein the vector representation O_b of the state sequence features is obtained through the calculation of the GRU module and the attention mechanism module;
the data processing procedure and result of the GRU module are expressed as:
r_t = σ(W_r i_t + U_r h_{t−1} + b_r)
z_t = σ(W_z i_t + U_z h_{t−1} + b_z)
h̃_t = tanh(W_h i_t + U_h (r_t ⊙ h_{t−1}) + b_h)
h_t = (1 − z_t) ⊙ h_{t−1} + z_t ⊙ h̃_t
where σ denotes the Sigmoid function, ⊙ denotes the Hadamard product, and
W_r, W_z, W_h ∈ ℝ^{n_hid×k},  U_r, U_z, U_h ∈ ℝ^{n_hid×n_hid},  b_r, b_z, b_h ∈ ℝ^{n_hid}
where n_hid denotes the GRU network hidden layer size and k denotes the embedded vector size; i_t denotes the input of the GRU, i.e. the t-th vector representation in the history state sequence, i_t = e_b[t]; the output value h_t of the GRU module denotes the t-th hidden state and is a latent expression form of the past state of the power equipment;
the data processing procedure and the result of the attention mechanism module are expressed as follows:
a_t = softmax(F(h_t, e_a))
O_b = Σ_t a_t h_t
wherein a_t denotes the attention score obtained from the attention distribution calculation of the attention mechanism module, and F[·] denotes the attention scoring function.
10. The power equipment fault detection model of claim 8, wherein the state feature vector representation e'_p of the power equipment is obtained through the calculation of the graph attention mechanism module; the data processing result of the graph attention mechanism module is expressed as:
e'_p = [e_p, e'_p1, e'_p2, …, e'_pn, X]
where e_p denotes the embedded vector of the features of the power device, and e'_p1, e'_p2, …, e'_pn are the embedded vectors of the power device nodes associated with the power device whose embedded vector is e_p;
Figure FDA0003486959570000051
X is the coefficient set;
the feature vector of each power device is processed with the graph attention mechanism to obtain the output vector e'_p.
CN202210084475.8A 2022-01-25 2022-01-25 Power equipment fault detection model based on attention mechanism combined with GRU Pending CN114528755A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210084475.8A CN114528755A (en) 2022-01-25 2022-01-25 Power equipment fault detection model based on attention mechanism combined with GRU

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210084475.8A CN114528755A (en) 2022-01-25 2022-01-25 Power equipment fault detection model based on attention mechanism combined with GRU

Publications (1)

Publication Number Publication Date
CN114528755A true CN114528755A (en) 2022-05-24

Family

ID=81621151

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210084475.8A Pending CN114528755A (en) 2022-01-25 2022-01-25 Power equipment fault detection model based on attention mechanism combined with GRU

Country Status (1)

Country Link
CN (1) CN114528755A (en)


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115512460A (en) * 2022-09-29 2022-12-23 北京交通大学 High-speed train axle temperature long-time prediction method based on graph attention model
CN115512460B (en) * 2022-09-29 2024-04-16 北京交通大学 High-speed train shaft temperature long-time prediction method based on graph attention model
CN115952064A (en) * 2023-03-16 2023-04-11 华南理工大学 Multi-component fault interpretation method and device for distributed system
CN115952064B (en) * 2023-03-16 2023-08-18 华南理工大学 Multi-component fault interpretation method and device for distributed system
CN116401515A (en) * 2023-06-07 2023-07-07 吉林大学 Ocean current prediction method for ocean observation data
CN117370790A (en) * 2023-10-13 2024-01-09 江苏智谨创新能源科技有限公司 Automatic fault alarm method and system for photovoltaic power generation assembly
CN117725210A (en) * 2023-11-16 2024-03-19 南京审计大学 Malicious user detection method for social question-answering platform

Similar Documents

Publication Publication Date Title
CN114528755A (en) Power equipment fault detection model based on attention mechanism combined with GRU
CN109949317B (en) Semi-supervised image example segmentation method based on gradual confrontation learning
CN112347859B (en) Method for detecting significance target of optical remote sensing image
CN108985380B (en) Point switch fault identification method based on cluster integration
Jin et al. Video-text as game players: Hierarchical banzhaf interaction for cross-modal representation learning
CN111783540B (en) Method and system for recognizing human body behaviors in video
CN107423747A (en) A kind of conspicuousness object detection method based on depth convolutional network
CN117237559B (en) Digital twin city-oriented three-dimensional model data intelligent analysis method and system
CN115168443A (en) Anomaly detection method and system based on GCN-LSTM and attention mechanism
CN114487673A (en) Power equipment fault detection model based on Transformer and electronic equipment
CN115908908A (en) Remote sensing image gathering type target identification method and device based on graph attention network
Du et al. Convolutional neural network-based data anomaly detection considering class imbalance with limited data
Shajihan et al. CNN based data anomaly detection using multi-channel imagery for structural health monitoring
CN115661480A (en) Image anomaly detection method based on multi-level feature fusion network
CN116071352A (en) Method for generating surface defect image of electric power safety tool
CN114897085A (en) Clustering method based on closed subgraph link prediction and computer equipment
CN116894180B (en) Product manufacturing quality prediction method based on different composition attention network
Yuan et al. A fusion TFDAN-Based framework for rotating machinery fault diagnosis under noisy labels
Li et al. Dual-source gramian angular field method and its application on fault diagnosis of drilling pump fluid end
CN117152504A (en) Space correlation guided prototype distillation small sample classification method
CN115664970A (en) Network abnormal point detection method based on hyperbolic space
CN116383747A (en) Anomaly detection method for generating countermeasure network based on multi-time scale depth convolution
CN113835964B (en) Cloud data center server energy consumption prediction method based on small sample learning
CN113342982B (en) Enterprise industry classification method integrating Roberta and external knowledge base
CN115659135A (en) Anomaly detection method for multi-source heterogeneous industrial sensor data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination