CN116879671A - District fault identification method based on time convolution network and attention mechanism - Google Patents
- Publication number: CN116879671A
- Application number: CN202310636069.2A
- Authority
- CN
- China
- Prior art keywords
- fault
- network
- identification
- attention mechanism
- convolution network
- Prior art date
- Legal status (an assumption, not a legal conclusion): Pending
Classifications
- G01R31/088 — Locating faults in cables, transmission lines, or networks; aspects of digital computing
- G01R31/00 — Arrangements for testing electric properties; arrangements for locating electric faults
- G01R31/085 — Locating faults according to type of conductors, in power transmission or distribution lines, e.g. overhead
- G06F18/214 — Generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/23213 — Non-hierarchical clustering using statistics or function optimisation, with fixed number of clusters, e.g. K-means clustering
- G06F18/24 — Classification techniques
- G06F18/27 — Regression, e.g. linear or logistic regression
- G06N3/045 — Combinations of networks
- G06N3/0464 — Convolutional networks [CNN, ConvNet]
- G06N3/049 — Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
- G06N3/08 — Learning methods
- Y04S10/50 — Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
A transformer-area fault identification method based on a temporal convolutional network and an attention mechanism, belonging to the field of fault identification. The temporal convolutional network is applied to fault identification for the master meter of the transformer area, and faults are identified by extracting high-dimensional features from the time series of area measurement values. Two different fault identification strategies, daily and annual, are adopted to test the validity of the model, and an attention mechanism is introduced into annual-sequence fault identification to strengthen the screening of key information segments. Area monitoring data are analyzed automatically and in batches, the faulty master meter is located directly, interference from other unreasonable factors in the line or area is eliminated, and the economic operation rate of the line is improved. The method can be applied directly to online identification of master-meter faults: it reads area measurement data in real time, rapidly judges whether a fault exists, and provides a reference for scheduling maintenance plans. The method can be widely applied to the operation management and accident identification of transformers in distribution areas.
Description
Technical Field
The invention belongs to the field of fault identification, and in particular relates to a transformer-area fault identification method based on a temporal convolutional network and an attention mechanism.
Background
The master meter of a transformer area links the energy metered on the selling side of the 10 kV line with the energy metered on the supply side of the area, and thus serves as the bridge between the two. An energy anomaly caused by a master-meter fault simultaneously distorts the line-loss calculation at both the 10 kV and area levels, reducing the line-loss reasonableness rate, disturbing the monitoring of area energy, causing the area load rate to be misestimated, and harming power-supply reliability.
Effective identification of master-meter faults is the key to solving these problems.
Because a line is associated with many areas, the energy of a single area is small, its fluctuation is not obvious, and the influencing factors are complex, abnormal area energy is a common problem in 10 kV line-loss governance, and it is difficult to judge accurately whether an anomaly is caused by a master-meter fault. At present, investigation relies mainly on manual analysis and field verification, which has low accuracy and brings obvious labor costs. A more accurate, more automated, and more real-time method for identifying master-meter faults is needed.
Common master-meter faults include collection anomalies, mismatch between the field multiplying ratio and the archived ratio, loss of meter voltage or current, and polarity misconnection of meter voltage or current.
At present there is relatively little research directly on master-meter faults; existing work focuses mainly on analyzing line-loss anomalies within the area.
For example, unsupervised learning methods such as k-means clustering are used to cluster raw line-loss data, so that the fault causes and degrees of line-loss abnormality corresponding to different clusters can be analyzed to support overhaul decisions. Alternatively, supervised learning algorithms such as neural networks classify or regress labeled line-loss data to judge line-loss faults.
However, the extent to which these existing methods mine and utilize the available data still needs improvement, specifically:
1) Spatio-temporal correlation of neighboring-area data. Adjacent areas are highly correlated in electrical characteristics, climate, and industrial structure, and mutual-cancellation effects may arise from mismatches in the transformer-user correspondence. Analyzing the spatio-temporal correlation between adjacent-area data facilitates the diagnosis of master-meter faults;
2) Correlation among multiple measured variables such as voltage, current, power, and three-phase unbalance rate. Traditional identification methods usually analyze a single variable and do not fully exploit the correlation between variables to improve fault identification accuracy;
3) Focusing on and extracting key segments from long-sequence data. Area data records are usually kept relatively intact and can span several years, yet within this rich history the abnormal data may exist only in a few short segments.
How to automatically extract and utilize the fault-relevant information hidden in a large amount of normal data, while keeping the model concise and efficient, remains unsolved.
Disclosure of Invention
The technical problem solved by the invention is to provide a transformer-area fault identification method based on a temporal convolutional network and an attention mechanism. Using deep-learning tools, the method analyzes area monitoring data automatically and in batches, directly locates the faulty master meter, and eliminates interference from other unreasonable factors in the line or area, thereby improving the economic operation rate of the line. Specifically, the method identifies master-meter faults with a temporal convolutional network (Temporal Convolutional Nets, TCN), which can efficiently extract high-dimensional features from long sequences for fault identification. Further, an attention mechanism (Attention) is introduced into annual-sequence fault identification to strengthen the screening of key information segments. Once trained, the model can be applied directly to online identification of master-meter faults, reading area measurement data in real time, rapidly judging whether a fault exists, and providing a reference for scheduling maintenance plans.
The technical scheme of the invention is as follows: a transformer-area fault identification method based on a temporal convolutional network and an attention mechanism, characterized in that:
1) Applying the temporal convolutional network to fault identification of the area master meter;
2) Enlarging the receptive field of the network through the temporal convolutional structure while reducing network complexity;
3) Identifying faults by extracting high-dimensional features from the time series of area measurement values;
4) Adopting two different fault identification strategies, daily and annual, to test the validity of the model;
5) Introducing an attention mechanism into annual-sequence fault identification to strengthen the screening of key information segments;
6) Analyzing area monitoring data automatically and in batches, directly locating the faulty master meter, eliminating interference from other unreasonable factors in the line or area, and improving the economic operation rate of the line;
7) After training, applying the model directly to online identification of master-meter faults, reading area measurement data in real time, rapidly judging whether a fault exists, and providing a reference for scheduling maintenance plans.
Specifically, the temporal convolutional network is essentially a fully convolutional network with a dilation structure. Let the convolution kernel be F = [f_0, f_1, …, f_{M−1}], where M is the kernel size. Then the dilated convolution of element x_m in the input matrix X can be expressed as

F(x_m) = Σ_{i=0}^{M−1} f_i · x_{m−d·i}

where d is the dilation rate and m − d·i points to an earlier element of the sequence. When d = 1, the network degenerates into a conventional convolutional network; when d > 1, the dilated convolution skips (d−1)/d of the elements in the previous layer and attends only to the remaining 1/d. This property significantly enlarges the receptive field of the network while reducing its complexity.
Further, the receptive field of the network is calculated as

R = 1 + (K − 1) · Σ_{i=0}^{N_stack−1} d_i

where K is the size of the convolution kernel, d_i is the dilation rate of the i-th layer, and N_stack is the number of stacked layers.
Through this network, the high-dimensional features in the input data matrix are extracted and compressed to the output layer and further used for fault identification.
Specifically, daily fault identification judges whether the transformer is abnormal on a given day by reading the daily measurement values. For daily fault identification, a training sample set is first established, comprising normal samples and abnormal samples, each sample being a 96 × 15 data matrix; normal samples are labeled 0, and abnormal samples are labeled 1.
Further, for daily fault identification, the raw data are first preprocessed: for each transformer, days in which more than 10% of the data are missing are deleted, and days with less than 10% missing are completed, using correlation analysis or linear extrapolation. On this basis, daily samples with clear abnormality labels are further screened out; meanwhile, the required daily samples are randomly selected from normally operating areas to jointly form the training set.
Furthermore, the attention mechanism is not adopted in daily-anomaly identification; only the classical temporal convolutional network structure is used.
Specifically, annual fault identification reads the measurement values of a whole year to determine whether the transformer has a fault in that year. In annual fault identification, the model input is the annual measurement data matrix, a typical long time series.
Further, in annual fault identification, the model input is the annual measurement data matrix of size [396 × 96, 15]. Similar to daily-anomaly identification, 70 areas in the target year with clear fault labels and good data conditions are first screened out and labeled 1; meanwhile, 100 normally operating areas are randomly selected and labeled 0 to jointly form the training set. Missing data are uniformly filled with 0. Similarly, the data set of 170 samples is divided into a training set, a test set, and a verification set. For long-sequence identification, an attention mechanism is introduced to screen out the key fault-information segments.
Specifically, the multi-head attention mechanism is calculated as follows:

MultiHead(Q, K, V) = Concat(head_1, head_2, …, head_n) W^O
head_i = Attention(Q W_i^Q, K W_i^K, V W_i^V)

where W_i^Q, W_i^K, W_i^V and W^O are parameter matrices learned during the mapping. Based on the attention mechanism, the key information segments in the input sequence can be screened out effectively, unaffected by the distance between segments.
For long-sequence identification, the attention layer is added to the temporal convolutional network model as an intermediate layer, so that long sequences can be processed.
The transformer-area fault identification method based on a temporal convolutional network and an attention mechanism combines the two to efficiently extract information from long time series while keeping the scale of the network parameters within an acceptable range to ensure computational efficiency. On this basis, the attention mechanism is further introduced as an intermediate layer of the temporal convolutional network, and the information segments most valuable to the result are screened out by computing and comparing the correlation between different elements of the input sequence. By combining the temporal convolutional network and the attention mechanism, the method reads information from long time series efficiently and accurately for fault identification.
Compared with the prior art, the invention has the following advantages:
Aiming at the defect that existing methods cannot balance model efficiency and effectiveness when processing long time series, the technical scheme of the invention combines a temporal convolutional network and an attention mechanism. The temporal convolutional network efficiently extracts information from long time series and keeps the scale of the network parameters within an acceptable range to ensure computational efficiency. On this basis, the attention mechanism is further introduced as an intermediate layer of the temporal convolutional network; its essence is an automatic weighting of the input sequence, screening out the information segments most valuable to the result by computing and comparing the correlation between different elements of the input sequence. These two characteristics ensure that the technical scheme reads information from long time series efficiently and accurately in the service of fault identification.
Drawings
FIG. 1 is a schematic block diagram of a process flow of the present invention;
FIG. 2 is a schematic diagram of the training process of TCN and CNN-LSTM models.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
As shown in fig. 1, in order to automatically extract and analyze effective fault-related information from a large amount of normal data while keeping the model concise and efficient, the technical scheme of the invention provides a transformer-area fault identification method based on a temporal convolutional network and an attention mechanism, characterized in that:
1) Applying the temporal convolutional network to fault identification of the area master meter;
2) Enlarging the receptive field of the network through the temporal convolutional structure while reducing network complexity;
3) Identifying faults by extracting high-dimensional features from the time series of area measurement values;
4) Adopting two different fault identification strategies, daily and annual, to test the validity of the model;
5) Introducing an attention mechanism into annual-sequence fault identification to strengthen the screening of key information segments;
6) Analyzing area monitoring data automatically and in batches, directly locating the faulty master meter, eliminating interference from other unreasonable factors in the line or area, and improving the economic operation rate of the line;
7) After training, applying the model directly to online identification of master-meter faults, reading area measurement data in real time, rapidly judging whether a fault exists, and providing a reference for scheduling maintenance plans.
1. Temporal convolutional network:
The temporal convolutional network structure is essentially a fully convolutional network with a dilation structure.
Let the convolution kernel be F = [f_0, f_1, …, f_{M−1}], where M is the kernel size. Then the dilated convolution of element x_m in the input matrix X can be expressed as

F(x_m) = Σ_{i=0}^{M−1} f_i · x_{m−d·i}

where d is the dilation rate and m − d·i points to an earlier element of the sequence.
When d = 1, the network degenerates into a conventional convolutional network.
When d > 1, the dilated convolution skips (d−1)/d of the elements in the previous layer and attends only to the remaining 1/d.
This property significantly enlarges the receptive field of the network while reducing its complexity.
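The dilated convolution defined above can be sketched in a few lines of plain Python; the kernel, input series, and dilation values below are illustrative examples, not taken from the patent:

```python
def dilated_conv(x, f, d):
    """Causal dilated convolution: y[m] = sum_i f[i] * x[m - d*i].

    Indices before the start of the sequence are treated as zero padding.
    """
    M = len(f)
    y = []
    for m in range(len(x)):
        s = 0.0
        for i in range(M):
            j = m - d * i          # pointer to an earlier element, stride d
            if j >= 0:
                s += f[i] * x[j]
        y.append(s)
    return y

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
f = [0.5, 0.5]                     # averaging kernel of size M = 2

y1 = dilated_conv(x, f, d=1)       # ordinary causal convolution
y2 = dilated_conv(x, f, d=2)       # skips every other past element
```

With d = 2, each output mixes x_m with x_{m−2} rather than x_{m−1}, which is exactly the "skip (d−1)/d of the previous layer" behavior described in the text.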
The receptive field of the model can be calculated using the following formula:

R = 1 + (K − 1) · Σ_{i=0}^{N_stack−1} d_i

where K is the size of the convolution kernel, d_i is the dilation rate of the i-th layer, and N_stack is the number of stacked layers.
According to this formula, the receptive field of the network can be increased by enlarging the convolution kernel, increasing the number of stacks, or raising the dilation rate.
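Assuming each dilated layer contributes (K − 1)·d_i to the receptive field (a standard TCN accounting; the specific kernel sizes and depths below are illustrative, not the patent's settings), the calculation can be sketched as:

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of a stack of causal dilated convolution layers.

    A layer with kernel size K and dilation d widens the receptive
    field of one output step by (K - 1) * d input steps.
    """
    return 1 + (kernel_size - 1) * sum(dilations)

def tcn_receptive_field(kernel_size, n_stack):
    """Common TCN convention: dilation doubles per layer, d_i = 2**i."""
    return receptive_field(kernel_size, [2 ** i for i in range(n_stack)])

# Example: kernel size 3, six layers with dilations 1, 2, 4, 8, 16, 32
r = tcn_receptive_field(3, 6)      # 1 + 2 * (2**6 - 1) = 127
```

This illustrates the three levers named in the text: a larger kernel, more stacked layers, or higher dilation rates all grow R.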
Through this network, the high-dimensional features in the input data matrix can be extracted and compressed to the output layer and further used for fault identification.
Compared with recurrent structures such as the traditional recurrent neural network (Recurrent Neural Network, RNN) or the long short-term memory network (Long Short-Term Memory, LSTM), the TCN has no sequential recurrence, so its training process can be highly parallelized, making it more efficient.
At present, TCNs have been successfully applied to prediction problems such as PM2.5 and load forecasting, but have not yet been applied to transformer fault identification.
2. Attention mechanism:
The attention mechanism is a technique for extracting key information segments from an input sequence.
The attention function can be described as a mapping from a query (Query, Q) and key-value pairs (Key, K; Value, V) to an output, as shown in the following equation:

Attention(Q, K, V) = softmax(QK^T / α) V

where Q, K, and V are obtained through trainable linear projections and α is a scaling factor.
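A minimal pure-Python sketch of the scaled dot-product attention above, taking the scaling factor α = √d_k (a common choice; the toy Q, K, V values are illustrative, not from the patent):

```python
import math

def matmul(A, B):
    """Naive matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def softmax(row):
    m = max(row)                       # subtract max for numerical stability
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(K[0])
    scores = matmul(Q, transpose(K))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, V)

# One query attending over two key-value positions
Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[10.0, 0.0], [0.0, 10.0]]
out = attention(Q, K, V)
```

The query is more similar to the first key, so the output is pulled toward the first value row: the attention weights act as the "automatic weighting of the input sequence" described later in the text.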
On the basis of a single attention function, it has been found that applying multiple attention functions in parallel helps improve overall performance; this is the multi-head attention mechanism.
The multi-head attention mechanism is calculated as follows:

MultiHead(Q, K, V) = Concat(head_1, head_2, …, head_n) W^O
head_i = Attention(Q W_i^Q, K W_i^K, V W_i^V)

where W_i^Q, W_i^K, W_i^V and W^O are parameter matrices learned during the mapping.
Based on the attention mechanism, the key information segments in the input sequence can be screened out effectively, unaffected by the distance between segments.
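The multi-head computation can be sketched as below. The two-head projection matrices here are hand-picked identity-like slices rather than learned parameters, purely to show the project-attend-concatenate-project structure:

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def softmax(row):
    m = max(row)
    e = [math.exp(v - m) for v in row]
    return [v / sum(e) for v in e]

def attention(Q, K, V):
    d_k = len(K[0])
    K_T = [list(c) for c in zip(*K)]
    weights = [softmax([s / math.sqrt(d_k) for s in row])
               for row in matmul(Q, K_T)]
    return matmul(weights, V)

def multi_head(Q, K, V, WQ, WK, WV, WO):
    """MultiHead(Q,K,V) = Concat(head_1..head_n) W_O,
    head_i = Attention(Q WQ[i], K WK[i], V WV[i])."""
    heads = [attention(matmul(Q, WQ[i]), matmul(K, WK[i]), matmul(V, WV[i]))
             for i in range(len(WQ))]
    concat = [sum((h[r] for h in heads), []) for r in range(len(Q))]
    return matmul(concat, WO)

# Two heads over a 2-dim model: each head projects one coordinate
Q = [[1.0, 0.0], [0.0, 1.0]]
WQ = WK = WV = [[[1.0], [0.0]], [[0.0], [1.0]]]
WO = [[1.0, 0.0], [0.0, 1.0]]     # identity output projection
out = multi_head(Q, Q, Q, WQ, WK, WV, WO)
```

Each head sees a different low-dimensional view of the same sequence; concatenating the head outputs and mixing them through W^O gives the combined representation.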
For long-sequence identification, the attention layer can be added to the TCN model as an intermediate layer, so that long sequences can be processed.
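How such a composition might look is sketched below in deliberately simplified scalar form: a stack of dilated convolutions with doubling dilation, then attention-style softmax weighting over time steps, then a sigmoid fault score. This is a structural illustration only; the kernel, depth, series, and single-channel simplification are invented for the example and are not the patent's model:

```python
import math

def dilated_conv(x, f, d):
    """Causal dilated convolution over a single channel."""
    return [sum(f[i] * x[m - d * i] for i in range(len(f)) if m - d * i >= 0)
            for m in range(len(x))]

def softmax(z):
    mx = max(z)
    e = [math.exp(v - mx) for v in z]
    return [v / sum(e) for v in e]

def tcn_attention_score(x, kernel, n_stack):
    """Sketch: stacked dilated convs (dilation 2**i), attention-style
    weighted pooling over time, then a sigmoid fault score."""
    h = x
    for i in range(n_stack):
        h = dilated_conv(h, kernel, 2 ** i)
    w = softmax(h)                          # weights over time steps
    pooled = sum(wi * hi for wi, hi in zip(w, h))
    return 1.0 / (1.0 + math.exp(-pooled))  # fault score in (0, 1)

# A mostly flat series with one short anomalous burst near the end
series = [0.0] * 90 + [5.0] * 6
score = tcn_attention_score(series, [0.5, 0.5], n_stack=3)
```

Even though the anomaly occupies only 6 of 96 steps, the softmax weighting concentrates on those steps, so the score rises well above the all-normal baseline, mirroring the "key information segment" screening role of the attention layer.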
The technical scheme adopts two different fault identification strategies to verify the validity of the model: daily fault identification and annual fault identification.
Daily fault identification judges whether the transformer is abnormal on a given day by reading the daily measurement values; annual fault identification reads the measurement values of a whole year to determine whether the transformer has a fault in that year.
For daily identification, a training sample set is first established, comprising normal samples (labeled 0) and abnormal samples (labeled 1), each sample being a 96 × 15 data matrix.
Statistical analysis shows that the data completeness of the 10 kV areas in a certain city is about 80%, with 20% missing; the missing data in daily measurement samples fall mainly into two situations:
1) Some of the measurement values at a given time are missing;
2) All of the measurement values at a given time are missing.
The raw data are first preprocessed: for each transformer, days in which more than 10% of the data are missing are deleted, and days with less than 10% missing are completed.
Where only some of the values at a time are missing, completion uses correlation analysis.
Where all values at a time are missing, completion uses linear extrapolation.
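The preprocessing rules above might be sketched as follows; the two-point linear-extrapolation formula and the fallback rules at sequence edges are this sketch's assumptions, and correlation-based completion for partially missing rows is omitted:

```python
def fill_linear(day, missing=None):
    """Fill missing points by linear extrapolation from the two
    preceding values (with simple fallbacks at the sequence start)."""
    out = list(day)
    for i, v in enumerate(out):
        if v is missing:
            if i >= 2 and out[i - 1] is not missing and out[i - 2] is not missing:
                out[i] = 2 * out[i - 1] - out[i - 2]   # linear extrapolation
            elif i >= 1 and out[i - 1] is not missing:
                out[i] = out[i - 1]                    # hold last value
            else:
                out[i] = 0.0                           # nothing earlier to use
    return out

def preprocess_days(days, missing=None, max_missing_ratio=0.10):
    """Drop days with more than 10% missing points; complete the rest."""
    kept = []
    for day in days:
        n_missing = sum(1 for v in day if v is missing)
        if n_missing / len(day) > max_missing_ratio:
            continue                                   # delete this day
        kept.append(fill_linear(day, missing))
    return kept

# One 96-point day with a single gap, one with too many gaps
day_ok = [float(i) for i in range(96)]
day_ok[10] = None
day_bad = [None] * 20 + [0.0] * 76
clean = preprocess_days([day_ok, day_bad])
```

The heavily incomplete day is discarded outright, while the single gap in the good day is filled by extending the local linear trend.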
On this basis, a total of 364 daily samples with clear anomaly labels were further screened out, as shown in table 2. Meanwhile, 500 daily samples were randomly selected from normally operating areas to jointly form the training set.
The 864 training samples were divided into a training set (70%, 605), a test set (20%, 173), and a validation set (10%, 65).
The model is trained with the training and validation sets, and its performance is finally tested on the test set.
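The 70/20/10 partition can be sketched as below; the shuffle seed and the rounding convention are assumptions of this sketch, so the exact subset sizes may differ slightly from the counts reported in the text:

```python
import random

def split_dataset(samples, ratios=(0.7, 0.2, 0.1), seed=42):
    """Shuffle and split into training / test / validation subsets."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)        # deterministic shuffle
    n = len(samples)
    n_train = int(ratios[0] * n)
    n_test = int(ratios[1] * n)
    train = [samples[i] for i in idx[:n_train]]
    test = [samples[i] for i in idx[n_train:n_train + n_test]]
    val = [samples[i] for i in idx[n_train + n_test:]]
    return train, test, val

# 864 daily samples, as in the text
train, test, val = split_dataset(list(range(864)))
```

The same helper applies unchanged to the 170-sample annual data set described later.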
It should be noted that, because the sequence is short, the attention mechanism is not adopted in daily-anomaly identification; only the classical TCN structure is used.
The TCN model parameters are set as shown in the table below. The receptive field is 161, greater than the daily-sequence length of 96, so the information contained in the daily sequence can be extracted completely.
Table 1: TCN daily fault identification model parameter settings
FIG. 2 shows the change of the loss function during training of the TCN model and of a convolutional long short-term memory network model (CNN-LSTM) of similar overall parameter scale.
It can be seen that the TCN model converges faster and reaches higher test accuracy than the CNN-LSTM model.
Training of the TCN model takes about 3 minutes and becomes substantially stable after 40 iterations.
Applied to the test set, the trained TCN model achieves a comprehensive identification accuracy of 93.1%.
In the technical scheme of the invention, in the annual fault identification, the model input is an annual measurement data matrix, and the size is [396 multiplied by 96,15], which is a typical long-time sequence.
Similar to daily anomaly identification, 70 districts with clear fault labels and good data quality in the target year are first screened out and labeled 1. Meanwhile, 100 normally operating districts are randomly selected and labeled 0, forming the training data together.
Missing data therein are uniformly filled with 0. Similarly, the data set of 170 samples is split into a training set (70%, 119), a test set (20%, 34) and a validation set (10%, 17). For long-sequence identification, the critical fault information fragments need to be screened out, so an attention mechanism is introduced.
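Assembling the annual input matrix with zero-filled gaps, as described above, might look like the following sketch; the function name and the per-day dictionary format are assumptions for illustration.

```python
import numpy as np

def build_annual_input(daily_frames):
    """Stack per-day measurement frames (each 96 x 15) into one
    [396*96, 15] annual matrix, filling missing readings with 0."""
    days, points, channels = 396, 96, 15
    matrix = np.zeros((days * points, channels))
    for day, frame in daily_frames.items():    # day index -> (96, 15) array
        frame = np.nan_to_num(frame, nan=0.0)  # missing values -> 0
        matrix[day * points:(day + 1) * points, :] = frame
    return matrix
```

Days absent from the input dictionary remain all-zero rows, consistent with the uniform zero-filling of missing data.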
The model parameters are shown in Table 2.
Table 2. TCN annual fault identification model parameter settings
After training, the model is applied to the test set to verify its recognition performance.
Before the attention mechanism is introduced, the comprehensive identification accuracy is 67.6%; after it is introduced, the accuracy rises to 85.3%. The improvement is significant, particularly for the identification of fault samples.
Based on a case study of an actual 10 kV district in a certain city, the proposed algorithm achieves a daily fault identification accuracy of 93.1% and an annual fault identification accuracy of 85.3%.
According to the technical scheme, the deep learning tool enables automatic, batch analysis of district monitoring data, so that the master meters of faulty districts can be located directly and the interference of other confounding factors in the line or district can be excluded, improving the economic operation rate of the line.
The technical scheme of the invention provides a district master meter fault identification method based on the temporal convolutional network (Temporal Convolutional Network, TCN). The TCN has a dilated convolution structure and can efficiently extract high-dimensional features from long sequences for fault identification. Further, an attention mechanism (Attention) is introduced in fault identification on annual sequences to strengthen the screening of key information segments. Once trained, the model can be applied directly to online identification of district master meter faults, reading district measurement data in real time and quickly judging whether a fault exists, providing a reference for arranging maintenance plans.
Aiming at the defect that existing methods cannot balance model efficiency and effect when processing long time sequences, the technical scheme of this patent combines a temporal convolutional network with an attention mechanism. The temporal convolutional network efficiently extracts information from long time sequences while keeping the number of network parameters within an acceptable range to ensure computational efficiency. On this basis, an attention mechanism is further introduced as an intermediate layer of the temporal convolutional network. The essence of the attention mechanism is an automatic weighting of the input sequence: by computing and comparing the correlations between different elements of the input sequence, the information fragments most valuable to the result are screened out. Together, these two features ensure that information can be read from long time sequences efficiently and accurately in the service of fault identification.
The invention can be widely applied to the operation management and accident identification of district transformers.
Claims (10)
1. A district fault identification method based on a temporal convolutional network and an attention mechanism, characterized by comprising the following steps:
1) applying the temporal convolutional network to fault identification of the district master meter;
2) enlarging the receptive field of the network through the temporal convolutional network while reducing network complexity;
3) realizing fault identification by extracting high-dimensional features from the time sequence of district measurement values;
4) testing the validity of the model with two different fault identification strategies, daily fault identification and annual fault identification;
5) introducing an attention mechanism in fault identification on annual sequences to strengthen the screening of key information fragments;
6) analyzing district monitoring data automatically and in batches, directly locating the master meters of faulty districts, and excluding the interference of other confounding factors in the line or district, thereby improving the economic operation rate of the line;
7) after training, applying the model directly to online identification of district master meter faults, reading district measurement data in real time, quickly judging whether a fault exists, and providing a reference for arranging maintenance plans.
2. The district fault identification method based on a temporal convolutional network and an attention mechanism according to claim 1, characterized in that the temporal convolutional network is a fully convolutional network with a dilated structure;
let the convolution kernel be denoted F = [f_0, f_1, …, f_{M-1}], where M is the length of the convolution kernel;
then the dilated convolution operation F(x_m) on element x_m of the input matrix X can be expressed as
F(x_m) = Σ_{i=0}^{M-1} f_i · x_{m-d·i}
wherein d is the dilation rate, and m − d·i is the pointer from x_m to a preceding element;
when d = 1, the network degenerates into a conventional convolutional network;
when d > 1, the dilated convolution operation skips a fraction (d−1)/d of the elements in the previous layer and attends only to the remaining 1/d of them; this feature significantly enlarges the receptive field of the network while reducing its complexity.
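A minimal sketch of the dilated convolution in claim 2, assuming elements before the start of the sequence are treated as zero (the patent does not specify the padding convention):

```python
import numpy as np

def dilated_causal_conv(x, f, d):
    """Dilated causal convolution: y[m] = sum_i f[i] * x[m - d*i],
    treating elements before the start of x as zero."""
    M = len(f)
    y = np.zeros_like(x, dtype=float)
    for m in range(len(x)):
        for i in range(M):
            j = m - d * i                 # pointer to a preceding element
            if j >= 0:                    # out-of-range taps contribute zero
                y[m] += f[i] * x[j]
    return y

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
f = np.array([1.0, 1.0])                  # kernel of length M = 2
y1 = dilated_causal_conv(x, f, d=1)       # ordinary causal convolution
y2 = dilated_causal_conv(x, f, d=2)       # skips every other element
```

With d = 2 each output mixes x[m] with x[m−2] rather than x[m−1], which is how stacking dilated layers enlarges the receptive field without adding parameters.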
3. The district fault identification method based on a temporal convolutional network and an attention mechanism according to claim 2, characterized in that the receptive field of the network is calculated as
R = 1 + 2(K − 1)(2^{N_stack} − 1)
where K is the size of the convolution kernel and N_stack is the number of stacked levels of the network;
through this network, the high-dimensional features in the input data matrix are extracted, compressed to the output layer, and further used for fault identification.
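The receptive-field expression of claim 3 can be evaluated as below, under the standard TCN arrangement (dilation doubling at each level, two convolutions of size K per level), which gives R = 1 + 2(K − 1)(2^{N_stack} − 1); this arrangement and the parameter values in the example are assumptions, since the exact layer structure is not spelled out in the text.

```python
def receptive_field(K, n_stack):
    """Receptive field R = 1 + 2*(K - 1)*(2**n_stack - 1) of a TCN
    whose dilation doubles at each of n_stack levels, with two
    convolutions of kernel size K per level."""
    return 1 + 2 * (K - 1) * (2 ** n_stack - 1)

# e.g. kernel size 3 with 5 levels already covers the 96-point daily sequence
rf = receptive_field(3, 5)
```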
4. The district fault identification method based on a temporal convolutional network and an attention mechanism according to claim 1, characterized in that daily fault identification reads in the measurement values of one day to determine whether the transformer is abnormal on that day;
in daily fault identification, a training sample set is first established, containing normal samples and abnormal samples, each sample being a 96×15 data matrix;
wherein normal samples are labeled 0 and abnormal samples are labeled 1.
5. The district fault identification method based on a temporal convolutional network and an attention mechanism according to claim 4, characterized in that the daily fault identification method first preprocesses the raw data: days in which a transformer's missing data exceed 10% are deleted, and days in which the missing data are below 10% are completed, the completion being performed by correlation analysis or linear extrapolation; on this basis, daily samples with clear anomaly labels are further screened out; meanwhile, the required daily samples are randomly selected from normally operating districts to form the training set together.
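A sketch of the preprocessing in claim 5; linear interpolation stands in here for the correlation-analysis or linear-extrapolation completion the claim mentions, and the 96×15 day format follows claim 4.

```python
import numpy as np

def preprocess_days(days):
    """Drop days whose fraction of missing points exceeds 10%;
    fill smaller gaps by linear interpolation along the time axis."""
    kept = []
    for day in days:                      # each day: (96, 15) array with NaNs
        if np.isnan(day).mean() > 0.10:
            continue                      # too sparse: discard the whole day
        filled = day.copy()
        t = np.arange(day.shape[0])
        for ch in range(day.shape[1]):
            col = filled[:, ch]
            mask = np.isnan(col)
            if mask.any() and not mask.all():
                col[mask] = np.interp(t[mask], t[~mask], col[~mask])
        kept.append(filled)
    return kept
```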
6. The district fault identification method based on a temporal convolutional network and an attention mechanism according to claim 1, characterized in that the attention mechanism is not used in daily anomaly identification, where only the classical temporal convolutional network structure is used.
7. The district fault identification method based on a temporal convolutional network and an attention mechanism according to claim 1, characterized in that annual fault identification reads in the measurement values of the whole year to determine whether the transformer has a fault in that year;
in annual fault identification, the model input is an annual measurement data matrix of size [396×96, 15], a typical long time sequence.
8. The district fault identification method based on a temporal convolutional network and an attention mechanism according to claim 7, characterized in that in annual fault identification the model input is an annual measurement data matrix of size [396×96, 15];
similar to daily anomaly identification, 70 districts with clear fault labels and good data quality in the target year are first screened out and labeled 1;
meanwhile, 100 normally operating districts are randomly selected and labeled 0, forming the training data together;
missing data therein are uniformly filled with 0;
similarly, the data set of 170 samples is divided into a training set, a test set and a validation set;
for long-sequence identification, an attention mechanism is introduced to screen out the key fault information fragments.
9. The district fault identification method based on a temporal convolutional network and an attention mechanism according to claim 1, characterized in that the multi-head attention mechanism is calculated as follows:
MultiHead(Q, K, V) = concat(head_1, head_2, …, head_n) W^O
head_i = Attention(Q·W_i^Q, K·W_i^K, V·W_i^V)
wherein W_i^Q, W_i^K, W_i^V and W^O are learnable parameter matrices of the mapping process;
based on the attention mechanism, the key information fragments in the input sequence can be effectively screened out, unaffected by the distance between the fragments;
for long-sequence identification, the attention layer is added to the temporal convolutional network model as an intermediate layer so as to process long sequences.
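A minimal numpy sketch of the multi-head attention calculation of claim 9 (scaled dot-product attention per head, concatenation, output projection); the head dimensions and random projection matrices in the test usage are illustrative, not values from the patent.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # stabilized softmax
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_attention(Q, K, V, Wq, Wk, Wv, Wo):
    """MultiHead(Q,K,V) = concat(head_1..head_n) @ Wo, where
    head_i = softmax(Q Wq_i (K Wk_i)^T / sqrt(d_k)) V Wv_i."""
    heads = []
    for wq, wk, wv in zip(Wq, Wk, Wv):     # one projection triple per head
        q, k, v = Q @ wq, K @ wk, V @ wv
        scores = q @ k.T / np.sqrt(q.shape[-1])
        heads.append(softmax(scores) @ v)  # weight values by relevance
    return np.concatenate(heads, axis=-1) @ Wo
```

Because the attention weights depend only on pairwise correlations, a key fragment influences the output regardless of how far it is from the query position, which is what makes the layer useful inside the TCN for long sequences.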
10. The district fault identification method based on a temporal convolutional network and an attention mechanism according to claim 1, characterized in that the method combines the temporal convolutional network and the attention mechanism to efficiently extract information from long time sequences while keeping the number of network parameters within an acceptable range to ensure computational efficiency;
on this basis, an attention mechanism is further introduced as an intermediate layer of the temporal convolutional network; by computing and comparing the correlations between different elements of the input sequence, the information fragments most valuable to the result are screened out;
by combining the temporal convolutional network and the attention mechanism, the method ensures that information is read from long time sequences efficiently and accurately and used for fault identification.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310636069.2A CN116879671A (en) | 2023-05-31 | 2023-05-31 | District fault identification method based on time convolution network and attention mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116879671A true CN116879671A (en) | 2023-10-13 |
Family
ID=88255704
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |