CN117409890A - Transformer fault self-identification method and system based on bidirectional long short-term memory - Google Patents

Transformer fault self-identification method and system based on bidirectional long short-term memory

Info

Publication number
CN117409890A
Authority
CN
China
Prior art keywords
transformer
data
model
fault
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311327677.1A
Other languages
Chinese (zh)
Inventor
满建越
陈文选
张光儒
梁琛
马振祺
崔昕玮
马喜平
张建伟
张家午
陈杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuwei Power Supply Co Of State Grid Gansu Electric Power Co
Original Assignee
Wuwei Power Supply Co Of State Grid Gansu Electric Power Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuwei Power Supply Co Of State Grid Gansu Electric Power Co filed Critical Wuwei Power Supply Co Of State Grid Gansu Electric Power Co
Priority to CN202311327677.1A priority Critical patent/CN117409890A/en
Publication of CN117409890A publication Critical patent/CN117409890A/en
Pending legal-status Critical Current


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0677Localisation of faults
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/26Oils; Viscous liquids; Paints; Inks
    • G01N33/28Oils, i.e. hydrocarbon liquids
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01RMEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
    • G01R31/00Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01RMEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
    • G01R31/00Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
    • G01R31/50Testing of electric apparatus, lines, cables or components for short-circuits, continuity, leakage current or incorrect line connections
    • G01R31/62Testing of transformers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/2431Multiple classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • G06N3/0442Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/0985Hyperparameter optimisation; Meta-learning; Learning-to-learn
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16CCOMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/20Identification of molecular entities, parts thereof or of chemical compositions
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16CCOMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70Machine learning, data mining or chemometrics

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Chemical & Material Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Power Engineering (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Oil, Petroleum & Natural Gas (AREA)
  • Food Science & Technology (AREA)
  • Medicinal Chemistry (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a transformer fault self-identification method and system based on bidirectional long short-term memory (Bi-LSTM), relates to the technical field of transformer fault diagnosis, and aims to solve the problems of strong data dependence, insufficient time-series data processing capability, poor model generalization, and high computational resource requirements in existing transformer fault diagnosis. The method comprises the following steps: collecting transformer fault data; preprocessing the data; dividing the preprocessed data into a training set and a test set; constructing a deep learning framework based on a Bi-LSTM model, converting the framework into a meta-learner, performing meta-learning optimization on the model with the training set, and fine-tuning the model with a small amount of labeled data; collecting the current real-time position parameters of the transformer and the dissolved-gas-in-oil parameters, processing them and inputting them into the trained model, which predicts and diagnoses the transformer fault type; and locating the fault area according to the current transformer position parameters and the fault type.

Description

Transformer fault self-identification method and system based on bidirectional long short-term memory
Technical Field
The invention relates to the technical field of transformer fault diagnosis, in particular to a transformer fault self-identification method and system based on bidirectional long short-term memory.
Background
With the rapid development of modern industry and society, the safe and stable operation of power systems has become a critical issue. As the transformer is a core device of the power system, its fault diagnosis and prediction are particularly important. Traditional transformer fault diagnosis methods are mainly based on physical and chemical analysis, such as dissolved gas analysis (DGA).
In recent years, with the successful application of data-driven methods in many fields, machine learning and deep learning techniques have also begun to be applied to the diagnosis of transformer faults. Common methods include support vector machines, decision trees, neural networks, and the like.
For example, Chinese patent publication No. CN106443310B discloses a transformer fault detection method based on an SOM neural network, which includes: selecting a transformer as the test object and collecting transformer vibration signals in different states as sample data; decomposing the signals by ensemble empirical mode decomposition within the Hilbert-Huang transform to extract feature vectors; inputting the feature vectors into the SOM neural network; calculating the distances between the mapping-layer weights and the input vector; adjusting the weights of the winning neuron and its neighboring neurons; judging whether a preset condition is reached to complete SOM neural network training and obtain a test sample; and inputting the test sample, so that the transformer can be monitored online according to the fault type corresponding to the network output for the test sample.
The prior art has some obvious drawbacks and disadvantages, for example: strong data dependence: many existing machine learning methods require retraining when faced with different data sets or different operating conditions, resulting in significant computational and time overhead; insufficient ability to process time-series data: the operating states and faults of transformers are often time-dependent and require analysis of long time-series data, and traditional machine learning methods have difficulty capturing these temporal dependencies; poor model generalization: in many cases, the trained models have limited predictive power on new, unseen data; high computational resource requirements: training deep learning models, especially on large data sets, requires a significant amount of computational resources and time.
Disclosure of Invention
The invention aims to provide a transformer fault self-identification method and system based on bidirectional long short-term memory, so as to solve the problems, mentioned in the background art, of strong data dependence, insufficient time-series data processing capability, poor model generalization and high computational resource requirements in existing transformer fault diagnosis.
In order to achieve the above purpose, the present invention provides the following technical solution: a transformer fault self-identification method based on bidirectional long short-term memory, comprising the following specific steps:
s1: collecting transformer fault data, namely collecting dissolved gas parameters in oil and corresponding transformer fault conditions when the transformer fails from a historical operation database;
s2: preprocessing data, namely segmenting the collected data by using a sliding window technology, carrying out normalization processing, and extracting fault characteristic information to obtain an input matrix;
s3: dividing the preprocessed data into a training set and a testing set for training and testing the deep learning framework;
s4: constructing a deep learning framework based on a bidirectional long short-term memory (Bi-LSTM) model, converting the deep learning framework into a meta-learner, performing meta-learning optimization on the model using the training set, and fine-tuning the model with a small amount of labeled data;
s5: collecting the current real-time position parameters of the transformer and the dissolved-gas-in-oil parameters, segmenting the data with the sliding window technique, normalizing them, and inputting them into the trained model; the model then performs prediction and calculation, diagnoses the transformer fault type, and outputs it.
S6: and locating the fault area in time according to the current transformer position parameters and the fault type.
Preferably, collecting the dissolved gas parameters and the corresponding transformer fault conditions at the time of the transformer fault in step S1 includes:
and in a transformer fault interval, collecting dissolved gas in the transformer oil at a fixed frequency, classifying and marking dissolved gas parameters by using a transformer fault type, and extracting fault characteristics by a three-ratio method to obtain a dissolved gas parameter-transformer fault type information sequence.
Preferably, because each transformer fault period differs, the fault data collected in step S1 differ in sampling time and sampling frequency, with data length = sampling time / sampling frequency; the sliding window technique in step S2 determines the window size, i.e. the required sub-sequence length, from the data characteristics, then starts from the beginning of each sequence, moves the window in fixed steps, and cuts out a sub-sequence at each window position.
Preferably, in step S2 the normalization is standard-deviation (z-score) normalization of the input data, with the processing formula:
L_{i,j} = (g_{i,j} - E(g_{i,j})) / D(g_{i,j})
where g_{i,j} is the raw dissolved-gas parameter data, L_{i,j} is the normalized dissolved-gas parameter data, E(g_{i,j}) is the mean of the raw dissolved-gas parameter data, and D(g_{i,j}) is the standard deviation of the raw dissolved-gas parameters.
Preferably, the method in step S4 for constructing the deep learning framework based on the bidirectional long short-term memory (Bi-LSTM) model comprises the following steps:
constructing a five-layer network comprising two bidirectional LSTM layers, a fully connected layer, a Dropout layer and an output fully connected layer, wherein the first bidirectional LSTM layer has 128 LSTM units and returns the output of the whole sequence, the second LSTM layer returns only the output of the last time step, the fully connected layer consists of 128 neurons and uses a ReLU activation function, and the output layer uses a Sigmoid activation function;
at time t the LSTM model involves the input x_t, the cell state C_t, the temporary cell state C_t', the hidden-layer state h_t, the forget gate f(t), the memory gate i(t) and the output gate o(t);
in the LSTM computation, part of the information in the cell state is forgotten and new information is memorized, so that information useful for the computation at later moments is passed on while useless information is discarded, and the hidden-layer state h_t is output at each time step; the forgetting, memorizing and outputting are controlled by the forget gate f(t), the memory gate i(t) and the output gate o(t), which are computed from the hidden-layer state h_{t-1} of the previous moment and the current input x_t;
the specific calculation process is as follows:
f(t) = σ(W_f·h_{t-1} + U_f·x_t + b_f)
i(t) = σ(W_i·h_{t-1} + U_i·x_t + b_i)
a(t) = tanh(W_a·h_{t-1} + U_a·x_t + b_a)
o(t) = σ(W_o·h_{t-1} + U_o·x_t + b_o)
where x_t denotes the input at time t and h_{t-1} denotes the hidden-layer state at time t-1; W_f, W_i, W_o and W_a denote the weight coefficients of h_{t-1} in the forget gate, the input gate, the output gate and the feature-extraction process respectively; U_f, U_i, U_o and U_a denote the weight coefficients of x_t in the forget gate, the input gate, the output gate and the feature-extraction process respectively; b_f, b_i, b_o and b_a denote the bias values of the forget gate, the input gate, the output gate and the feature-extraction process respectively; tanh denotes the hyperbolic tangent function and σ denotes the sigmoid activation function;
the forget gate and the input gate apply their results to the previous cell state C_{t-1} to form the cell state C_t at time t, specifically:
C_t = C_{t-1} ⊙ f(t) + i(t) ⊙ a(t)
where ⊙ denotes the Hadamard product; the hidden-layer state h_t at time t is obtained from the output gate o(t) and the current cell state C_t:
h_t = o(t) ⊙ tanh(C_t)
the fully connected layer takes the LSTM output as its input vector and produces its output with the ReLU function:
out = ReLU(W_d·X + b_d)
where X denotes the LSTM output, W_d is the weight matrix of the fully connected layer, b_d is the bias matrix of the fully connected layer, out is the output of the fully connected layer, and ReLU is the activation function;
the Dropout layer randomly sets input units to 0 with probability 0.2, preventing the deep learning framework from overfitting;
the output fully connected layer produces the final output:
y = sigmoid(W_y·X_1 + b_y)
where X_1 denotes the output of the preceding fully connected layer, W_y is the weight matrix of the output fully connected layer, b_y is its bias matrix, y is the output of the output fully connected layer, and sigmoid is the activation function;
the loss function is the binary cross-entropy:
Loss = -(y·log(y') + (1-y)·log(1-y'))
where y' is the probability predicted by the model that the sample is positive and y is the sample label;
the output deviation of each layer's units is obtained by back-propagation, the gradient of each parameter is computed from this deviation, and iterative correction is performed with the Adam optimization algorithm along the descending gradient so that the Loss value keeps decreasing.
Preferably, in step S4 the deep learning framework is converted into a meta-learner, the training set is used to perform meta-learning optimization on the model, and the specific method for fine-tuning the model with a small amount of labeled data is as follows:
expanding the dimensions of the input training set and test set to meet the input requirements of the neural network, and constructing the training set and test set into meta-datasets so that each data item is part of a meta-dataset;
converting the original deep learning model into a meta-optimizer by using a KerasModel of TensorFlow, initializing the average loss iteration_error and the accuracy iteration_acc of the current iteration, setting the number of tasks per iteration, cloning the original deep learning model during training, and training the cloned model in a loop so that it adapts to the characteristics of the current training task, thereby completing the parameter update of the cloned model;
after the parameter update is completed, the cloned model is tested with the test set to obtain its loss and accuracy, which are accumulated into iteration_error and iteration_acc; after all tasks of the iteration have been traversed, iteration_error and iteration_acc are each divided by the number of tasks per iteration to obtain their average values;
and updating model parameters of the meta-optimizer to complete a meta-learning optimization process, wherein the whole meta-learning process uses a MAML algorithm to perform meta-learning optimization.
Preferably, the specific method for determining the type of the transformer fault corresponding to the dissolved gas in the step S5 by using the three-ratio method is as follows:
the model receives dissolved gas parameters input in real time, predicts and analyzes the dissolved gas by using the trained model, codes and compares the predicted dissolved gas in the oil by using a three-ratio method, predicts the fault type of the transformer, and outputs the fault type of the transformer.
A transformer fault self-identification system based on bidirectional long short-term memory, comprising:
the information acquisition module is used for acquiring the position of the transformer, acquiring the dissolved gas parameters in the transformer oil in real time, extracting characteristic information from the acquired dissolved gas parameters in the transformer oil by a three-ratio method to obtain an information sequence, and transmitting the information sequence to the next module;
the data preprocessing module is internally provided with a data preprocessing system and is used for receiving the information sequence transmitted by the information acquisition module, segmenting and normalizing the received data through the data preprocessing system and transmitting the preprocessed information sequence to the next module;
the data analysis module is internally provided with a deep learning network and is used for receiving the information sequence sent by the data preprocessing module, analyzing the information sequence and outputting the fault type of the transformer according to the dissolved gas ratio of the model predictive analysis;
and the positioning alarm module is used for timely positioning a fault area according to the current transformer position parameter and the fault type and outputting fault information and transformer position alarm.
A power grid control program, wherein the transformer fault self-identification system based on bidirectional long short-term memory is deployed into the power grid control program, so that when the power grid control program runs it implements the transformer fault self-identification method based on bidirectional long short-term memory.
Compared with the prior art, the invention has the beneficial effects that:
1. High-precision, highly adaptable fault prediction is realized. The bidirectional long short-term memory network model (Bi-LSTM) is used to process and deeply analyze the transformer fault-diagnosis time-series data, improving the accuracy of transformer fault diagnosis, and meta-learning enables the system to be quickly fine-tuned on a very small amount of new data so that it can adapt to new or unseen fault modes. This allows the system to maintain high predictive performance when facing new transformers or new faults.
2. The process is automated: all steps, from data preprocessing to fault prediction and diagnosis, are integrated into one system, making the whole process fully automatic. This not only saves a lot of manpower but also ensures the immediacy and continuity of prediction and diagnosis.
3. The reliability and stability of the power system are improved. Timely fault early warning and accurate fault localization help operation and maintenance personnel take measures in advance, avoiding more serious grid accidents, ensuring the normal operation of the power system, and improving the reliability and stability of the whole power system. This not only reduces the losses caused by faults but also effectively reduces inspection and maintenance costs.
Drawings
FIG. 1 is a schematic flow chart of a transformer fault self-identification method based on bidirectional long short-term memory according to an embodiment of the invention;
FIG. 2 is a schematic diagram of a transformer fault self-identification system based on bidirectional long short-term memory according to an embodiment of the present invention;
FIG. 3 is a detailed flowchart of a transformer fault self-identification method based on bidirectional long short-term memory according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments.
Referring to FIG. 1, a transformer fault self-identification method based on bidirectional long short-term memory includes:
s1: collecting transformer fault data, namely collecting dissolved gas parameters in oil and corresponding transformer fault conditions when the transformer fails from a historical operation database;
in a transformer fault interval, dissolved gas in the transformer oil is collected at a fixed frequency, the dissolved-gas parameters are classified and labeled with the transformer fault type, and fault features are extracted by the three-ratio method to obtain a dissolved-gas-parameter / transformer-fault-type information sequence; the coding rules and the fault-type judgment method are formulated with reference to the national guide DL/T 722-2000, and the ratios of the dissolved gases C2H2/C2H4, CH4/H2 and C2H4/C2H6 are extracted and coded, the coding rules being shown in Table 1:
Table 1: three-ratio gas coding rules.
After the gas codes are combined, the code combination is mapped to the transformer fault type, giving the code-combination / transformer-fault-type table shown in Table 2:
Table 2: code combination versus transformer fault type.
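As a minimal illustration of the three-ratio coding described above, the sketch below encodes each gas ratio into the codes 0/1/2; the threshold boundaries are illustrative assumptions, not the values of Tables 1 and 2.
# Illustrative three-ratio encoding; the thresholds below are placeholder assumptions.
def encode_ratio(value, low, high):
    """Map one gas ratio onto the code 0, 1 or 2 using two thresholds."""
    if value < low:
        return 0
    if value < high:
        return 1
    return 2

def three_ratio_code(c2h2_c2h4, ch4_h2, c2h4_c2h6):
    """Return the code combination (code1, code2, code3) for one sample."""
    return (encode_ratio(c2h2_c2h4, 0.1, 3.0),
            encode_ratio(ch4_h2, 0.1, 1.0),
            encode_ratio(c2h4_c2h6, 1.0, 3.0))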
s2: preprocessing data, namely segmenting the collected data by using a sliding window technology, carrying out normalization processing, and extracting fault characteristic information to obtain an input matrix;
because each transformer fault period differs, the fault data collected in step S1 differ in sampling time and sampling frequency, with data length = sampling time / sampling frequency; the sliding window technique in step S2 determines the window size, i.e. the required sub-sequence length, from the data characteristics, then starts from the beginning of each sequence, moves the window in fixed steps, and cuts out a sub-sequence at each window position.
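A minimal sketch of this sliding-window segmentation is shown below; window_size and step are assumed parameters chosen from the data characteristics.
import numpy as np

def sliding_window(sequence, window_size, step):
    """Cut a (length, n_features) sequence into fixed-length sub-sequences."""
    windows = []
    for start in range(0, len(sequence) - window_size + 1, step):
        windows.append(sequence[start:start + window_size])
    return np.array(windows)   # shape: (n_windows, window_size, n_features)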
The data segmented into sub-sequences are then normalized; the normalization is standard-deviation (z-score) normalization of the input data, with the processing formula:
L_{i,j} = (g_{i,j} - E(g_{i,j})) / D(g_{i,j})
where g_{i,j} is the raw dissolved-gas parameter data, L_{i,j} is the normalized dissolved-gas parameter data, E(g_{i,j}) is the mean of the raw dissolved-gas parameter data, and D(g_{i,j}) is the standard deviation of the raw dissolved-gas parameters.
import pandas as pd
from sklearn.preprocessing import StandardScaler

data = pd.read_csv("data.csv")
X = data[["feature1", "feature2", "..."]].values   # input features
y = data[["gas1", "gas2", "gas3"]].values          # dissolved-gas targets

The segmented sub-sequences are loaded and converted to NumPy array format, and the feature vectors are standardized by standard deviation so that different features share the same scale, meeting the standardization requirements of the model. Here the StandardScaler class from sklearn.preprocessing is used to standardize the feature variables:

scaler = StandardScaler().fit(X)   # fit the mean and standard deviation of each feature
X = scaler.transform(X)            # apply the standard-deviation (z-score) standardization
s3: dividing the preprocessed data into a training set and a testing set for training and testing the deep learning framework;
The normalized feature array X and the target-variable array y are divided into a training set and a test set with the train_test_split function, where test_size=0.2 means the test set accounts for 20% of the data:

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
S4: constructing a deep learning framework based on a bidirectional long short-term memory (Bi-LSTM) model, converting the deep learning framework into a meta-learner, performing meta-learning optimization on the model using the training set, and fine-tuning the model with a small amount of labeled data;
constructing a five-layer network comprising two bidirectional LSTM layers, a fully connected layer, a Dropout layer and an output fully connected layer, wherein the first bidirectional LSTM layer has 128 LSTM units and returns the output of the whole sequence, the second LSTM layer returns only the output of the last time step, the fully connected layer consists of 128 neurons and uses a ReLU activation function, and the output layer uses a Sigmoid activation function;
at time t the LSTM model involves the input x_t, the cell state C_t, the temporary cell state C_t', the hidden-layer state h_t, the forget gate f(t), the memory gate i(t) and the output gate o(t);
in the LSTM computation, part of the information in the cell state is forgotten and new information is memorized, so that information useful for the computation at later moments is passed on while useless information is discarded, and the hidden-layer state h_t is output at each time step; the forgetting, memorizing and outputting are controlled by the forget gate f(t), the memory gate i(t) and the output gate o(t), which are computed from the hidden-layer state h_{t-1} of the previous moment and the current input x_t;
the specific calculation process is as follows:
f(t) = σ(W_f·h_{t-1} + U_f·x_t + b_f)
i(t) = σ(W_i·h_{t-1} + U_i·x_t + b_i)
a(t) = tanh(W_a·h_{t-1} + U_a·x_t + b_a)
o(t) = σ(W_o·h_{t-1} + U_o·x_t + b_o)
where x_t denotes the input at time t and h_{t-1} denotes the hidden-layer state at time t-1; W_f, W_i, W_o and W_a denote the weight coefficients of h_{t-1} in the forget gate, the input gate, the output gate and the feature-extraction process respectively; U_f, U_i, U_o and U_a denote the weight coefficients of x_t in the forget gate, the input gate, the output gate and the feature-extraction process respectively; b_f, b_i, b_o and b_a denote the bias values of the forget gate, the input gate, the output gate and the feature-extraction process respectively; tanh denotes the hyperbolic tangent function and σ denotes the sigmoid activation function;
the forget gate and the input gate apply their results to the previous cell state C_{t-1} to form the cell state C_t at time t, specifically:
C_t = C_{t-1} ⊙ f(t) + i(t) ⊙ a(t)
where ⊙ denotes the Hadamard product; the hidden-layer state h_t at time t is obtained from the output gate o(t) and the current cell state C_t:
h_t = o(t) ⊙ tanh(C_t)
the fully connected layer takes the LSTM output as its input vector and produces its output with the ReLU function:
out = ReLU(W_d·X + b_d)
where X denotes the LSTM output, W_d is the weight matrix of the fully connected layer, b_d is the bias matrix of the fully connected layer, out is the output of the fully connected layer, and ReLU is the activation function;
the Dropout layer randomly sets input units to 0 with probability 0.2, preventing the deep learning framework from overfitting;
the output fully connected layer produces the final output:
y = sigmoid(W_y·X_1 + b_y)
where X_1 denotes the output of the preceding fully connected layer, W_y is the weight matrix of the output fully connected layer, b_y is its bias matrix, y is the output of the output fully connected layer, and sigmoid is the activation function;
the loss function is the binary cross-entropy:
Loss = -(y·log(y') + (1-y)·log(1-y'))
where y' is the probability predicted by the model that the sample is positive and y is the sample label;
the output deviation of each layer's units is obtained by back-propagation, the gradient of each parameter is computed from this deviation, and iterative correction is performed with the Adam optimization algorithm along the descending gradient so that the Loss value keeps decreasing; the corresponding Keras implementation is as follows:
import tensorflow as tf

def create_model(input_shape):
    # Two bidirectional LSTM layers, a fully connected layer, Dropout, and the output layer
    model = tf.keras.Sequential([
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128, return_sequences=True), input_shape=input_shape),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(y_train.shape[1], activation='sigmoid')])
    return model

model = create_model((X_train.shape[1], 1))
model.compile(optimizer='adam', loss='binary_crossentropy')
A model-creation function create_model is defined, receiving the parameter input_shape which specifies the input shape; the model consists of multiple layers, namely two bidirectional LSTM layers, a fully connected layer, a Dropout layer and an output layer;
the optimizer and loss function are then configured with the compile method, the Adam optimizer is selected, and the loss function is specified as the binary cross-entropy loss;
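For reference, the single-time-step computation inside one LSTM unit, as given by the gate equations above, can be sketched in NumPy as follows; this is purely illustrative (the Keras LSTM layers perform it internally), and the weight dictionaries W, U, b are assumed names and shapes.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step; W, U, b are dicts keyed by 'f', 'i', 'a', 'o'."""
    f_t = sigmoid(W['f'] @ h_prev + U['f'] @ x_t + b['f'])   # forget gate f(t)
    i_t = sigmoid(W['i'] @ h_prev + U['i'] @ x_t + b['i'])   # memory gate i(t)
    a_t = np.tanh(W['a'] @ h_prev + U['a'] @ x_t + b['a'])   # candidate a(t)
    o_t = sigmoid(W['o'] @ h_prev + U['o'] @ x_t + b['o'])   # output gate o(t)
    c_t = c_prev * f_t + i_t * a_t        # C_t = C_{t-1} ⊙ f(t) + i(t) ⊙ a(t)
    h_t = o_t * np.tanh(c_t)              # h_t = o(t) ⊙ tanh(C_t)
    return h_t, c_t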
The dimensions of the input training set and test set are expanded with np.newaxis to meet the input requirements of the neural network, and the training set and test set are constructed into meta-datasets so that each data item is part of a meta-dataset;
the original deep learning model is converted into a meta-optimizer by using a KerasModel of TensorFlow, the average loss iteration_error and the accuracy iteration_acc of the current iteration are initialized, the number of tasks per iteration num_tasks_per_iteration is set, the original deep learning model is cloned during training, and the cloned model is trained in a loop so that it adapts to the characteristics of the current training task, thereby completing the parameter update of the cloned model;
after the parameter update is completed, the cloned model is tested with the test set to obtain its loss and accuracy, which are accumulated into iteration_error and iteration_acc; after all tasks of the iteration have been traversed, iteration_error and iteration_acc are each divided by num_tasks_per_iteration to obtain their average values;
the model parameters of the meta-optimizer are updated through the meta_optimizer.step() function to complete the meta-learning optimization; the whole meta-learning process uses the MAML algorithm, and the meta-learning part of the Python code is as follows:
dataset = l2l.data.TensorDataset(X_train, y_train)
meta_train_dataset = l2l.data.MetaDataset(dataset)
meta_test_dataset = l2l.data.MetaDataset(dataset)
meta_optimizer = l2l.optim.KerasModel(model)

for iteration in range(num_iterations):
    iteration_error = 0.0
    iteration_acc = 0.0
    for _ in range(num_tasks_per_iteration):
        learner = model.clone()
        task = meta_train_dataset.sample_task()
        task_dataset = l2l.data.TensorDataset(*task)
        for _ in range(adaptation_steps):
            error = learner.adapt(task_dataset)
        evaluation_error, evaluation_accuracy = compute_loss_and_accuracy(learner, meta_test_dataset.sample_task())
        iteration_error += evaluation_error
        iteration_acc += evaluation_accuracy
    iteration_error /= num_tasks_per_iteration
    iteration_acc /= num_tasks_per_iteration
    print('Loss: {:.3f} Acc: {:.3f}'.format(iteration_error, iteration_acc))
    meta_optimizer.step(iteration_error)
The whole meta-learning process is repeated for num_iterations iterations, and each iteration updates the model parameters to adapt to different tasks, so that through meta-learning the model generalizes better to new tasks.
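The fine-tuning with a small amount of labeled data mentioned in step S4 is not shown in the snippet above; a minimal sketch, assuming a small labeled set X_finetune / y_finetune is available, could look like this:
# Fine-tune the (meta-)trained Keras model on a small labeled set.
# X_finetune / y_finetune and the epoch / batch settings are assumptions.
model.fit(X_finetune, y_finetune, epochs=5, batch_size=8, validation_split=0.2)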
S5: collecting the current real-time position parameters of the transformer and the dissolved-gas-in-oil parameters, segmenting the data with the sliding window technique, normalizing them and inputting them into the trained model; the model predicts and diagnoses the transformer fault type and outputs it. The specific diagnosis procedure is as follows:
the model receives dissolved gas parameters input in real time, predicts and analyzes the dissolved gas by using the trained model, codes and compares the predicted dissolved gas in the oil by using a three-ratio method, predicts the fault type of the transformer, and outputs the fault type of the transformer.
predicted_gases = model.predict(X_test)
ratios = {
    "CH4/H2": predicted_gases[:, 1] / predicted_gases[:, 0],
    "C2H2/C2H4": predicted_gases[:, 2] / predicted_gases[:, 3],
    "C2H4/C2H6": predicted_gases[:, 3] / predicted_gases[:, 4]
}
fault_type = diagnose_fault(ratios)
print(f"Predicted fault type: {fault_type}")
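The diagnose_fault function called above is not defined in the snippet; a minimal sketch of what it might look like is given below, using illustrative thresholds in place of the coding and fault-type tables.
def diagnose_fault(ratios):
    """Map the three gas ratios to a fault type (illustrative thresholds only)."""
    c2h2_c2h4 = float(ratios["C2H2/C2H4"][0])   # use the first sample of each array
    ch4_h2 = float(ratios["CH4/H2"][0])
    c2h4_c2h6 = float(ratios["C2H4/C2H6"][0])
    if c2h2_c2h4 >= 3.0:
        return "high-energy discharge"           # placeholder mapping
    if c2h4_c2h6 >= 3.0 and ch4_h2 >= 1.0:
        return "high-temperature overheating"    # placeholder mapping
    return "no obvious fault"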
S6: and locating the fault area in time according to the current transformer position parameters and the fault type.
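A minimal sketch of the locating-and-alarm step in S6 is shown below; the transformer identifier, position fields and alert output are illustrative assumptions.
def locate_and_alarm(transformer_id, position, fault_type):
    """Combine the transformer position parameters with the diagnosed fault type."""
    if fault_type == "no obvious fault":
        return None                              # nothing to report
    alarm = {"transformer_id": transformer_id,   # assumed identifier field
             "position": position,               # e.g. substation / bay coordinates
             "fault_type": fault_type}
    print(f"ALARM: transformer {transformer_id} at {position}: {fault_type}")
    return alarm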
In a second embodiment, please refer to fig. 1-2.
A transformer fault self-identification system based on bidirectional long short-term memory, comprising:
the information acquisition module is used for acquiring the position of the transformer, acquiring the dissolved gas parameters in the transformer oil in real time, extracting characteristic information from the acquired dissolved gas parameters in the transformer oil by a three-ratio method to obtain an information sequence, and transmitting the information sequence to the next module;
the information acquisition module collects dissolved gas in the transformer oil at a fixed frequency within a transformer fault interval, classifies and labels the dissolved-gas parameters with the transformer fault type, and extracts fault features by the three-ratio method to obtain a dissolved-gas-parameter / transformer-fault-type information sequence; the coding rules and the fault-type judgment method are formulated with reference to the national guide DL/T 722-2000, and the ratios of the dissolved gases C2H2/C2H4, CH4/H2 and C2H4/C2H6 are extracted and coded, the coding rules being shown in Table 3:
Table 3: three-ratio gas coding rules.
After the gas codes are combined, the code combination is mapped to the transformer fault type, giving the code-combination / transformer-fault-type table shown in Table 4:
Table 4: code combination versus transformer fault type.
the data preprocessing module is internally provided with a data preprocessing system and is used for receiving the information sequence transmitted by the information acquisition module, segmenting and normalizing the received data through the data preprocessing system and transmitting the preprocessed information sequence to the next module;
the window size, i.e. the required sub-sequence length, is determined by the data characteristics, the window is moved by fixed steps starting from the start position of each sequence, and the sub-sequence is segmented at each window position.
The data segmented into sub-sequences are then normalized; the normalization is standard-deviation (z-score) normalization of the input data, with the processing formula:
L_{i,j} = (g_{i,j} - E(g_{i,j})) / D(g_{i,j})
where g_{i,j} is the raw dissolved-gas parameter data, L_{i,j} is the normalized dissolved-gas parameter data, E(g_{i,j}) is the mean of the raw dissolved-gas parameter data, and D(g_{i,j}) is the standard deviation of the raw dissolved-gas parameters.
import pandas as pd
from sklearn.preprocessing import StandardScaler

data = pd.read_csv("data.csv")
X = data[["feature1", "feature2", "..."]].values   # input features
y = data[["gas1", "gas2", "gas3"]].values          # dissolved-gas targets

The segmented sub-sequences are loaded and converted to NumPy array format, and the feature vectors are standardized by standard deviation so that different features share the same scale, meeting the standardization requirements of the model. Here the StandardScaler class from sklearn.preprocessing is used to standardize the feature variables:

scaler = StandardScaler().fit(X)   # fit the mean and standard deviation of each feature
X = scaler.transform(X)            # apply the standard-deviation (z-score) standardization
dividing the preprocessed data into a training set and a testing set for training and testing the deep learning framework;
The normalized feature array X and the target-variable array y are divided into a training set and a test set with the train_test_split function, where test_size=0.2 means the test set accounts for 20% of the data:

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
The data analysis module is internally provided with a deep learning network and is used for receiving the information sequence sent by the data preprocessing module, analyzing the information sequence and outputting the fault type of the transformer according to the dissolved gas ratio of the model predictive analysis;
constructing a five-layer network comprising two bidirectional LSTM layers, a fully connected layer, a Dropout layer and an output fully connected layer, wherein the first bidirectional LSTM layer has 128 LSTM units and returns the output of the whole sequence, the second LSTM layer returns only the output of the last time step, the fully connected layer consists of 128 neurons and uses a ReLU activation function, and the output layer uses a Sigmoid activation function;
at time t the LSTM model involves the input x_t, the cell state C_t, the temporary cell state C_t', the hidden-layer state h_t, the forget gate f(t), the memory gate i(t) and the output gate o(t);
in the LSTM computation, part of the information in the cell state is forgotten and new information is memorized, so that information useful for the computation at later moments is passed on while useless information is discarded, and the hidden-layer state h_t is output at each time step; the forgetting, memorizing and outputting are controlled by the forget gate f(t), the memory gate i(t) and the output gate o(t), which are computed from the hidden-layer state h_{t-1} of the previous moment and the current input x_t;
the specific calculation process is as follows:
f(t) = σ(W_f·h_{t-1} + U_f·x_t + b_f)
i(t) = σ(W_i·h_{t-1} + U_i·x_t + b_i)
a(t) = tanh(W_a·h_{t-1} + U_a·x_t + b_a)
o(t) = σ(W_o·h_{t-1} + U_o·x_t + b_o)
where x_t denotes the input at time t and h_{t-1} denotes the hidden-layer state at time t-1; W_f, W_i, W_o and W_a denote the weight coefficients of h_{t-1} in the forget gate, the input gate, the output gate and the feature-extraction process respectively; U_f, U_i, U_o and U_a denote the weight coefficients of x_t in the forget gate, the input gate, the output gate and the feature-extraction process respectively; b_f, b_i, b_o and b_a denote the bias values of the forget gate, the input gate, the output gate and the feature-extraction process respectively; tanh denotes the hyperbolic tangent function and σ denotes the sigmoid activation function;
the forget gate and the input gate apply their results to the previous cell state C_{t-1} to form the cell state C_t at time t, specifically:
C_t = C_{t-1} ⊙ f(t) + i(t) ⊙ a(t)
where ⊙ denotes the Hadamard product; the hidden-layer state h_t at time t is obtained from the output gate o(t) and the current cell state C_t:
h_t = o(t) ⊙ tanh(C_t)
the fully connected layer takes the LSTM output as its input vector and produces its output with the ReLU function:
out = ReLU(W_d·X + b_d)
where X denotes the LSTM output, W_d is the weight matrix of the fully connected layer, b_d is the bias matrix of the fully connected layer, out is the output of the fully connected layer, and ReLU is the activation function;
the Dropout layer randomly sets input units to 0 with probability 0.2, preventing the deep learning framework from overfitting;
the output fully connected layer produces the final output:
y = sigmoid(W_y·X_1 + b_y)
where X_1 denotes the output of the preceding fully connected layer, W_y is the weight matrix of the output fully connected layer, b_y is its bias matrix, y is the output of the output fully connected layer, and sigmoid is the activation function;
the loss function is the binary cross-entropy:
Loss = -(y·log(y') + (1-y)·log(1-y'))
where y' is the probability predicted by the model that the sample is positive and y is the sample label;
the output deviation of each layer's units is obtained by back-propagation, the gradient of each parameter is computed from this deviation, and iterative correction is performed with the Adam optimization algorithm along the descending gradient so that the Loss value keeps decreasing; the corresponding Keras implementation is as follows:
import tensorflow as tf

def create_model(input_shape):
    # Two bidirectional LSTM layers, a fully connected layer, Dropout, and the output layer
    model = tf.keras.Sequential([
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128, return_sequences=True), input_shape=input_shape),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(y_train.shape[1], activation='sigmoid')])
    return model

model = create_model((X_train.shape[1], 1))
model.compile(optimizer='adam', loss='binary_crossentropy')
A model-creation function create_model is defined, receiving the parameter input_shape which specifies the input shape; the model consists of multiple layers, namely two bidirectional LSTM layers, a fully connected layer, a Dropout layer and an output layer;
the optimizer and loss function are then configured with the compile method, the Adam optimizer is selected, and the loss function is specified as the binary cross-entropy loss;
The dimensions of the input training set and test set are expanded with np.newaxis to meet the input requirements of the neural network, and the training set and test set are constructed into meta-datasets so that each data item is part of a meta-dataset;
the original deep learning model is converted into a meta-optimizer by using a KerasModel of TensorFlow, the average loss iteration_error and the accuracy iteration_acc of the current iteration are initialized, the number of tasks per iteration num_tasks_per_iteration is set, the original deep learning model is cloned during training, and the cloned model is trained in a loop so that it adapts to the characteristics of the current training task, thereby completing the parameter update of the cloned model;
after the parameter update is completed, the cloned model is tested with the test set to obtain its loss and accuracy, which are accumulated into iteration_error and iteration_acc; after all tasks of the iteration have been traversed, iteration_error and iteration_acc are each divided by num_tasks_per_iteration to obtain their average values;
the model parameters of the meta-optimizer are updated through the meta_optimizer.step() function to complete the meta-learning optimization; the whole meta-learning process uses the MAML algorithm, and the meta-learning part of the Python code is as follows:
dataset = l2l.data.TensorDataset(X_train, y_train)
meta_train_dataset = l2l.data.MetaDataset(dataset)
meta_test_dataset = l2l.data.MetaDataset(dataset)
meta_optimizer = l2l.optim.KerasModel(model)

for iteration in range(num_iterations):
    iteration_error = 0.0
    iteration_acc = 0.0
    for _ in range(num_tasks_per_iteration):
        learner = model.clone()
        task = meta_train_dataset.sample_task()
        task_dataset = l2l.data.TensorDataset(*task)
        for _ in range(adaptation_steps):
            error = learner.adapt(task_dataset)
        evaluation_error, evaluation_accuracy = compute_loss_and_accuracy(learner, meta_test_dataset.sample_task())
        iteration_error += evaluation_error
        iteration_acc += evaluation_accuracy
    iteration_error /= num_tasks_per_iteration
    iteration_acc /= num_tasks_per_iteration
    print('Loss: {:.3f} Acc: {:.3f}'.format(iteration_error, iteration_acc))
    meta_optimizer.step(iteration_error)
The whole meta-learning process is repeated for num_iterations iterations, and each iteration updates the model parameters to adapt to different tasks, so that through meta-learning the model generalizes better to new tasks.
predicted_gases = model.predict(X_test)
ratios = {
    "CH4/H2": predicted_gases[:, 1] / predicted_gases[:, 0],
    "C2H2/C2H4": predicted_gases[:, 2] / predicted_gases[:, 3],
    "C2H4/C2H6": predicted_gases[:, 3] / predicted_gases[:, 4]
}
fault_type = diagnose_fault(ratios)
print(f"Predicted fault type: {fault_type}")
The gas parameters in the transformer oil acquired in real time are input into the data analysis module, which predicts the dissolved-gas ratios in the oil and outputs the transformer fault type.
And the positioning alarm module is used for timely positioning a fault area according to the current transformer position parameter and the fault type and outputting fault information and transformer position alarm.
In a third embodiment, a power grid control program is provided: the transformer fault self-identification system based on bidirectional long short-term memory is deployed into the power grid control program, so that when the power grid control program runs it implements the transformer fault self-identification method based on bidirectional long short-term memory. For the implementation and steps of the transformer fault self-identification system within the power grid control program, reference may be made to the system described in the second embodiment.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The above-described apparatus embodiments are merely illustrative, for example, the division of the units is merely a logical function division, and there may be other manners of division in actual implementation, and for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some communication interface, device or unit indirect coupling or communication connection, which may be in electrical, mechanical or other form.
Finally, it should be noted that: the above examples are only specific embodiments of the present invention, and are not intended to limit the scope of the present invention, but it should be understood by those skilled in the art that the present invention is not limited thereto, and that the present invention is described in detail with reference to the foregoing examples: any person skilled in the art may modify or easily conceive of the technical solution described in the foregoing embodiments, or perform equivalent substitution of some of the technical features, while remaining within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (9)

1. A transformer fault self-identification method based on bidirectional long short-term memory, characterized by comprising the following specific steps:
s1: collecting transformer fault data, namely collecting dissolved gas parameters in oil and corresponding transformer fault conditions when the transformer fails from a historical operation database;
s2: preprocessing data, namely segmenting the collected data by using a sliding window technology, carrying out normalization processing, and extracting fault characteristic information to obtain an input matrix;
s3: dividing the preprocessed data into a training set and a testing set for training and testing the deep learning framework;
s4: constructing a deep learning framework based on a bidirectional long short-term memory (Bi-LSTM) model, converting the deep learning framework into a meta-learner, performing meta-learning optimization on the model using the training set, and fine-tuning the model with a small amount of labeled data;
s5: collecting the current real-time position parameters of the transformer and the dissolved-gas-in-oil parameters, segmenting the data with the sliding window technique, normalizing them, and inputting them into the trained model; the model then performs prediction and calculation, diagnoses the transformer fault type, and outputs it.
S6: and locating the fault area in time according to the current transformer position parameters and the fault type.
2. The transformer fault self-identification method based on bidirectional long short-term memory according to claim 1, wherein collecting the dissolved gas parameters and the corresponding transformer fault conditions during the transformer fault in step S1 comprises:
and in a transformer fault interval, collecting dissolved gas in the transformer oil at a fixed frequency, classifying and marking dissolved gas parameters by using a transformer fault type, and extracting fault characteristics by a three-ratio method to obtain a dissolved gas parameter-transformer fault type information sequence.
3. The transformer fault self-identification method based on bidirectional long short-term memory according to claim 2, wherein: because each transformer fault period differs, the fault data collected in step S1 differ in sampling time and sampling frequency, with data length = sampling time / sampling frequency; the sliding window technique in step S2 determines the window size, i.e. the required sub-sequence length, from the data characteristics, then starts from the beginning of each sequence, moves the window in fixed steps, and cuts out a sub-sequence at each window position.
4. The transformer fault self-identification method based on bidirectional long short-term memory according to claim 3, wherein: the normalization in step S2 is standard-deviation normalization of the input data, with the processing formula:
L_{i,j} = (g_{i,j} - E(g_{i,j})) / D(g_{i,j})
where g_{i,j} is the raw dissolved-gas parameter data, L_{i,j} is the normalized dissolved-gas parameter data, E(g_{i,j}) is the mean of the raw dissolved-gas parameter data, and D(g_{i,j}) is the standard deviation of the raw dissolved-gas parameters.
5. The transformer fault self-identification method based on bidirectional long short-term memory according to claim 4, wherein the method for constructing the deep learning framework based on the bidirectional long short-term memory (Bi-LSTM) model in step S4 comprises the following steps:
constructing a five-layer network comprising two bidirectional LSTM layers, a fully connected layer, a Dropout layer and an output fully connected layer, wherein the first bidirectional LSTM layer has 128 LSTM units and returns the output of the whole sequence, the second LSTM layer returns only the output of the last time step, the fully connected layer consists of 128 neurons and uses a ReLU activation function, and the output layer uses a Sigmoid activation function;
LSTM model inputs X from t moment t Cell state C t Temporary cell State C t ' hidden layer state h t Forget gate f (t), memory gate i (t) and output gate o (t);
the LSTM calculation process is that the information useful for the calculation of the subsequent moment is transmitted by forgetting the information in the cell state and memorizing the new information, the useless information is discarded, and the hidden layer state h is output in each time step t Wherein forgetting, memorizing and outputting the hidden layer state h from the last moment t-1 And current input X t Calculated forgetting gate f (t), memory gate i (t) and inputA gate o (t) is controlled;
the specific calculation process is as follows:
f(t)=σ(W f h t-1 +U f x t +b f )
i(t)=σ(W i h t-1 +U i x t +b i )
a(t)=tanh(W a h t-1 +U a x t +b a )
o(t)=σ(W o h t-1 +U o x t +b o )
wherein X is t Indicating t time input, h t-1 Representing a state value of a hidden layer at the moment t-1; w (W) f 、W i 、W o And W is a Respectively representing forgetting gate, input gate, output gate and h in the characteristic extraction process t-1 Weight coefficient of (2); u (U) f 、U i 、U o And U a Respectively representing forgetting gate, input gate, output gate and X in characteristic extraction process t Weight coefficient of (2); b f 、b i 、b o And b a Respectively representing the bias values of the forgetting gate, the input gate, the output gate and the characteristic extraction process; tan h represents a tangent hyperbolic function, σ represents a sigmoid activation function;
the results of the forget gate and the input gate are applied to the previous cell state C_{t-1} to form the cell state C_t at time t, specifically:
C_t = C_{t-1} ⊙ f(t) + i(t) ⊙ a(t)
where ⊙ denotes the Hadamard product; the hidden layer state h_t at time t is then obtained from the output gate o(t) and the current cell state C_t as:
h_t = o(t) ⊙ tanh(C_t)
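A NumPy sketch of one LSTM time step following the gate equations above; the dictionary-of-weights layout is only an illustrative assumption:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One forward step of the LSTM cell described above.

    W, U, b are dicts keyed by 'f', 'i', 'a', 'o', with W[k] of shape
    (hidden, hidden), U[k] of shape (hidden, n_inputs) and b[k] of shape (hidden,).
    """
    f_t = sigmoid(W['f'] @ h_prev + U['f'] @ x_t + b['f'])   # forget gate
    i_t = sigmoid(W['i'] @ h_prev + U['i'] @ x_t + b['i'])   # memory (input) gate
    a_t = np.tanh(W['a'] @ h_prev + U['a'] @ x_t + b['a'])   # candidate cell state
    o_t = sigmoid(W['o'] @ h_prev + U['o'] @ x_t + b['o'])   # output gate
    c_t = c_prev * f_t + i_t * a_t        # C_t = C_{t-1} ⊙ f(t) + i(t) ⊙ a(t)
    h_t = o_t * np.tanh(c_t)              # h_t = o(t) ⊙ tanh(C_t)
    return h_t, c_t
```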
The fully connected layer takes the LSTM output as its input vector and produces its output with the ReLU function:
out = ReLU(W_d · X + b_d)
where X denotes the LSTM output, W_d is the weight matrix of the fully connected layer, b_d is the bias matrix of the fully connected layer, out is the output of the fully connected layer, and ReLU is the activation function;
the Dropout layer randomly sets input units to 0 with a probability of 0.2, preventing the deep learning framework from overfitting;
the output fully connected layer produces the final output:
y = sigmoid(W_y · X_1 + b_y)
where X_1 denotes the output of the preceding fully connected layer, W_y is the weight matrix of the output fully connected layer, b_y is the bias matrix of the output fully connected layer, y is the output of the output fully connected layer, and sigmoid is the activation function;
the loss function is the two-class (binary) cross entropy:
Loss = −(y · log(ŷ) + (1 − y) · log(1 − ŷ))
where ŷ is the positive-class probability predicted by the model and y is the sample label;
the output deviation of each layer's units is obtained by back-propagation, the gradient of each parameter is computed from this deviation, and the Adam optimization algorithm is used to iteratively correct the parameters along the descending gradient so that the Loss value keeps decreasing.
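A short sketch of training the network above with binary cross entropy and the Adam optimizer; build_bilstm_model refers to the earlier architecture sketch, and the data shapes, epoch count and batch size are placeholders rather than values taken from the patent:

```python
import numpy as np
import tensorflow as tf

# placeholder data standing in for the preprocessed windows and labels from steps S1-S3
X_train = np.random.rand(256, 32, 3).astype("float32")
y_train = np.random.randint(0, 2, size=(256, 1)).astype("float32")

model = build_bilstm_model(timesteps=32, n_features=3)   # from the earlier sketch
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss="binary_crossentropy",                 # the two-class cross entropy above
              metrics=["accuracy"])
model.fit(X_train, y_train, epochs=50, batch_size=32, validation_split=0.2)
```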
6. The transformer fault self-identification method based on the two-way long-short-time memory according to claim 5, wherein in step S4 the deep learning framework is converted into a meta learner, the training set is used to perform meta-learning optimization on the model, and the model is fine-tuned with a small amount of labeled data, specifically as follows:
expanding the dimensions of the input training set and test set to fit the input requirements of the neural network, and constructing the training set and test set into a meta-dataset, so that each data item is a part of the meta-dataset;
converting the original deep learning model into a meta-optimizer by means of the TensorFlow Keras Model, initializing the current-iteration average loss item_error and accuracy item_acc, and setting the single-input data amount item; during training, the original deep learning model is cloned and the cloned model is trained in a loop to adapt to the characteristics of the current training task, thereby completing the parameter update of the cloned model;
after the parameter update is completed, the cloned model is tested with the test set to obtain its loss and accuracy, which are accumulated into item_error and item_acc; once the test set has been fully traversed, item_error and item_acc are each divided by the single-input data amount item to obtain the average loss and average accuracy;
the model parameters of the meta-optimizer are then updated to complete the meta-learning optimization process, the whole meta-learning process using the MAML algorithm for meta-learning optimization.
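A simplified sketch of the meta-learning loop described in this claim, using model cloning and a first-order (Reptile-style) approximation of MAML rather than the full second-order MAML update; the task structure, learning rates and epoch counts are assumptions for illustration:

```python
import tensorflow as tf

def meta_optimize(model, tasks, test_x, test_y,
                  inner_lr=1e-3, meta_lr=1e-4, inner_epochs=1):
    """tasks is a list of (x_task, y_task) pairs drawn from the meta-dataset."""
    item_error, item_acc, item = 0.0, 0.0, 0

    for x_task, y_task in tasks:
        # clone the original deep learning model and adapt it to the current task
        clone = tf.keras.models.clone_model(model)
        clone.set_weights(model.get_weights())
        clone.compile(optimizer=tf.keras.optimizers.Adam(inner_lr),
                      loss="binary_crossentropy", metrics=["accuracy"])
        clone.fit(x_task, y_task, epochs=inner_epochs, verbose=0)

        # test the adapted clone and accumulate loss / accuracy
        loss, acc = clone.evaluate(test_x, test_y, verbose=0)
        item_error += loss
        item_acc += acc
        item += 1

        # outer update: move the meta-learner's weights toward the adapted clone
        new_weights = [w + meta_lr * (cw - w)
                       for w, cw in zip(model.get_weights(), clone.get_weights())]
        model.set_weights(new_weights)

    return item_error / item, item_acc / item   # averages after traversing the test set
```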
7. The transformer fault self-identification method based on the two-way long-short-time memory according to claim 6, wherein the model prediction calculation in step S5 diagnoses and outputs the transformer fault type as follows:
the model receives the dissolved gas parameters input in real time, the trained model predicts and analyzes the dissolved gas, the predicted dissolved gas in the oil is coded and compared by the three-ratio method, and the transformer fault type is predicted and output.
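A hedged sketch of turning predicted dissolved-gas concentrations into a three-ratio code and a fault label; the threshold boundaries and the fault lookup entries below are illustrative placeholders only, not the coding table actually used by the patent:

```python
def code_ratio(ratio, low, high):
    # hypothetical 0/1/2 coding boundaries for illustration only
    return 0 if ratio < low else (1 if ratio <= high else 2)

def three_ratio_diagnosis(gas):
    """gas: dict of predicted dissolved-gas concentrations, e.g. in ppm."""
    codes = (code_ratio(gas["C2H2"] / gas["C2H4"], 0.1, 3.0),
             code_ratio(gas["CH4"] / gas["H2"], 0.1, 1.0),
             code_ratio(gas["C2H4"] / gas["C2H6"], 1.0, 3.0))
    fault_table = {                      # illustrative entries only
        (0, 0, 0): "no obvious fault",
        (0, 2, 2): "high-temperature overheating",
    }
    return fault_table.get(codes, "unrecognized code combination")

# e.g. three_ratio_diagnosis({"H2": 30, "CH4": 25, "C2H2": 0.5, "C2H4": 40, "C2H6": 12})
```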
8. A transformer fault self-identification system based on two-way long and short-term memory, implemented on the basis of the transformer fault self-identification method based on two-way long and short-term memory according to claim 7, characterized by comprising:
the information acquisition module is used for acquiring the position of the transformer, acquiring the dissolved gas parameters in the transformer oil in real time, extracting characteristic information from the acquired dissolved gas parameters in the transformer oil by a three-ratio method to obtain an information sequence, and transmitting the information sequence to the next module;
the data preprocessing module is internally provided with a data preprocessing system and is used for receiving the information sequence transmitted by the information acquisition module, segmenting and normalizing the received data through the data preprocessing system and transmitting the preprocessed information sequence to the next module;
the data analysis module, with a built-in deep learning network, is used for receiving the information sequence sent by the data preprocessing module, analyzing it, and outputting the transformer fault type according to the dissolved gas ratios obtained from the model's predictive analysis;
and the positioning alarm module is used for promptly locating the fault area according to the current transformer position parameter and the fault type, and for outputting the fault information and a transformer position alarm.
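A skeletal sketch of how the four modules above might be wired together; the class and method names are invented for illustration, the sensor access is only stubbed, and the preprocessing reuses the sliding-window and normalization sketches from earlier in this document:

```python
class InformationAcquisitionModule:
    def collect(self, transformer_id):
        """Read the transformer position and real-time dissolved-gas parameters,
        extract three-ratio features and return (position, information_sequence)."""
        raise NotImplementedError("site-specific sensor access")

class DataPreprocessingModule:
    def run(self, sequence):
        # normalize, then segment into fixed-length windows (earlier sketches)
        return segment_with_sliding_window([standardize_gas_data(sequence)],
                                           window_size=32, step=4)

class DataAnalysisModule:
    def __init__(self, model):
        self.model = model                      # trained Bi-LSTM network
    def diagnose(self, windows):
        return self.model.predict(windows)      # fault type from predicted gas ratios

class PositioningAlarmModule:
    def alert(self, position, fault_type):
        print(f"ALARM: {fault_type} detected at transformer position {position}")
```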
9. A power grid control program, characterized in that the transformer fault self-identification system based on the bidirectional long and short time memory according to claim 8 is deployed into the power grid control program, so that the power grid control program, when run, realizes the transformer fault self-identification method based on the bidirectional long and short time memory according to any one of claims 1 to 7.
CN202311327677.1A 2023-10-13 2023-10-13 Transformer fault self-identification method and system based on two-way long and short-time memory Pending CN117409890A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311327677.1A CN117409890A (en) 2023-10-13 2023-10-13 Transformer fault self-identification method and system based on two-way long and short-time memory

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311327677.1A CN117409890A (en) 2023-10-13 2023-10-13 Transformer fault self-identification method and system based on two-way long and short-time memory

Publications (1)

Publication Number Publication Date
CN117409890A true CN117409890A (en) 2024-01-16

Family

ID=89493577

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311327677.1A Pending CN117409890A (en) 2023-10-13 2023-10-13 Transformer fault self-identification method and system based on two-way long and short-time memory

Country Status (1)

Country Link
CN (1) CN117409890A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117829822A (en) * 2024-03-04 2024-04-05 合肥工业大学 Power transformer fault early warning method and system
CN117829822B (en) * 2024-03-04 2024-06-04 合肥工业大学 Power transformer fault early warning method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination