CN113707176A - Transformer fault detection method based on acoustic signal and deep learning technology - Google Patents

Transformer fault detection method based on acoustic signal and deep learning technology

Info

Publication number
CN113707176A
CN113707176A (application number CN202111026413.3A)
Authority
CN
China
Prior art keywords
transformer
data
fault detection
signal
sound
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111026413.3A
Other languages
Chinese (zh)
Other versions
CN113707176B (en)
Inventor
程汪刘
倪修峰
童旸
鲍文霞
曹成功
高志国
李飞
卢俊结
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Nanrui Jiyuan Power Grid Technology Co ltd
Anhui University
Tongling Power Supply Co of State Grid Anhui Electric Power Co Ltd
Original Assignee
Anhui Nanrui Jiyuan Power Grid Technology Co ltd
Anhui University
Tongling Power Supply Co of State Grid Anhui Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Nanrui Jiyuan Power Grid Technology Co ltd, Anhui University, Tongling Power Supply Co of State Grid Anhui Electric Power Co Ltd filed Critical Anhui Nanrui Jiyuan Power Grid Technology Co ltd
Priority to CN202111026413.3A priority Critical patent/CN113707176B/en
Publication of CN113707176A publication Critical patent/CN113707176A/en
Application granted granted Critical
Publication of CN113707176B publication Critical patent/CN113707176B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L 25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L 15/063 Training
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L 25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L 25/30 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L 25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L 25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L 15/063 Training
    • G10L 2015/0631 Creating reference templates; Clustering
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S 10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S 10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
    • Y04S 10/52 Outage or fault management, e.g. fault detection or location

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)

Abstract

The invention relates to a transformer fault detection method based on acoustic signals and deep learning technology; compared with the prior art, it overcomes the difficulty of accurately detecting transformer faults from voiceprint signals. The invention comprises the following steps: acquiring sound data of the power transformer; preprocessing the acoustic signals in the training sample set; extracting sound features from the sound signal data; constructing a transformer fault detection model; training the transformer fault detection model; acquiring and preprocessing the acoustic signal data of the transformer to be detected; and obtaining the fault detection result of the transformer to be detected. The invention can detect transformer faults based on the voiceprint signal.

Description

Transformer fault detection method based on acoustic signal and deep learning technology
Technical Field
The invention relates to the technical field of transformer fault detection, in particular to a transformer fault detection method based on an acoustic signal and deep learning technology.
Background
Transformers in the power grid are used in large numbers, come in many types and specifications, and remain in service for a long time, so the failure frequency of transformers in the power grid system is relatively high. According to statistics, roughly 5 out of every 200 transformers that have been in service for more than 4 years fail. Troubleshooting and repairing transformers has therefore become an important part of keeping the power grid running stably.
For transformer troubleshooting, the traditional approach requires personnel to travel to the site and, relying on manual experience, diagnose from the sound the transformer emits whether it has a fault. This method not only consumes a great deal of time and effort but is also subject to human factors and may lead to misdiagnosis.
In recent years, deep learning has advanced rapidly, and as a signal processing approach it is highly efficient. If transformer sound can be processed by deep learning, the characteristics of the sound the transformer emits can be analysed and classified automatically, so a faulty transformer can be diagnosed quickly, labour costs can be greatly reduced, and the stability of the power grid can be restored more quickly.
In the prior art there are some voiceprint detection methods, but most of them use Gaussian mixture models and hidden Markov models as the acoustic models. With the development of deep learning, convolutional neural networks, autoencoders, recurrent neural networks (RNNs) and other deep networks have increasingly been applied to sound detection. In terms of indicators such as accuracy and precision, voiceprint detection based on deep learning is far better than voiceprint detection with traditional acoustic models; however, the existing deep-learning voiceprint detection methods mostly use vibration data and do not use audio data for fault detection.
Therefore, how to better apply the deep learning technology to the transformer voiceprint fault detection has become an urgent technical problem to be solved.
Disclosure of Invention
The invention aims to solve the defect that the transformer fault is difficult to accurately detect by using a voiceprint signal in the prior art, and provides a transformer fault detection method based on an acoustic signal and deep learning technology to solve the problem.
In order to achieve the purpose, the technical scheme of the invention is as follows:
a transformer fault detection method based on an acoustic signal and deep learning technology comprises the following steps:
11) acquiring sound data of the power transformer: acquiring sound data of the transformer by a voiceprint acquisition sensor in a field, marking the sound data into a normal class and a fault class, and defining the sound data as a training sample set;
12) preprocessing acoustic signals in a training sample set: carrying out denoising pretreatment on the collected power transformer sound data by using a pretreatment method of segmentation, framing, sound windowing and a self-adaptive filtering method; carrying out data enhancement on the acoustic signal through cutting, noise adding and tuning;
13) sound feature extraction of sound signal data: extracting sound features of the preprocessed power transformer sound data by adopting a Mel cepstrum coefficient, and extracting an MFCC coefficient;
14) constructing a transformer fault detection model: constructing a transformer fault detection model based on the characteristics of the double-gating convolution network model and the transformer acoustic signal;
15) training a transformer fault detection model: inputting the extracted MFCC coefficients into a transformer fault detection model for training;
16) acquiring and preprocessing acoustic signal data of the transformer to be detected: acquiring transformer acoustic signal data to be detected, performing denoising pretreatment, and extracting MFCC coefficients from the transformer acoustic signal data to be detected after the denoising pretreatment;
17) obtaining a fault detection result of the transformer to be detected: and inputting the MFCC coefficient into the trained transformer fault detection model to obtain a transformer fault detection result.
The pre-processing of acoustic signals within the training sample set comprises the steps of:
21) carrying out segmentation operation on the collected sound data s (t) of the power transformer:
the obtained power transformer sound data is segmented,
is divided into s (t) ═ s1(t),s2(t),...,sq(t),...sr(t) }, calculating the total length L of the voiceprint data, and the calculation formula is as follows:
L=time×fSSample=r×rL
wherein fs issampleFor the sampling frequency of the sound, time is the sampling time, r is the number of segments, rLIs the segment length;
22) framing the segmented transformer sound data s_q(t):
the transformer voiceprint frame length is set to 500 ms, and framing gives
s_q(t) = {s_q1(t), s_q2(t), ..., s_qp(t), ..., s_qLength(t)},
where each frame is 500 ms long and each segment is divided into Length frames;
23) windowing the sound of the transformer after framing:
the frame data are smoothed at the end points by windowing, each frame being multiplied by a Hamming window whose function w(t) is:
w(t) = 0.54 - 0.46 × cos(2πt/(M - 1)), 0 ≤ t ≤ M - 1,
where M is the frame length and t is the time;
the time-domain signal of each frame is then obtained as
f_qp(t) = s_qp(t) × w(t),
where f_qp(t) is the time-domain signal of the p-th frame of the q-th segment, w(t) is the window function, and s_qp(t) is the signal value of the p-th frame of the q-th segment;
24) denoising the windowed sound by using an adaptive filter:
f_qp(t) is sampled to obtain the digital signal sequence X_i(n), and the initial filter values are set as:
W(0) = 0, 0 < μ < 1/λ,
where W(k) is the optimal weight coefficient, μ is the convergence factor, and λ is the maximum eigenvalue of the correlation matrix;
the actual output estimate of the filter is calculated as:
y(k) = W^T(k) X_i(n),
where y(k) is the actual output estimate, W^T(k) is the transpose of the optimal weight coefficients, and X_i(n) is the input signal sequence;
the error signal e(k) is calculated as:
e(k) = d(k) - y(k),
where d(k) is the desired output value; the filter coefficients at time k+1 are then updated:
W(k+1) = W(k) + μ e(k) X(k),
where W(k+1) is the optimal weight coefficient at time k+1, W(k) is the optimal weight coefficient at time k, e(k) is the error at time k, and X(k) is the input sequence at time k;
continuously iterating and solving by using a steepest descent method to minimize an error signal, and obtaining output y (k) after adaptive filtering and denoising;
25) performing data enhancement on the adaptively filtered and denoised output y(k) through cutting, noise adding and tuning.
The sound feature extraction of the sound signal data comprises the following steps:
31) inverse-transforming the data-enhanced y(k) to regenerate s(t) and performing a pre-emphasis operation on s(t), calculated as follows:
y(z)=s(z)H(z),
where y(z) is the output of the pre-emphasis, s(z) is the z-domain representation of the signal s(t), and H(z) is the transfer function of the high-pass filter:
H(z) = 1 - μz^(-1),
where μ has a value between 0.9 and 1.0;
32) framing the pre-emphasised transformer sound data s_q(t):
the transformer voiceprint frame length is set to 30 ms, and the transformer sound data are divided into 30 ms frames;
33) windowing the framed sound data: windowing is carried out on the frames by using a hamming window, and if the output obtained by the previous two steps of preprocessing is S (n) and the window function is W (n), the calculation formula is as follows:
S'(n)=S(n)×W(n);
wherein the window function W (n) is:
W(n) = (1 - a) - a × cos(2πn/(N - 1)), 0 ≤ n ≤ N - 1,
where n is the sequence index, N is the number of sampling points, and a is a set constant; different values of a give different Hamming windows;
34) and performing fast Fourier transform on the windowed data:
performing an FFT on each frame signal to obtain the frequency-domain signal X(k):
X(k) = Σ_{n=0}^{N-1} x(n) e^(-j2πkn/N), 0 ≤ k ≤ N - 1,
where X(k) is the frequency-domain output, x(n) is the time-domain input, and N is the number of sampling points;
35) carrying out Mel filtering on the data after the fast Fourier transform, wherein the conversion formula of the Mel filtering is as follows:
mel(f) = 2595 × log10(1 + f/700),
where f is the physical frequency and mel(f) is the Mel frequency; the resulting Mel frequencies are filtered by M Mel-scale triangular filter banks, the filter bank being defined as:
H_m(k) = 0, for k < f(m-1);
H_m(k) = (k - f(m-1)) / (f(m) - f(m-1)), for f(m-1) ≤ k ≤ f(m);
H_m(k) = (f(m+1) - k) / (f(m+1) - f(m)), for f(m) ≤ k ≤ f(m+1);
H_m(k) = 0, for k > f(m+1),
where f(m) denotes the centre frequency of the m-th triangular filter;
the logarithmic energy output by each filter bank is then calculated:
E(m) = ln( Σ_{k=0}^{N-1} |X(k)|² H_m(k) ), 1 ≤ m ≤ M,
where E(m) is the logarithmic energy and H_m(k) is the filter bank;
36) performing cepstrum analysis on the Mel filtered data to extract MFCC coefficients C (n), wherein the MFCC coefficients C (n) are extracted by performing Discrete Cosine Transform (DCT):
C(n) = Σ_{m=1}^{M} E(m) × cos( πn(m - 0.5)/M ),
where M represents the number of filters.
The method for constructing the transformer fault detection model comprises the following steps:
41) setting a transformer fault detection model based on a double-gate convolution neural network, wherein the transformer fault detection model comprises two convolution gate control layers, two pooling layers, a full-connection layer and an output layer;
42) setting a convolution gating layer: the convolution gating layer extracts features through convolution operation of input data and convolution kernels to obtain a feature map, and the depth of the feature map depends on the number of set convolution kernels;
Assume the input is X ∈ R^(A×B), where A and B represent the length and width of the input data, respectively; the convolution operation is defined as:
x_j^l = f( Σ_{i=1}^{M} x_i^(l-1) * k_ij^l + b_j^l ),
where x_j^l denotes the j-th feature map of the l-th convolution layer, x_i^(l-1) denotes the i-th feature map of the (l-1)-th convolution layer, M denotes the number of feature maps, k_ij^l denotes the convolution kernel, b_j^l denotes the bias, the specific values of k_ij^l and b_j^l are determined by training optimisation, and f denotes the excitation-layer function; commonly used excitation functions include the ReLU and sigmoid functions, the sigmoid being expressed as:
σ(x) = 1 / (1 + e^(-x));
the output of the gating is represented as:
h(X) = (X * W + b) ⊗ σ(X * V + c),
where h(X) denotes the output of the gated CNN, W and V denote different convolution kernels, b and c denote different biases, ⊗ denotes element-wise multiplication of matrices, σ denotes the sigmoid gating function, and X denotes the output of the previous layer;
43) designing a pooling layer: setting a pooling layer by adopting a maximum pooling or average pooling method;
44) and building a full connection layer and adopting a softmax classifier as an output layer for final classification output.
The training of the transformer fault detection model comprises the following steps:
51) inputting the MFCC coefficients extracted from the training sample set into a transformer fault detection model;
52) and continuously iterating forward propagation and backward propagation of the MFCC coefficients in the transformer fault detection model, and performing parameter optimization to obtain the trained transformer fault detection model.
Advantageous effects
Compared with the prior art, the transformer fault detection method based on the acoustic signal and deep learning technology can detect transformer faults from the voiceprint signal (a non-vibration signal). Transformer sounds are collected and labelled as "normal" or "fault" to construct a transformer working-condition sound data set; to counter the interference of environmental noise, the data set is preprocessed for denoising; and, considering the practicality of the trained model and to strengthen the robustness of the subsequently built neural network, several data enhancement operations are applied to the data. The transformer sound data set is then converted into MFCC features and fed into the network for training. When a basic CNN network is used for fault detection under complex conditions, the small amount or long duration of the audio data causes low training efficiency and overfitting, and hence low detection accuracy.
To improve the transformer fault detection effect, the invention designs a double-gated convolutional neural network sound detection model for the transformer, providing a favourable methodological basis and a realised detection model for on-line transformer fault monitoring.
Drawings
FIG. 1 is a sequence diagram of the method of the present invention;
FIG. 2 is a logic flow diagram of a method implementation of the present invention;
fig. 3 is a structural diagram of a double-gated convolutional neural network according to the present invention.
Detailed Description
So that the above-recited features of the present invention can be clearly and readily understood, a more particular description of the invention, briefly summarized above, is given below with reference to embodiments, some of which are illustrated in the appended drawings, wherein:
as shown in fig. 1 and fig. 2, the method for detecting a transformer fault based on an acoustic signal and a deep learning technique according to the present invention includes the following steps:
firstly, acquiring sound data of the power transformer. The sound data of the transformer is acquired through the field acquisition of the voiceprint acquisition sensor, and is divided into a normal class and a fault class through marking, and the normal class and the fault class are defined as a training sample set. In a laboratory link, the acquired sound data of the transformer can be divided into two types, namely a normal working state and a fault working state, an experiment database is established, and a training set and a test set of the experiment database are divided according to a certain proportion.
And secondly, preprocessing the acoustic signals in the training sample set. The collected power transformer sound data are denoised by a preprocessing chain of segmentation, framing, windowing and adaptive filtering; the acoustic signal is then data-enhanced by cutting, noise adding and tuning. The method comprises the following specific steps:
(1) carrying out segmentation operation on the collected sound data s (t) of the power transformer:
the obtained power transformer sound data is segmented,
is divided into s (t) ═ s1(t),s2(t),...,sq(t),...sr(t) }, calculating the total length L of the voiceprint data, and the calculation formula is as follows:
L=time×fSSample=r×rL
wherein fs issampleFor the sampling frequency of the sound, time is the sampling time, r is the number of segments, rLIs the segment length.
(2) Framing the segmented transformer sound data s_q(t):
the transformer voiceprint frame length is set to 500 ms, and framing gives
s_q(t) = {s_q1(t), s_q2(t), ..., s_qp(t), ..., s_qLength(t)},
where each frame is 500 ms long and each segment is divided into Length frames.
(3) Windowing the sound of the transformer after framing:
The frame data are smoothed at the end points by windowing, each frame being multiplied by a Hamming window whose function w(t) is:
w(t) = 0.54 - 0.46 × cos(2πt/(M - 1)), 0 ≤ t ≤ M - 1,
where M is the frame length and t is the time;
the time-domain signal of each frame is then obtained as
f_qp(t) = s_qp(t) × w(t),
where f_qp(t) is the time-domain signal of the p-th frame of the q-th segment, w(t) is the window function, and s_qp(t) is the signal value of the p-th frame of the q-th segment.
(4) Denoising the windowed sound by using an adaptive filter:
f_qp(t) is sampled to obtain the digital signal sequence X_i(n), and the initial filter values are set as:
W(0) = 0, 0 < μ < 1/λ,
where W(k) is the optimal weight coefficient, μ is the convergence factor, and λ is the maximum eigenvalue of the correlation matrix;
the actual output estimate of the filter is calculated as:
y(k) = W^T(k) X_i(n),
where y(k) is the actual output estimate, W^T(k) is the transpose of the optimal weight coefficients, and X_i(n) is the input signal sequence;
the error signal e(k) is calculated as:
e(k) = d(k) - y(k),
where d(k) is the desired output value; the filter coefficients at time k+1 are then updated:
W(k+1) = W(k) + μ e(k) X(k),
where W(k+1) is the optimal weight coefficient at time k+1, W(k) is the optimal weight coefficient at time k, e(k) is the error at time k, and X(k) is the input sequence at time k;
and continuously iterating and solving by using a steepest descent method to minimize the error signal and obtain the output y (k) after the adaptive filtering and denoising.
(5) Performing data enhancement on the adaptively filtered and denoised output y(k) through cutting, noise adding and tuning.
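The following numpy sketch strings together the framing, Hamming windowing, LMS adaptive filtering and data enhancement described in steps (2) to (5); the sampling rate, filter length, step size, reference-signal choice and augmentation strengths are assumptions for illustration, not values prescribed by the invention.

```python
import numpy as np

FS = 16000                                   # assumed sampling frequency
FRAME = int(0.5 * FS)                        # 500 ms frames as in step (2)

def frame_and_window(segment):
    """Steps (2)-(3): split one segment into 500 ms frames and apply a Hamming window."""
    n = len(segment) // FRAME
    frames = segment[:n * FRAME].reshape(n, FRAME)
    return frames * np.hamming(FRAME)        # w(t) = 0.54 - 0.46*cos(2*pi*t/(M-1))

def lms_denoise(x, d, n_taps=32, mu=0.01):
    """Step (4): LMS recursion y(k) = W^T X(k), e(k) = d(k) - y(k), W <- W + mu*e*X."""
    w, y = np.zeros(n_taps), np.zeros(len(x))
    for k in range(n_taps, len(x)):
        xk = x[k - n_taps:k][::-1]           # current input vector X(k)
        y[k] = w @ xk                        # filter output estimate
        e = d[k] - y[k]                      # error against the desired signal d(k)
        w += mu * e * xk                     # steepest-descent weight update
    return y

def augment(y):
    """Step (5): cutting, noise adding and tuning (strengths are illustrative)."""
    cut = y[: int(0.8 * len(y))]                              # cutting: keep the first 80 %
    noisy = y + 0.005 * np.random.randn(len(y))               # noise adding: weak white noise
    t_new = np.linspace(0, len(y) - 1, int(len(y) / 1.1))     # tuning: ~10 % faster playback
    tuned = np.interp(t_new, np.arange(len(y)), y)
    return cut, noisy, tuned

# usage on one synthetic 10 s segment standing in for a recorded segment
frames = frame_and_window(np.random.randn(10 * FS))
denoised = [lms_denoise(f, np.roll(f, 1)) for f in frames]    # delayed copy used as reference d(k)
augmented = [variant for f in denoised for variant in augment(f)]
```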
Thirdly, extracting the sound features of the sound signal data. As in speech recognition, audio data can be used in two ways during fault detection: the first is to input the one-dimensional signal directly for processing, and the second is to convert the audio into an image-like representation from which audio features are extracted for subsequent processing; one of the most commonly used such features is the Mel-frequency cepstral coefficient (MFCC). Sound feature extraction is performed on the preprocessed power transformer sound data using the Mel cepstrum, the MFCC coefficients are extracted, and the extracted MFCC coefficients are later fed into the network to generate feature maps. The method comprises the following specific steps:
(1) Inverse-transforming the data-enhanced y(k) to regenerate s(t) and performing a pre-emphasis operation on s(t), calculated as follows:
y(z)=s(z)H(z),
where y(z) is the output of the pre-emphasis, s(z) is the z-domain representation of the signal s(t), and H(z) is the transfer function of the high-pass filter:
H(z) = 1 - μz^(-1),
where μ has a value between 0.9 and 1.0. Passing the sound data through this high-pass filter boosts the energy of the high-frequency part so that it stays as consistent as possible with the low-frequency part, allowing the spectrum to be obtained with the same signal-to-noise ratio over the whole frequency band.
(2) Framing the pre-emphasised transformer sound data s_q(t):
the transformer voiceprint frame length is set to 30 ms, and the transformer sound data are divided into 30 ms frames.
(3) Windowing the framed sound data. The purpose of the windowing function is to increase the continuity of the left and right ends of each frame of sound by multiplying each frame of sound by the windowing function.
Windowing is carried out on the frames by using a hamming window, and if the output obtained by the previous two steps of preprocessing is S (n) and the window function is W (n), the calculation formula is as follows:
S'(n)=S(n)×W(n);
wherein the window function W (n) is:
W(n) = (1 - a) - a × cos(2πn/(N - 1)), 0 ≤ n ≤ N - 1,
where n is the sequence index, N is the number of sampling points, and a is a set constant; different values of a give different Hamming windows.
(4) And carrying out fast Fourier transform on the windowed data. Fast Fourier Transform (FFT) is a common way to transform a time-domain signal into the frequency domain.
Performing an FFT on each frame signal gives the frequency-domain signal X(k):
X(k) = Σ_{n=0}^{N-1} x(n) e^(-j2πkn/N), 0 ≤ k ≤ N - 1,
where X(k) is the frequency-domain output, x(n) is the time-domain input, and N is the number of sampling points.
(5) Carrying out Mel filtering on the data after the fast Fourier transform, wherein the conversion formula of the Mel filtering is as follows:
mel(f) = 2595 × log10(1 + f/700),
where f is the physical frequency and mel(f) is the Mel frequency; the resulting Mel frequencies are filtered by M Mel-scale triangular filter banks, the filter bank being defined as:
H_m(k) = 0, for k < f(m-1);
H_m(k) = (k - f(m-1)) / (f(m) - f(m-1)), for f(m-1) ≤ k ≤ f(m);
H_m(k) = (f(m+1) - k) / (f(m+1) - f(m)), for f(m) ≤ k ≤ f(m+1);
H_m(k) = 0, for k > f(m+1),
where f(m) denotes the centre frequency of the m-th triangular filter;
the logarithmic energy output by each filter bank is then calculated:
E(m) = ln( Σ_{k=0}^{N-1} |X(k)|² H_m(k) ), 1 ≤ m ≤ M,
where E(m) is the logarithmic energy and H_m(k) is the filter bank.
(6) Performing cepstrum analysis on the Mel-filtered data to extract the MFCC coefficients C(n), obtained by a discrete cosine transform (DCT):
C(n) = Σ_{m=1}^{M} E(m) × cos( πn(m - 0.5)/M ),
where M represents the number of filters. The larger M is, the more feature values are extracted per frame and the greater the amount of information, so the signal can be described more accurately.
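A compact numpy/scipy sketch of steps (1) to (6) for a single frame (pre-emphasis, windowing, FFT, Mel filter bank, log energy, DCT); the sampling rate, FFT size and the numbers of filters and coefficients are assumptions chosen only for illustration.

```python
import numpy as np
from scipy.fftpack import dct

def mfcc_frame(frame, fs=16000, n_fft=512, n_mels=26, n_mfcc=13, mu=0.97):
    """MFCC of one frame: pre-emphasis -> Hamming window -> |FFT|^2 -> Mel filters -> log -> DCT."""
    s = np.append(frame[0], frame[1:] - mu * frame[:-1])            # H(z) = 1 - mu*z^-1
    spec = np.abs(np.fft.rfft(s * np.hamming(len(s)), n_fft)) ** 2  # power spectrum |X(k)|^2
    # Mel filter bank: M triangular filters spaced evenly on the Mel scale up to fs/2
    mel_pts = np.linspace(0, 2595 * np.log10(1 + (fs / 2) / 700), n_mels + 2)
    hz_pts = 700 * (10 ** (mel_pts / 2595) - 1)
    bins = np.floor((n_fft + 1) * hz_pts / fs).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, centre, right = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, left:centre] = (np.arange(left, centre) - left) / max(centre - left, 1)
        fbank[m - 1, centre:right] = (right - np.arange(centre, right)) / max(right - centre, 1)
    log_e = np.log(fbank @ spec + 1e-10)                            # E(m), log filter-bank energies
    return dct(log_e, type=2, norm='ortho')[:n_mfcc]                # C(n) via the DCT

coeffs = mfcc_frame(np.random.randn(480))                           # e.g. one 30 ms frame at 16 kHz
```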
Fourthly, constructing a transformer fault detection model: and constructing a transformer fault detection model based on the characteristics of the double-gating convolution network model and the transformer acoustic signal.
The transformer acoustic signal can reflect the working condition of the transformer, and traditional manual troubleshooting requires experienced personnel to perform acoustic inspection on site. This way of detecting transformer faults has a certain lag: under normal circumstances, only after a transformer has failed somewhere are the transformers there checked one by one with experienced manual intervention. Cultivating personnel with rich transformer troubleshooting experience takes years of practical operation and consumes a great deal of time and cost. An on-line transformer monitoring system using deep learning can monitor faults quickly and conveniently; at present, however, deep learning methods, like traditional fault detection methods, mostly still use vibration data, and there is little research on fault detection using only audio data. Compared with vibration data, audio data are easier to obtain, lower in cost and better suited to large-scale practical application. The invention therefore proposes to detect faults from transformer audio data on the basis of a double-gated convolutional neural network and designs a model for detecting and identifying transformer faults based on the acoustic signal.
Because transformers are usually placed outdoors, their acoustic signals often contain strong noise; when the noise energy is too large, the true acoustic signal emitted by the transformer itself is submerged and the trained detection model performs poorly. In the preceding steps the invention therefore denoises the collected acoustic signals, using adaptive filtering to reduce the noise energy while preserving the transformer's own acoustic signal. On top of the denoising, several data enhancement modes are applied to the transformer sound data to expand the sample set, so that a sample set covering more complex conditions is fed to the network during training and the robustness of the transformer detection model is enhanced.
For the input processing of the transformer audio data, if the audio were fed into network training directly in one-dimensional form, the memory required for training would be too large to realise physically. And if the one-dimensional sound data were treated as a time-domain signal and input directly, the network could extract only a small part of the feature information, which is unfavourable for training the network model. Therefore, before the data are input into the network, MFCC feature extraction is used to convert the time-domain signal into a time-frequency signal, which is fed to the network as a two-dimensional signal. The MFCC coefficients are a time-frequency representation containing more feature information, so the network can extract more effective features and the trained model is better.
A general convolutional neural network, constrained by the audio format, suffers from low training efficiency and frequent overfitting, and hence low fault detection accuracy. Unlike a general convolutional neural network, the invention adopts the gated-convolution network structure and adds a gating switch to the output of the convolutional network. The gating switch buffers and controls the flow of information during training, which mitigates overfitting and improves training efficiency. Considering the characteristics of transformer sound, the invention adds a further convolution gating layer on top of the single-gated convolutional network and designs a double-gated convolutional neural network. With two convolution gating layers, the double-gated network can effectively extract deep-level features of the transformer's own sound, improving the training effect and yielding a well-performing trained model.
The method comprises the following specific steps:
(1) as shown in fig. 3, a transformer fault detection model is set based on a double-gated convolutional neural network, which includes two convolutional gating layers, two pooling layers, a full-link layer, and an output layer.
(2) Setting a convolution gating layer: the convolution gating layer extracts features through convolution operation of input data and convolution kernels to obtain a feature map, and the depth of the feature map depends on the number of set convolution kernels;
Assume the input is X ∈ R^(A×B), where A and B represent the length and width of the input data, respectively; the convolution operation is defined as:
x_j^l = f( Σ_{i=1}^{M} x_i^(l-1) * k_ij^l + b_j^l ),
where x_j^l denotes the j-th feature map of the l-th convolution layer, x_i^(l-1) denotes the i-th feature map of the (l-1)-th convolution layer, M denotes the number of feature maps, k_ij^l denotes the convolution kernel, b_j^l denotes the bias, the specific values of k_ij^l and b_j^l are determined by training optimisation, and f denotes the excitation-layer function; commonly used excitation functions include the ReLU and sigmoid functions, the sigmoid being expressed as:
σ(x) = 1 / (1 + e^(-x));
the output of the gating is represented as:
h(X) = (X * W + b) ⊗ σ(X * V + c),
where h(X) denotes the output of the gated CNN, W and V denote different convolution kernels, b and c denote different biases, ⊗ denotes element-wise multiplication of matrices, σ denotes the sigmoid gating function, and X denotes the output of the previous layer. Through this gating structure, the gradient dispersion of the model is reduced, training of the model is promoted, and the model structure is simplified while its nonlinear representation capability is retained;
(3) Designing a pooling layer: when the convolution kernel is small, the feature map after the convolution layer is still large, so dimensionality reduction is performed by a pooling operation, which reduces the number of features and enhances the robustness of the network to the input features; the pooling layer is set using maximum pooling or average pooling.
(4) And building a full connection layer and adopting a softmax classifier as an output layer for final classification output.
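A minimal PyTorch sketch of the structure just described, with two convolution gating layers computing h(X) = (X*W + b) ⊗ σ(X*V + c), two pooling layers, a fully connected layer and a softmax output; the channel counts, kernel size and pooling choices are assumptions, not the exact dimensions used by the invention.

```python
import torch
import torch.nn as nn

class GatedConv2d(nn.Module):
    """One convolution gating layer: h(X) = (X*W + b) ⊗ sigmoid(X*V + c)."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)     # X*W + b
        self.gate = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)     # X*V + c
    def forward(self, x):
        return self.conv(x) * torch.sigmoid(self.gate(x))           # element-wise gating

class DualGatedCNN(nn.Module):
    """Two gated convolution layers, two pooling layers, a fully connected layer and softmax output."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            GatedConv2d(1, 16), nn.MaxPool2d(2),
            GatedConv2d(16, 32), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d((4, 4)),        # assumption: fix the map size before the FC layer
        )
        self.fc = nn.Linear(32 * 4 * 4, n_classes)
    def forward(self, x):                        # x: (batch, 1, frames, n_mfcc)
        z = self.features(x).flatten(1)
        return torch.softmax(self.fc(z), dim=1)  # probabilities for the "normal"/"fault" classes
```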
And fifthly, training a transformer fault detection model: and inputting the extracted MFCC coefficients into a transformer fault detection model for training. The method comprises the following specific steps:
(1) inputting the MFCC coefficients extracted from the training sample set into a transformer fault detection model;
(2) and continuously iterating forward propagation and backward propagation of the MFCC coefficients in the transformer fault detection model, and performing parameter optimization to obtain the trained transformer fault detection model.
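Continuing the previous sketch, a minimal training loop that iterates forward and backward propagation and optimises the parameters; the optimiser, learning rate, batch size and epoch count are assumptions.

```python
import torch

model = DualGatedCNN(n_classes=2)                       # model defined in the previous sketch
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = torch.nn.NLLLoss()                            # the model already ends in softmax

# Stand-in batch: MFCC feature maps (batch, 1, frames, n_mfcc) and labels 0 = normal, 1 = fault.
x_batch = torch.randn(8, 1, 64, 13)
y_batch = torch.randint(0, 2, (8,))

for epoch in range(50):                                 # continuous iteration of training
    probs = model(x_batch)                              # forward propagation
    loss = loss_fn(torch.log(probs + 1e-9), y_batch)    # negative log-likelihood of the true class
    optimizer.zero_grad()
    loss.backward()                                     # backward propagation
    optimizer.step()                                    # parameter optimisation
```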
Sixthly, acquiring and preprocessing the acoustic signal data of the transformer to be detected: acquiring transformer acoustic signal data to be detected, performing denoising pretreatment, and extracting MFCC coefficients from the transformer acoustic signal data to be detected after the denoising pretreatment.
Seventhly, obtaining a fault detection result of the transformer to be detected: and inputting the MFCC coefficient into the trained transformer fault detection model to obtain a transformer fault detection result.
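Steps six and seven reuse the training-time preprocessing and MFCC extraction on the unit under test; a short inference sketch continuing the previous code (the feature-map shape is an assumption).

```python
import torch

model.eval()                                            # trained model from the previous sketches
with torch.no_grad():
    feats = torch.randn(1, 1, 64, 13)                   # MFCC map of the denoised field recording
    probs = model(feats)                                # class probabilities from the softmax output
    result = "fault" if probs.argmax(dim=1).item() == 1 else "normal"
print(result)
```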
The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are merely illustrative of the principles of the invention, but that various changes and modifications may be made without departing from the spirit and scope of the invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (5)

1. A transformer fault detection method based on an acoustic signal and deep learning technology is characterized by comprising the following steps:
11) acquiring sound data of the power transformer: acquiring sound data of the transformer by a voiceprint acquisition sensor in a field, marking the sound data into a normal class and a fault class, and defining the sound data as a training sample set;
12) preprocessing acoustic signals in a training sample set: carrying out denoising pretreatment on the collected power transformer sound data by using a pretreatment method of segmentation, framing, sound windowing and a self-adaptive filtering method; carrying out data enhancement on the acoustic signal through cutting, noise adding and tuning;
13) sound feature extraction of sound signal data: extracting sound features of the preprocessed power transformer sound data by adopting a Mel cepstrum coefficient, and extracting an MFCC coefficient;
14) constructing a transformer fault detection model: constructing a transformer fault detection model based on the characteristics of the double-gating convolution network model and the transformer acoustic signal;
15) training a transformer fault detection model: inputting the extracted MFCC coefficients into a transformer fault detection model for training;
16) acquiring and preprocessing acoustic signal data of the transformer to be detected: acquiring transformer acoustic signal data to be detected, performing denoising pretreatment, and extracting MFCC coefficients from the transformer acoustic signal data to be detected after the denoising pretreatment;
17) obtaining a fault detection result of the transformer to be detected: and inputting the MFCC coefficient into the trained transformer fault detection model to obtain a transformer fault detection result.
2. The method for detecting the fault of the transformer based on the acoustic signal and the deep learning technology as claimed in claim 1, wherein the pre-processing of the acoustic signal in the training sample set comprises the following steps:
21) carrying out segmentation operation on the collected sound data s (t) of the power transformer:
the obtained power transformer sound data is segmented,
is divided into s (t) ═ s1(t),s2(t),...,sq(t),...sr(t) }, calculating the total length L of the voiceprint data, and the calculation formula is as follows:
L=time×fSSample=r×rL
wherein fs issampleFor the sampling frequency of the sound, time is the sampling time, r is the number of segments, rLIs the segment length;
22) framing the segmented transformer sound data s_q(t):
the transformer voiceprint frame length is set to 500 ms, and framing gives
s_q(t) = {s_q1(t), s_q2(t), ..., s_qp(t), ..., s_qLength(t)},
where each frame is 500 ms long and each segment is divided into Length frames;
23) windowing the sound of the transformer after framing:
the frame data are smoothed at the end points by windowing, each frame being multiplied by a Hamming window whose function w(t) is:
w(t) = 0.54 - 0.46 × cos(2πt/(M - 1)), 0 ≤ t ≤ M - 1,
where M is the frame length and t is the time;
the time-domain signal of each frame is then obtained as
f_qp(t) = s_qp(t) × w(t),
where f_qp(t) is the time-domain signal of the p-th frame of the q-th segment, w(t) is the window function, and s_qp(t) is the signal value of the p-th frame of the q-th segment;
24) denoising the windowed sound by using an adaptive filter:
f_qp(t) is sampled to obtain the digital signal sequence X_i(n), and the initial filter values are set as:
W(0) = 0, 0 < μ < 1/λ,
where W(k) is the optimal weight coefficient, μ is the convergence factor, and λ is the maximum eigenvalue of the correlation matrix;
the actual output estimate of the filter is calculated as:
y(k) = W^T(k) X_i(n),
where y(k) is the actual output estimate, W^T(k) is the transpose of the optimal weight coefficients, and X_i(n) is the input signal sequence;
the error signal e(k) is calculated as:
e(k) = d(k) - y(k),
where d(k) is the desired output value; the filter coefficients at time k+1 are then updated:
W(k+1) = W(k) + μ e(k) X(k),
where W(k+1) is the optimal weight coefficient at time k+1, W(k) is the optimal weight coefficient at time k, e(k) is the error at time k, and X(k) is the input sequence at time k;
continuously iterating and solving by using a steepest descent method to minimize an error signal, and obtaining output y (k) after adaptive filtering and denoising;
25) performing data enhancement on the adaptively filtered and denoised output y(k) through cutting, noise adding and tuning.
3. The transformer fault detection method based on the acoustic signal and deep learning technology as claimed in claim 1, wherein the acoustic feature extraction of the acoustic signal data comprises the following steps:
31) inverse-transforming the data-enhanced y(k) to regenerate s(t) and performing a pre-emphasis operation on s(t), calculated as follows:
y(z)=s(z)H(z),
where y(z) is the output of the pre-emphasis, s(z) is the z-domain representation of the signal s(t), and H(z) is the transfer function of the high-pass filter:
H(z) = 1 - μz^(-1),
where μ has a value between 0.9 and 1.0;
32) framing the pre-emphasised transformer sound data s_q(t):
the transformer voiceprint frame length is set to 30 ms, and the transformer sound data are divided into 30 ms frames;
33) windowing the framed sound data: windowing is carried out on the frames by using a hamming window, and if the output obtained by the previous two steps of preprocessing is S (n) and the window function is W (n), the calculation formula is as follows:
S'(n)=S(n)×W(n);
wherein the window function W (n) is:
W(n) = (1 - a) - a × cos(2πn/(N - 1)), 0 ≤ n ≤ N - 1,
where n is the sequence index, N is the number of sampling points, and a is a set constant; different values of a give different Hamming windows;
34) and performing fast Fourier transform on the windowed data:
performing an FFT on each frame signal to obtain the frequency-domain signal X(k):
X(k) = Σ_{n=0}^{N-1} x(n) e^(-j2πkn/N), 0 ≤ k ≤ N - 1,
where X(k) is the frequency-domain output, x(n) is the time-domain input, and N is the number of sampling points;
35) carrying out Mel filtering on the data after the fast Fourier transform, wherein the conversion formula of the Mel filtering is as follows:
mel(f) = 2595 × log10(1 + f/700),
where f is the physical frequency and mel(f) is the Mel frequency; the resulting Mel frequencies are filtered by M Mel-scale triangular filter banks, the filter bank being defined as:
H_m(k) = 0, for k < f(m-1);
H_m(k) = (k - f(m-1)) / (f(m) - f(m-1)), for f(m-1) ≤ k ≤ f(m);
H_m(k) = (f(m+1) - k) / (f(m+1) - f(m)), for f(m) ≤ k ≤ f(m+1);
H_m(k) = 0, for k > f(m+1),
where f(m) denotes the centre frequency of the m-th triangular filter;
the logarithmic energy output by each filter bank is then calculated:
E(m) = ln( Σ_{k=0}^{N-1} |X(k)|² H_m(k) ), 1 ≤ m ≤ M,
where E(m) is the logarithmic energy and H_m(k) is the filter bank;
36) performing cepstrum analysis on the Mel filtered data to extract MFCC coefficients C (n), wherein the MFCC coefficients C (n) are extracted by performing Discrete Cosine Transform (DCT):
C(n) = Σ_{m=1}^{M} E(m) × cos( πn(m - 0.5)/M ),
where M represents the number of filters.
4. The method for detecting the transformer fault based on the acoustic signal and the deep learning technology as claimed in claim 1, wherein the step of constructing the transformer fault detection model comprises the following steps:
41) setting a transformer fault detection model based on a double-gate convolution neural network, wherein the transformer fault detection model comprises two convolution gate control layers, two pooling layers, a full-connection layer and an output layer;
42) setting a convolution gating layer: the convolution gating layer extracts features through convolution operation of input data and convolution kernels to obtain a feature map, and the depth of the feature map depends on the number of set convolution kernels;
Assume the input is X ∈ R^(A×B), where A and B represent the length and width of the input data, respectively; the convolution operation is defined as:
x_j^l = f( Σ_{i=1}^{M} x_i^(l-1) * k_ij^l + b_j^l ),
where x_j^l denotes the j-th feature map of the l-th convolution layer, x_i^(l-1) denotes the i-th feature map of the (l-1)-th convolution layer, M denotes the number of feature maps, k_ij^l denotes the convolution kernel, b_j^l denotes the bias, the specific values of k_ij^l and b_j^l are determined by training optimisation, and f denotes the excitation-layer function, the excitation function being a sigmoid function expressed as:
σ(x) = 1 / (1 + e^(-x));
the output of the gating is represented as:
h(X) = (X * W + b) ⊗ σ(X * V + c),
where h(X) denotes the output of the gated CNN, W and V denote different convolution kernels, b and c denote different biases, ⊗ denotes element-wise multiplication of matrices, σ denotes the sigmoid gating function, and X denotes the output of the previous layer;
43) designing a pooling layer: setting a pooling layer by adopting a maximum pooling or average pooling method;
44) and building a full connection layer and adopting a softmax classifier as an output layer for final classification output.
5. The method for detecting the transformer fault based on the acoustic signal and the deep learning technology as claimed in claim 1, wherein the training of the transformer fault detection model comprises the following steps:
51) inputting the MFCC coefficients extracted from the training sample set into a transformer fault detection model;
52) and continuously iterating forward propagation and backward propagation of the MFCC coefficients in the transformer fault detection model, and performing parameter optimization to obtain the trained transformer fault detection model.
CN202111026413.3A 2021-09-02 2021-09-02 Transformer fault detection method based on acoustic signal and deep learning technology Active CN113707176B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111026413.3A CN113707176B (en) 2021-09-02 2021-09-02 Transformer fault detection method based on acoustic signal and deep learning technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111026413.3A CN113707176B (en) 2021-09-02 2021-09-02 Transformer fault detection method based on acoustic signal and deep learning technology

Publications (2)

Publication Number Publication Date
CN113707176A true CN113707176A (en) 2021-11-26
CN113707176B CN113707176B (en) 2022-09-09

Family

ID=78657439

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111026413.3A Active CN113707176B (en) 2021-09-02 2021-09-02 Transformer fault detection method based on acoustic signal and deep learning technology

Country Status (1)

Country Link
CN (1) CN113707176B (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113466616A (en) * 2021-06-22 2021-10-01 海南电网有限责任公司乐东供电局 Method and device for quickly positioning cable fault point
CN114155878A (en) * 2021-12-03 2022-03-08 北京中科智易科技有限公司 Artificial intelligence detection system, method and computer program
CN114543983A (en) * 2022-03-29 2022-05-27 阿里云计算有限公司 Vibration signal identification method and device
CN114638256A (en) * 2022-02-22 2022-06-17 合肥华威自动化有限公司 Transformer fault detection method and system based on sound wave signals and attention network
CN115358110A (en) * 2022-07-25 2022-11-18 国网江苏省电力有限公司淮安供电分公司 Transformer fault detection system based on acoustic sensor array
CN115392293A (en) * 2022-08-01 2022-11-25 中国南方电网有限责任公司超高压输电公司昆明局 Transformer fault monitoring method and device, computer equipment and storage medium
CN115424635A (en) * 2022-11-03 2022-12-02 南京凯盛国际工程有限公司 Cement plant equipment fault diagnosis method based on sound characteristics
CN116189711A (en) * 2023-04-26 2023-05-30 四川省机场集团有限公司 Transformer fault identification method and device based on acoustic wave signal monitoring
CN116645978A (en) * 2023-06-20 2023-08-25 方心科技股份有限公司 Electric power fault sound class increment learning system and method based on super-computing parallel environment
CN117894317A (en) * 2024-03-14 2024-04-16 沈阳智帮电气设备有限公司 Box-type transformer on-line monitoring method and system based on voiceprint analysis
CN117894317B (en) * 2024-03-14 2024-05-24 沈阳智帮电气设备有限公司 Box-type transformer on-line monitoring method and system based on voiceprint analysis

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109961034A (en) * 2019-03-18 2019-07-02 西安电子科技大学 Video object detection method based on convolution gating cycle neural unit
CN110335617A (en) * 2019-05-24 2019-10-15 国网新疆电力有限公司乌鲁木齐供电公司 A kind of noise analysis method in substation
CN110534118A (en) * 2019-07-29 2019-12-03 安徽继远软件有限公司 Transformer/reactor method for diagnosing faults based on Application on Voiceprint Recognition and neural network
US20210048487A1 (en) * 2019-08-12 2021-02-18 Wuhan University Power transformer winding fault positioning method based on deep convolutional neural network integrated with visual identification
CN112910812A (en) * 2021-02-25 2021-06-04 电子科技大学 Modulation mode identification method for deep learning based on space-time feature extraction
CN113192532A (en) * 2021-03-29 2021-07-30 安徽理工大学 Mine hoist fault acoustic analysis method based on MFCC-CNN

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109961034A (en) * 2019-03-18 2019-07-02 西安电子科技大学 Video object detection method based on convolution gating cycle neural unit
CN110335617A (en) * 2019-05-24 2019-10-15 国网新疆电力有限公司乌鲁木齐供电公司 A kind of noise analysis method in substation
CN110534118A (en) * 2019-07-29 2019-12-03 安徽继远软件有限公司 Transformer/reactor method for diagnosing faults based on Application on Voiceprint Recognition and neural network
US20210048487A1 (en) * 2019-08-12 2021-02-18 Wuhan University Power transformer winding fault positioning method based on deep convolutional neural network integrated with visual identification
CN112910812A (en) * 2021-02-25 2021-06-04 电子科技大学 Modulation mode identification method for deep learning based on space-time feature extraction
CN113192532A (en) * 2021-03-29 2021-07-30 安徽理工大学 Mine hoist fault acoustic analysis method based on MFCC-CNN

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113466616A (en) * 2021-06-22 2021-10-01 海南电网有限责任公司乐东供电局 Method and device for quickly positioning cable fault point
CN114155878A (en) * 2021-12-03 2022-03-08 北京中科智易科技有限公司 Artificial intelligence detection system, method and computer program
CN114155878B (en) * 2021-12-03 2022-06-10 北京中科智易科技有限公司 Artificial intelligence detection system, method and computer program
CN114638256A (en) * 2022-02-22 2022-06-17 合肥华威自动化有限公司 Transformer fault detection method and system based on sound wave signals and attention network
CN114638256B (en) * 2022-02-22 2024-05-31 合肥华威自动化有限公司 Transformer fault detection method and system based on acoustic wave signals and attention network
CN114543983A (en) * 2022-03-29 2022-05-27 阿里云计算有限公司 Vibration signal identification method and device
CN115358110A (en) * 2022-07-25 2022-11-18 国网江苏省电力有限公司淮安供电分公司 Transformer fault detection system based on acoustic sensor array
CN115392293A (en) * 2022-08-01 2022-11-25 中国南方电网有限责任公司超高压输电公司昆明局 Transformer fault monitoring method and device, computer equipment and storage medium
CN115424635B (en) * 2022-11-03 2023-02-10 南京凯盛国际工程有限公司 Cement plant equipment fault diagnosis method based on sound characteristics
CN115424635A (en) * 2022-11-03 2022-12-02 南京凯盛国际工程有限公司 Cement plant equipment fault diagnosis method based on sound characteristics
CN116189711A (en) * 2023-04-26 2023-05-30 四川省机场集团有限公司 Transformer fault identification method and device based on acoustic wave signal monitoring
CN116189711B (en) * 2023-04-26 2023-06-30 四川省机场集团有限公司 Transformer fault identification method and device based on acoustic wave signal monitoring
CN116645978A (en) * 2023-06-20 2023-08-25 方心科技股份有限公司 Electric power fault sound class increment learning system and method based on super-computing parallel environment
CN116645978B (en) * 2023-06-20 2024-02-02 方心科技股份有限公司 Electric power fault sound class increment learning system and method based on super-computing parallel environment
CN117894317A (en) * 2024-03-14 2024-04-16 沈阳智帮电气设备有限公司 Box-type transformer on-line monitoring method and system based on voiceprint analysis
CN117894317B (en) * 2024-03-14 2024-05-24 沈阳智帮电气设备有限公司 Box-type transformer on-line monitoring method and system based on voiceprint analysis

Also Published As

Publication number Publication date
CN113707176B (en) 2022-09-09

Similar Documents

Publication Publication Date Title
CN113707176B (en) Transformer fault detection method based on acoustic signal and deep learning technology
CN110245608B (en) Underwater target identification method based on half tensor product neural network
CN109357749B (en) DNN algorithm-based power equipment audio signal analysis method
CN108827605B (en) Mechanical fault feature automatic extraction method based on improved sparse filtering
CN108630209B (en) Marine organism identification method based on feature fusion and deep confidence network
CN107545890A (en) A kind of sound event recognition method
CN108922513A (en) Speech differentiation method, apparatus, computer equipment and storage medium
CN105206270A (en) Isolated digit speech recognition classification system and method combining principal component analysis (PCA) with restricted Boltzmann machine (RBM)
KR102276964B1 (en) Apparatus and Method for Classifying Animal Species Noise Robust
CN109256118B (en) End-to-end Chinese dialect identification system and method based on generative auditory model
CN112326210A (en) Large motor fault diagnosis method combining sound vibration signals with 1D-CNN
CN113192532A (en) Mine hoist fault acoustic analysis method based on MFCC-CNN
CN115602152B (en) Voice enhancement method based on multi-stage attention network
CN111899757A (en) Single-channel voice separation method and system for target speaker extraction
CN112017682A (en) Single-channel voice simultaneous noise reduction and reverberation removal system
CN113488060B (en) Voiceprint recognition method and system based on variation information bottleneck
WO2019232833A1 (en) Speech differentiating method and device, computer device and storage medium
CN112086100B (en) Quantization error entropy based urban noise identification method of multilayer random neural network
CN112183225B (en) Underwater target signal feature extraction method based on probability latent semantic analysis
CN115758082A (en) Fault diagnosis method for rail transit transformer
CN116741148A (en) Voice recognition system based on digital twinning
CN115376526A (en) Power equipment fault detection method and system based on voiceprint recognition
CN108806725A (en) Speech differentiation method, apparatus, computer equipment and storage medium
CN115457980A (en) Automatic voice quality evaluation method and system without reference voice
CN111540368A (en) Stable bird sound extraction method and device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant