CN112581940A

CN112581940A - Discharging sound detection method based on edge calculation and neural network

Info

Publication number: CN112581940A
Application number: CN202010979821.XA
Authority: CN
Inventors: 缪巍巍; 曾锃; 张震; 李凤强; 李世豪; 王传君; 张厦千; 张明轩; 苏宇
Original assignee: Information and Telecommunication Branch of State Grid Jiangsu Electric Power Co Ltd
Current assignee: Information and Telecommunication Branch of State Grid Jiangsu Electric Power Co Ltd
Priority date: 2020-09-17
Filing date: 2020-09-17
Publication date: 2021-03-30

Abstract

The invention provides a discharging sound detection method based on edge calculation and a neural network. Aimed at the partial discharge phenomenon caused by insulation aging of equipment in an electric power system, the method deploys a signal detection model at an edge node to monitor three states of the power equipment in real time (normal operation, partial discharge and fault occurrence) and feeds abnormal states back to an operation and maintenance center. This helps the operation and maintenance center monitor equipment faults in real time, thereby improving the operation and maintenance response time of the electric power system, avoiding major electric power accidents caused by insulation degradation of the equipment, and reducing the operation and maintenance cost. The detection method is easy to implement, high in detection accuracy and suitable for popularization and use.

Description

Discharging sound detection method based on edge calculation and neural network

Technical Field

The invention relates to the field of voice recognition, in particular to a discharging sound detection method based on edge calculation and a neural network.

Background

High-voltage switch cabinets are among the most important electrical devices in electric power systems. The safety and reliability of power equipment are key to ultra-large-scale power transmission and distribution and to power grid security, and the safety and reliability of the high-voltage switch cabinet, as widely used power equipment, are receiving increasing attention. According to accident statistics for 6 kV-10 kV switch cabinets in the national power system between 2005 and 2011, accidents caused by insulation and current-carrying faults accounted for 50.2% of the total; deterioration of insulation parts accounted for 79.0% of the insulation accidents, and poor contact of isolation plugs accounted for 71.1% of the current-carrying accidents. It can be seen that the failure rate due to insulation deterioration or poor contact is high. Before a fault occurs, partial discharge and similar phenomena may be present in the high-voltage switch cabinet, so the operating state of the equipment can be assessed by detecting the related signals. How to effectively detect partial discharge in the switch cabinet and track its development, so as to find potential insulation faults in time, is therefore an increasingly urgent problem for power supervision departments and a challenge for researchers and research institutions. Appropriate live partial discharge monitoring of switch cabinets and ring main units in actual operation is thus of great significance.

Deep learning is one of the research fields that has developed rapidly in recent years, and it has achieved breakthroughs in many sub-fields of artificial intelligence. Most early machine learning and signal processing techniques use shallow structures, including support vector machines, Gaussian mixture models and logistic regression, and run into difficulty when complex natural signals are involved. Researchers have therefore proposed more effective deep learning methods that imitate the deep hierarchical structure of systems such as human vision and hearing, extracting complex structure from rich sensory input signals and building internal representations.

Since the 1990s, pattern recognition methods have been introduced into the field of partial discharge defect type recognition, but most research has focused on partial discharge signals of GIS and large transformers, and the research is still at an early stage. Neural network (NN) recognition is currently widely used; it is a machine learning method that follows the principle of empirical risk minimization.

Disclosure of Invention

The technical purpose of the invention is to provide a discharging sound detection method based on edge calculation and a neural network, which is used for carrying out training by utilizing known discharging sound data and carrying out classification detection on actual discharging sound data.

The technical scheme of the invention is as follows:

a discharging sound detection method based on edge calculation and a neural network is characterized by comprising the following steps:

step (A) collecting voice samples of the power equipment in three states of normal work, partial discharge and failure, extracting audio features of the voice samples, labeling them and constructing a data set;

step (B) establishing a neural network model with a multi-classification function, training the model by using the data set established in the step (A), and compiling the model by using multi-classification cross entropy and adam optimization algorithm;

step (C) carrying out accuracy and error analysis on the trained model;

step (D) taking the model whose accuracy meets the standard as a target model, deploying the target model at an edge node, detecting the electric leakage state of the power equipment at a node terminal by using the target model, and returning the detection result to an operation and maintenance center;

and (E) aiming at the detection result returned by the edge node, the operation and maintenance center carries out data statistics and analysis at the cloud end and makes a corresponding operation and maintenance response.

On the basis of the above scheme, a further improved or preferred scheme further comprises:

in the step (B), the neural network model with multi-classification functions comprises an input layer, a hidden layer and an output layer;

the calculation method of the hidden layer is as follows:

hidden＝f(W×X+b)

wherein hidden is the hidden layer output, f is a nonlinear activation function, W is the weight of the network, X is the input layer vector, i.e. the audio features extracted in step (A), and b is the bias of the network;

the calculation method of the output layer is as follows:

Y＝softmax(W_Y×hidden_last+b_Y)

wherein Y is the output of the output layer, hidden_last is the output value of the last hidden layer, W_Y is the weight of the output layer, b_Y is the bias of the output layer, and softmax is the activation function of the output layer;

since three leakage states are detected here, W_Y is a two-dimensional matrix with one dimension equal to 3, and b_Y is a vector of length 3;

the softmax function is defined as follows:

$$\mathrm{softmax}(x_j)=\frac{e^{x_j}}{\sum_{s=1}^{S}e^{x_s}}$$

wherein S is 3, corresponding to the 3 states of the power equipment; j is any one of the 3 states, 1 ≤ j ≤ 3; and x_j is the value of the speech sample x for state j; the formula represents the probability that the speech sample x is judged to be in state j, and when the output node is finally selected, the node with the maximum probability is selected as the prediction target.

Further, in step (B), the multi-class cross-entropy loss function is defined as follows:

$$Loss=-\sum_{q=0}^{C-1}y_q\log(Y_q)$$

where C denotes the number of speech samples, Y_q is the classification result of the q-th speech sample by the neural network model, and y_q is the true label of that sample.

Further, in step (C), the accuracy is defined as:

$$P=\frac{TP}{TP+FP}$$

wherein P is the accuracy, TP is the number of samples correctly classified by the model, and FP is the number of samples incorrectly classified by the model.

Further, the audio features are one or more of short-time average energy, short-time average amplitude function, short-time average zero crossing rate, short-time autocorrelation function, mel-frequency cepstrum correlation parameter, and formant correlation parameter.

Preferably, the present invention adopts mel cepstrum related parameters as audio features, and the specific process of step (a) includes:

a1) extracting the same number of frames from each voice sample, and then extracting the corresponding voice features from each frame to prepare for model training;

for each windowed frame x_i(n) of the speech signal, calculating its fast Fourier transform, where n is the index of the sample point in the speech time sequence and i is the index of the frame;

transformation from time domain data to frequency domain data:

X_i(k)＝FFT[x_i(n)]

a2) for the FFT result X_i(k) of each frame, calculating the spectral line energy E_i(k), where k denotes the k-th spectral line in the frequency domain;

E_i(k)＝|X_i(k)|²

a3) passing the spectral line energy of each frame through a Mel filter bank and calculating its energy S_i(m) in each filter:

$$S_i(m)=\sum_{k=0}^{N-1}E_i(k)\,H_m(k)$$

where N is the FFT length, m is the index of the filter, M is the total number of filters, and H_m(k) is the frequency response of the m-th Mel filter:

$$H_m(k)=\begin{cases}0, & k<f(m-1)\\[4pt] \dfrac{k-f(m-1)}{f(m)-f(m-1)}, & f(m-1)\le k\le f(m)\\[8pt] \dfrac{f(m+1)-k}{f(m+1)-f(m)}, & f(m)<k\le f(m+1)\\[4pt] 0, & k>f(m+1)\end{cases}$$

wherein f(m), f(m-1) and f(m+1) respectively represent the center frequencies of the m-th, (m-1)-th and (m+1)-th filters; f(m) is calculated as follows, and f(m-1) and f(m+1) are calculated analogously:

$$f(m)=\frac{N}{f_s}\,F_{Mel}^{-1}\!\left(F_{Mel}(f_l)+m\,\frac{F_{Mel}(f_h)-F_{Mel}(f_l)}{M+1}\right)$$

f_l and f_h are respectively the lowest and highest frequencies of the filter frequency range, N is the FFT length, f_s is the sampling frequency, F_Mel() is the function that converts the actual frequency in brackets to Mel frequency, and F_Mel^{-1}() is the inverse function of F_Mel(), i.e. the conversion of Mel frequency to actual frequency;

a4) decorrelating the S_i(m) obtained in step a3) by means of a discrete cosine transform (DCT) to obtain the final voice feature parameter MFCC, where mfcc_i(n) denotes the voice feature extracted from the i-th frame of speech.
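For reference, the DCT of step a4) is commonly written in the DCT-II form below. This is a sketch under assumptions rather than the exact expression of the method: the scaling factor, the index offset (2m+1), the indexing of the filters from 0 to M-1 and the symbol L (an assumed number of retained coefficients) follow one widespread convention and may differ from the form intended here.

$$mfcc_i(n)=\sqrt{\frac{2}{M}}\sum_{m=0}^{M-1}\log\left[S_i(m)\right]\cos\left(\frac{\pi n\,(2m+1)}{2M}\right),\qquad n=1,\dots,L$$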

Advantageous effects:

The invention provides a discharging sound detection method based on edge calculation and a neural network. Aimed at the partial discharge phenomenon caused by insulation aging of equipment in an electric power system, it monitors three states of the power equipment in real time (normal operation, partial discharge and fault occurrence) through a signal detection model arranged at an edge node and feeds abnormal states back to an operation and maintenance center, thereby helping the operation and maintenance center monitor equipment faults in real time, improving the operation and maintenance response time of the electric power system, avoiding major electric power accidents caused by insulation degradation of the equipment and reducing the operation and maintenance cost. Compared with traditional leakage detection, which requires a professional to inspect on site, the method achieves all-weather unmanned monitoring through the terminal edge node; compared with traditional monitoring equipment, whose data can only be interpreted by a professional to determine the leakage state, the method gives the leakage state directly from the model's output probabilities, reducing the dependence on specialist knowledge.

Drawings

FIG. 1 is a simplified flow diagram of the method of the present invention;

FIG. 2 is a block diagram of a neural network model for multiple classification functions;

FIG. 3 is a diagram of a system architecture corresponding to the detection method of the present invention;

FIG. 4 is a graph of the accuracy of the trained model in the training set and the validation set;

FIG. 5 is a graph of the error curves of the trained model over the training set and the validation set;

FIG. 6 is a schematic diagram of MFCC feature parameter extraction.

Detailed Description

To clarify the technical solution and working principle of the present invention, the present invention will be further described with reference to the accompanying drawings and specific embodiments.

As shown in FIG. 1, a discharging sound detection method based on edge calculation and a neural network specifically includes the following steps:

(A) Collecting, with a voice sensor, voice samples of the power equipment in three states of normal work, partial discharge and failure, extracting audio features of the voice samples, labeling them, and constructing a data set, wherein the data set comprises a training set, a verification set and a test set.

The audio features are one or more of short-time average energy, short-time average amplitude function, short-time average zero-crossing rate, short-time autocorrelation function, mel-frequency cepstrum related parameters and formant related parameters. In this embodiment, Mel Frequency Cepstrum Coefficients (MFCCs) are preferably used as the audio features.

Mel Frequency Cepstrum Coefficients (MFCCs) analyze the frequency spectrum of speech according to the results of human auditory experiments. Because human subjective perception of frequency is not linear, the relationship between the perceived frequency and the actual frequency is given by the following formula.

F_Mel＝1125×ln(1+f/700)

In the formula, F_Mel is the perceived frequency in Mel units, f is the actual frequency in Hz, and ln is the natural logarithm. The MFCC feature parameter extraction principle is shown in FIG. 6, and the specific process is as follows:

transformation from time domain data to frequency domain data:

X_i(k)＝FFT[x_i(n)]

E_i(k)＝|X_i(k)|²

F_Mel^{-1}() is the inverse function of F_Mel(), i.e. the conversion of Mel frequency to actual frequency:

$$F_{Mel}^{-1}(d)=700\left(e^{d/1125}-1\right)$$

where d represents the argument of the inverse function, namely the Mel frequency;
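The MFCC extraction described above (FFT, spectral line energy, Mel filter bank, DCT) can be sketched in NumPy as follows. This is a minimal illustration, not the implementation of the embodiment: the Hamming window, the defaults of 24 filters and 12 coefficients, the DCT-II scaling, and the function names `hz_to_mel`, `mel_to_hz`, `mel_filter_bank` and `mfcc` are all assumptions, and the input is assumed to be already split into equal-length frames.

```python
import numpy as np

def hz_to_mel(f):
    """F_Mel(f) = 1125 * ln(1 + f / 700)."""
    return 1125.0 * np.log(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(d):
    """Inverse of hz_to_mel: 700 * (exp(d / 1125) - 1)."""
    return 700.0 * (np.exp(np.asarray(d, dtype=float) / 1125.0) - 1.0)

def mel_filter_bank(num_filters, n_fft, fs, f_low, f_high):
    """Triangular filters H_m(k) defined on FFT bins 0 .. n_fft // 2."""
    mel_points = np.linspace(float(hz_to_mel(f_low)), float(hz_to_mel(f_high)),
                             num_filters + 2)
    # Filter edge/center frequencies expressed as FFT bin indices.
    bins = np.floor(n_fft * mel_to_hz(mel_points) / fs).astype(int)
    bank = np.zeros((num_filters, n_fft // 2 + 1))
    for m in range(1, num_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge
            bank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge
            bank[m - 1, k] = (right - k) / (right - center)
    return bank

def mfcc(frames, fs, num_filters=24, num_coeffs=12, f_low=0.0, f_high=None):
    """frames: array of shape (num_frames, frame_len), one frame per row."""
    frames = np.asarray(frames, dtype=float)
    n_fft = frames.shape[1]
    f_high = fs / 2.0 if f_high is None else f_high
    window = np.hamming(n_fft)                         # assumed window type
    spectrum = np.fft.rfft(frames * window, axis=1)    # X_i(k)
    energy = np.abs(spectrum) ** 2                     # E_i(k)
    bank = mel_filter_bank(num_filters, n_fft, fs, f_low, f_high)
    s = energy @ bank.T                                # S_i(m)
    log_s = np.log(s + 1e-12)                          # small offset avoids log(0)
    m = np.arange(num_filters)
    n = np.arange(1, num_coeffs + 1)
    dct = np.sqrt(2.0 / num_filters) * np.cos(
        np.pi * np.outer(n, 2 * m + 1) / (2 * num_filters))
    return log_s @ dct.T                               # mfcc_i(n)
```

Calling `mfcc(frames, fs=8000)` on an array of five 256-sample frames returns a 5 x 12 feature matrix, one row per frame, which could then serve as the input X of the network in step (B).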

(B) Establishing a neural network model with a multi-classification function, training the model by using the data of the training set, and compiling the model by using a multi-classification cross entropy and adam optimization algorithm.

In this step, the neural network model with multi-classification functions includes an input layer, a hidden layer and an output layer; the calculation method of the hidden layer is as follows:

hidden＝f(W×X+b)

wherein hidden is the output of the hidden layer, f is the nonlinear activation function, W is the weight of the network, X is the input layer vector, namely the audio features extracted in step (A), and b is the bias of the network;

when a plurality of hidden layers exist, the calculation method of each hidden layer is as follows:

hidden_p＝f_p(W_p×hidden_{p-1}+b_p)

where p is the index of the hidden layer, W_p is the weight of the p-th hidden layer, b_p is the bias of the p-th hidden layer, f_p is the activation function of the p-th hidden layer, hidden_p is the output of the p-th hidden layer, and hidden_{p-1} is the output of the layer preceding the p-th hidden layer.

For the first hidden layer, the input is the MFCC audio feature extracted above. The weights and biases of each hidden layer may differ, the activation functions need not be the same, and the final output is obtained through softmax.
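The recursion over hidden layers can be illustrated with a short NumPy sketch. It assumes each layer is given as a tuple (W_p, b_p, f_p); the helper names `relu` and `hidden_forward` and the choice of activations are hypothetical and only serve the illustration.

```python
import numpy as np

def relu(z):
    """One possible nonlinear activation f_p (an assumption for this sketch)."""
    return np.maximum(z, 0.0)

def hidden_forward(x, layers):
    """Apply hidden_p = f_p(W_p @ hidden_{p-1} + b_p) for each layer in order.

    x      : input feature vector (for example, flattened MFCC features)
    layers : list of (W_p, b_p, f_p) tuples, one per hidden layer
    Returns the output of the last hidden layer (hidden_last).
    """
    hidden = np.asarray(x, dtype=float)
    for W, b, f in layers:
        hidden = f(W @ hidden + b)
    return hidden

# Two hidden layers with different activations, as the text allows.
layers = [
    (np.array([[1.0, -1.0], [0.0, 2.0]]), np.array([0.0, -1.0]), relu),
    (np.array([[1.0, 1.0]]), np.array([0.0]), np.tanh),
]
hidden_last = hidden_forward([1.0, 2.0], layers)
```

Keeping the activation as part of each layer tuple mirrors the statement that the activation functions of different hidden layers need not be the same.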

The softmax function, which maps the outputs of a plurality of neurons into the interval (0,1), is defined as follows:

$$\mathrm{softmax}(x_j)=\frac{e^{x_j}}{\sum_{s=1}^{S}e^{x_s}}$$

wherein S is 3, corresponding to the 3 states of the power equipment; j is any one of the 3 states, 1 ≤ j ≤ 3; x_j is the value of the speech sample x for the j-th leakage state and e^{x_j} is its exponential form; the denominator sums over all leakage states; and the formula represents the probability that the speech sample x is judged to be in state j.

Since the task of discharging sound detection is a multi-classification problem, the activation function of the output layer is the normalized exponential function, namely the softmax function, which maps the outputs of a plurality of neurons into the interval (0,1); when the output node is finally selected, the node with the maximum probability can be selected as the prediction target.

The calculation method of the output layer is as follows:

Y＝softmax(W_Y×hidden_last+b_Y)

wherein hidden_last is the output value of the last hidden layer, W_Y is the weight of the output layer, b_Y is the bias of the output layer, and softmax is the activation function of the output layer.

Since three leakage states are detected here, W_Y is a two-dimensional matrix with one dimension equal to 3, and b_Y is a vector of length 3.
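The output layer and the selection of the most probable state can be sketched as follows. This is an illustrative NumPy example under assumptions: the order of the labels in `STATES`, the function names, and the subtraction of the maximum before exponentiation (a common numerical safeguard that does not change the result) are not taken from the embodiment.

```python
import numpy as np

# Assumed label order; the text does not fix which index maps to which state.
STATES = ("normal operation", "partial discharge", "fault")

def softmax(x):
    """softmax(x_j) = exp(x_j) / sum_s exp(x_s), computed stably."""
    x = np.asarray(x, dtype=float)
    e = np.exp(x - np.max(x))  # shifting by the maximum avoids overflow
    return e / e.sum()

def output_layer(hidden_last, W_Y, b_Y):
    """Y = softmax(W_Y @ hidden_last + b_Y), with W_Y of shape (3, H)."""
    W_Y = np.asarray(W_Y, dtype=float)
    b_Y = np.asarray(b_Y, dtype=float)
    if W_Y.shape[0] != len(STATES) or b_Y.shape != (len(STATES),):
        raise ValueError("W_Y must have 3 rows and b_Y must have length 3")
    return softmax(W_Y @ np.asarray(hidden_last, dtype=float) + b_Y)

def predict_state(Y):
    """Pick the output node with the maximum probability."""
    return STATES[int(np.argmax(Y))]

Y = output_layer([1.0, 2.0],
                 [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]],
                 [0.0, 0.0, 0.0])
state = predict_state(Y)
```

The three entries of `Y` sum to 1, so they can be read directly as the probabilities of the three states.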

Cross entropy describes the distance between two probability distributions and is a loss function widely used in classification problems. In this embodiment, the multi-classification cross entropy loss function is defined as follows:

$$Loss=-\sum_{q=0}^{C-1}y_q\log(Y_q)$$

where C denotes the number of speech samples, Y_q is the classification result of the q-th speech sample by the neural network model, and y_q is the true label of that sample. Since program indices typically start from 0 and the total is C, the upper bound of the summation is C-1 when counting from 0. It is also possible to start from 1, in which case the upper bound of the summation is C.
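A minimal NumPy sketch of a multi-class cross entropy over C samples follows. It assumes one-hot true labels and an unnormalized sum over samples; the function name `cross_entropy` and the clipping constant `eps` are illustrative choices, not part of the method.

```python
import numpy as np

def cross_entropy(Y, y, eps=1e-12):
    """Sum over samples q of -y_q . log(Y_q).

    Y : predicted probabilities, shape (C, 3)
    y : one-hot true labels, shape (C, 3)
    """
    Y = np.clip(np.asarray(Y, dtype=float), eps, 1.0)  # avoid log(0)
    y = np.asarray(y, dtype=float)
    return float(-np.sum(y * np.log(Y)))

# A sample predicted with probability 0.5 for its true class
# contributes -ln(0.5) = ln(2) to the loss.
loss = cross_entropy([[0.5, 0.25, 0.25]], [[1, 0, 0]])
```

Because the labels are one-hot, only the predicted probability of the true class contributes for each sample, so the loss shrinks as that probability approaches 1.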

In order to make the model output close to the real result, the above loss function must be minimized, i.e. the cross entropy between the model output and the real result is minimized. For this purpose, the Adam optimization algorithm is introduced. Adam is a first-order optimization algorithm that can replace the traditional stochastic gradient descent process and iteratively update the neural network weights based on the training data. It combines the advantages of the adaptive gradient algorithm (AdaGrad) and the root mean square propagation algorithm (RMSProp): Adam computes adaptive per-parameter learning rates using not only the first moment mean of the gradient but also its second moment mean. As this is prior art, it is not described further here.
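For readers unfamiliar with Adam, a single update step can be sketched as below. The hyperparameter defaults (lr, beta1, beta2, eps) are the commonly used values and are assumptions here; the embodiment does not state which values were used, and `adam_step` is a hypothetical helper name.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a parameter array.

    m, v : running first and second moment estimates of the gradient
    t    : 1-based step counter, used for bias correction
    """
    m = beta1 * m + (1 - beta1) * grad           # first moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2      # second moment estimate
    m_hat = m / (1 - beta1 ** t)                 # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)                 # bias-corrected second moment
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy usage: minimize (w - 3)^2, whose gradient is 2 * (w - 3).
w, m, v = np.array([0.0]), np.zeros(1), np.zeros(1)
for t in range(1, 501):
    w, m, v = adam_step(w, 2.0 * (w - 3.0), m, v, t, lr=0.1)
```

Dividing by the square root of the second moment gives each parameter its own effective step size, which is the per-parameter adaptive learning rate mentioned above.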

In the embodiment, a deep learning method is used, and a known data set is trained to obtain a discharge sound multi-classification detection model. By applying a multi-classification cross entropy loss function and an adam optimization algorithm, independent adaptive learning rates are designed for different parameters by calculating first moment estimation and second moment estimation of the gradient, so that the model training efficiency can be improved, and the robustness of the recognition effect can be enhanced.

(C) Performing accuracy and error analysis on the trained model

The accuracy is defined as:

$$P=\frac{TP}{TP+FP}$$

where P is the accuracy; TP is the number of samples correctly classified by the model, i.e. those for which argmax(Y_i)＝argmax(y_i); and FP is the number of samples incorrectly classified by the model.
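The accuracy computation can be sketched in a few lines of NumPy, assuming the model outputs and the true labels are stored as arrays with one row per sample; the function name `accuracy` is illustrative.

```python
import numpy as np

def accuracy(Y, y):
    """P = TP / (TP + FP), comparing argmax of outputs and of one-hot labels."""
    pred = np.argmax(np.asarray(Y), axis=1)
    true = np.argmax(np.asarray(y), axis=1)
    tp = int(np.sum(pred == true))   # correctly classified samples
    fp = int(np.sum(pred != true))   # incorrectly classified samples
    return tp / (tp + fp)

Y = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]]
y = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
P = accuracy(Y, y)  # three of the four samples are classified correctly
```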

(D) The model whose accuracy meets the standard is taken as the target model and deployed at an edge node. Based on audio data fed back by the voice sensor, the target model detects the electric leakage state of the power equipment at the edge node terminal, i.e. judges the working state of the power equipment, and the detection result is returned to the operation and maintenance center through an edge computing private network, helping the operation and maintenance center monitor equipment faults in real time.
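How an edge node might package a detection result for the operation and maintenance center can be sketched as follows. Everything in this example is hypothetical: the JSON message format, the field names, the state labels and their order, and the policy of reporting only abnormal states (which follows the abstract's description of feeding back abnormal states) are assumptions, and the transport over the private network is not shown.

```python
import json
import time

# Assumed label order for the three output nodes.
STATES = ("normal", "partial_discharge", "fault")

def build_report(device_id, probs, timestamp=None):
    """Return a JSON report for an abnormal state, or None when normal.

    device_id : hypothetical identifier of the monitored equipment
    probs     : the three softmax probabilities produced by the model
    """
    idx = max(range(len(probs)), key=lambda j: probs[j])
    if STATES[idx] == "normal":
        return None  # nothing to report for normal operation
    return json.dumps({
        "device_id": device_id,
        "state": STATES[idx],
        "probability": round(float(probs[idx]), 4),
        "timestamp": time.time() if timestamp is None else timestamp,
    })

report = build_report("cabinet-1", [0.1, 0.8, 0.1], timestamp=0.0)
```

Filtering on the edge node keeps traffic to the operation and maintenance center low; a deployment that needs a full state history could instead send every result.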

(E) And aiming at the detection result returned by the edge node, the operation and maintenance center performs data statistics and analysis at the cloud end to make a corresponding operation and maintenance response, so that a major power accident caused by insulation degradation of equipment is avoided, and the operation and maintenance cost is reduced.

To verify the validity of the method of this embodiment, the data set is randomly divided into a training set, a verification set and a test set in a ratio of …:2:2, batch_size is set to 50, and the number of training rounds (epochs) is set to 30. The resulting accuracy and error curves of the model on the training set and the verification set are shown in FIG. 4 and FIG. 5, respectively. As can be seen from the figures, when the number of training rounds reaches 16, the accuracy on the training set and the verification set reaches 98.15% and the error gradually decreases; the accuracy on the test set is 96.3%.
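A random split and mini-batch iteration of the kind described can be sketched as follows. The first term of the split ratio is not legible in the text above, so the fractions 0.6/0.2/0.2 used here are purely an illustrative assumption, as are the sample count of 270, the fixed seed and the function names.

```python
import numpy as np

def split_indices(num_samples, fractions=(0.6, 0.2, 0.2), seed=0):
    """Randomly split sample indices into training, verification and test sets.

    The default fractions are an assumption for illustration only.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(num_samples)
    n_train = int(round(fractions[0] * num_samples))
    n_val = int(round(fractions[1] * num_samples))
    return (order[:n_train],
            order[n_train:n_train + n_val],
            order[n_train + n_val:])

def batches(indices, batch_size=50):
    """Yield consecutive mini-batches of indices (the last may be smaller)."""
    for start in range(0, len(indices), batch_size):
        yield indices[start:start + batch_size]

train_idx, val_idx, test_idx = split_indices(270)
batch_sizes = [len(b) for b in batches(train_idx)]
```

One training round (epoch) corresponds to one pass over all the batches of the training indices.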

The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the foregoing description only for the purpose of illustrating the principles of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the invention as defined by the appended claims, specification, and equivalents thereof.

Claims

1. A discharging sound detection method based on edge calculation and a neural network is characterized by comprising the following steps:

step (C) carrying out accuracy and error analysis on the trained model;

2. The discharging sound detection method based on edge calculation and a neural network of claim 1, wherein in the step (B), the neural network model with multi-classification function comprises an input layer, a hidden layer and an output layer;

the calculation method of the hidden layer is as follows:

hidden＝f(W×X+b)

wherein hidden is the hidden layer output, f is a nonlinear activation function, W is the weight of the network, X is the input layer vector, namely the audio features extracted in the step (A), and b is the bias of the network;

the calculation method of the output layer is as follows:

Y＝softmax(W_Y×hidden_last+b_Y)

the softmax function is defined as follows:

$$\mathrm{softmax}(x_j)=\frac{e^{x_j}}{\sum_{s=1}^{S}e^{x_s}}$$

wherein S is 3, corresponding to the 3 states of the power equipment; j is any one of the 3 states, 1 ≤ j ≤ 3; and x_j is the value of the speech sample x for state j;

the formula represents the probability that the speech sample x is judged to be in state j, and when the output node is finally selected, the node with the maximum probability is selected as the prediction target.

3. The discharging sound detection method based on edge calculation and a neural network of claim 1, wherein in step (B), the multi-class cross entropy loss function is defined as follows:

4. The discharging sound detection method based on edge calculation and a neural network of claim 1, wherein in step (C), the accuracy is defined as:

5. The method as claimed in any one of claims 1-4, wherein the audio features are one or more of a short-time average energy, a short-time average amplitude function, a short-time average zero-crossing rate, a short-time autocorrelation function, a Mel cepstrum correlation parameter, and a formant correlation parameter.

6. The discharging sound detection method based on edge calculation and a neural network of claim 5, wherein Mel cepstrum related parameters are used as the audio features, and the specific process of step (A) comprises:

transformation from time domain data to frequency domain data:

X_i(k)＝FFT[x_i(n)]

E_i(k)＝|X_i(k)|²