CN113707159B - Method for identifying bird species involved in power grid bird-related faults based on Mel spectrograms and deep learning

Info

Publication number: CN113707159B
Application number: CN202110878327.9A
Authority: CN
Inventors: 邱志斌; 卢祖文; 廖才波; 王海祥
Original assignee: Nanchang University
Current assignee: Nanchang University
Priority date: 2021-08-02
Filing date: 2021-08-02
Publication date: 2024-05-03
Anticipated expiration: 2041-08-02
Also published as: CN113707159A

Abstract

The invention discloses a method for identifying the bird species involved in power grid bird-related faults based on Mel spectrograms and deep learning. First, a database of song samples from bird species associated with power grid bird-related faults is established. After the bird song signals are preprocessed, the energy of each frame in each Mel filter is calculated to obtain an M×N matrix of signal energies, and each energy value is mapped to a color depth to produce the Mel spectrogram of the bird song signal. A convolutional neural network is then trained on the Mel spectrograms: it learns the spectrogram features of the bird song signals through successive convolution and pooling operations, its internal parameters are adjusted over repeated training iterations, and training ends when the loss between the predicted output and the actual value of the network reaches its minimum. The trained network is finally used to predict and identify the species of the test samples. The method effectively distinguishes the song characteristics of different bird species and identifies the species, and can serve as a reference for differentiated prevention and control of power grid bird-related faults.

Description

Method for identifying bird species involved in power grid bird-related faults based on Mel spectrograms and deep learning

Technical Field

The invention relates to the field of power transmission lines, and in particular to a method for identifying the bird species involved in power grid bird-related faults based on Mel spectrograms and deep learning.

Background

Bird activity is a major cause of overhead transmission line faults. Although various bird-prevention devices are widely used, they are installed rather indiscriminately and have not effectively curbed the rising trend of bird-related faults, and line trips caused by failed bird-prevention devices still occur from time to time. In addition, because bird-related faults are transient, operation and maintenance personnel often find it difficult to determine which bird species caused a fault after it occurs. Without an intelligent method for identifying bird species and determining the cause of a fault, targeted prevention and control measures are hard to adopt. It is therefore necessary to study intelligent identification of the bird species involved in overhead transmission line faults, so as to give line operation and maintenance personnel a basis for identifying birds correctly.

At present, traditional bird song identification methods extract features such as linear prediction cepstral coefficients (LPCC), Mel-frequency cepstral coefficients (MFCC) and power spectral density from the sound signal, and combine them with classifiers such as random forests (RF), support vector machines (SVM), hidden Markov models (HMM) and Gaussian mixture models (GMM) for classification and prediction. Feature extraction in these traditional methods is difficult, and their identification accuracy is not high.

Disclosure of Invention

In view of the problems in the prior art, the invention aims to provide a method for identifying the bird species involved in power grid bird-related faults based on Mel spectrograms and deep learning. The method identifies the species to which a bird belongs from its song signal and provides a basis for targeted, differentiated bird prevention on power transmission lines.

To achieve this purpose, the invention adopts the following technical scheme, which comprises the following steps:

S1: establishing a song database of the relevant bird species, based on statistics of the main bird species involved in power grid bird-related faults and on the actual condition of the power grid;

S2: preprocessing the samples in the sound database by denoising, framing and windowing, wherein noise in the bird song signal is removed by improved spectral subtraction based on multi-window (multitaper) spectrum estimation, the bird song signal is divided into frames using a set frame length and frame shift, and each frame is multiplied by a window function to improve continuity at both ends of the frame;

S3: calculating the energy of each frame of the bird song signal in each Mel filter to obtain the Mel energies of a song sample as an M×N matrix of signal energies, mapping each energy value to a color depth to obtain the Mel spectrogram of the bird song signal, and dividing the Mel spectrograms into a training set, a validation set and a test set;

S4: constructing a convolutional neural network classification model, training it over repeated iterations with the Mel spectrograms of the training set as input, evaluating the validation set during training to adjust the parameters of the model, and ending training when the loss between the predicted output and the actual value of the network reaches its minimum;

S5: predicting and identifying the species of the test samples using the trained network, and outputting the corresponding bird species.

Further, the Mel spectrogram in S3 is calculated as follows: for a bird song signal of M frames, N Mel filters are set, an M×N matrix is obtained by calculating the Mel energies, and the matrix is colored according to the energy values to obtain the Mel spectrogram. The abscissa and ordinate of the Mel spectrogram are the frame number and the filter number, respectively. Only M×N values need to be calculated, which simplifies the output and reduces the calculation time.
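As an illustration only, the coloring step could be sketched as follows. The text states that energy is mapped to color depth but does not specify the mapping, so the decibel (log) scaling, the min-max normalization, the 0 to 255 gray-level range and the function name `energies_to_gray` are all assumptions introduced here.

```python
import math

def energies_to_gray(matrix, floor=1e-10):
    """Map an M x N energy matrix to gray levels 0..255 (larger energy -> larger level).

    Assumed mapping: energies are converted to decibels, then min-max normalized.
    """
    # Clamp to a small floor so that zero energy does not break the logarithm.
    log_e = [[10.0 * math.log10(max(e, floor)) for e in row] for row in matrix]
    lo = min(min(row) for row in log_e)
    hi = max(max(row) for row in log_e)
    span = hi - lo if hi > lo else 1.0  # avoid division by zero for a flat matrix
    return [[int(round(255 * (v - lo) / span)) for v in row] for row in log_e]

# Toy 2 x 2 energy matrix spanning 30 dB
energy = [[1.0, 10.0], [100.0, 1000.0]]
gray = energies_to_gray(energy)
```

Because the mapping is monotonic, frames and filters with higher energy always receive a deeper level; a color map could be substituted for the gray scale without changing the underlying matrix.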

Further, the convolutional neural network in S4 comprises a plurality of convolution-pooling stages for capturing Mel spectrogram features. The network is trained on the training set while the network parameters and the number of training iterations are adjusted. Each time the model has been trained for a certain number of rounds, it makes one prediction pass over the validation set, and the parameters are adjusted according to the validation results so that the model is corrected toward higher prediction accuracy. Training ends when the loss function value of the network has been reduced to its minimum.
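The text does not specify how the validation results are used to decide when the loss is at its minimum. One common way to realize this kind of loop is early stopping on the validation loss, sketched below under that assumption. The callables `train_one_epoch` and `validation_loss`, and the parameters `validate_every` and `patience`, are hypothetical names introduced for illustration.

```python
def train_with_validation(train_one_epoch, validation_loss, max_epochs,
                          validate_every=1, patience=3):
    """Train, validate periodically, and stop once validation loss stops improving.

    Returns (best_epoch, best_loss). Early stopping is an assumed criterion,
    not one stated in the text.
    """
    best_loss = float("inf")
    best_epoch = -1
    stale = 0  # number of consecutive validations without improvement
    for epoch in range(max_epochs):
        train_one_epoch(epoch)
        if (epoch + 1) % validate_every != 0:
            continue
        loss = validation_loss()
        if loss < best_loss:
            best_loss, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_epoch, best_loss

# Stand-in training and validation functions with a scripted loss curve
epochs_run = []
scripted = iter([1.0, 0.5, 0.4, 0.45, 0.46, 0.47, 0.48])
best_epoch, best_loss = train_with_validation(
    train_one_epoch=epochs_run.append,
    validation_loss=lambda: next(scripted),
    max_epochs=10,
)
```

In a real pipeline the model weights from `best_epoch` would be saved and restored before testing.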

The beneficial effects of the invention are as follows:

The method for identifying the bird species involved in power grid bird-related faults based on Mel spectrograms and deep learning overcomes the redundancy, large data volume and insufficient discriminative power of traditional speech feature extraction techniques and enables more accurate bird identification. It thereby provides guidance for differentiated bird prevention and improves the accuracy and effectiveness of bird-related fault prevention for power transmission lines and substations.

Drawings

FIG. 1 is a flowchart of the method for identifying bird species involved in power grid bird-related faults based on Mel spectrograms and deep learning;

FIG. 2 is a graph showing the denoising effect of a bird song signal according to an embodiment of the present invention;

FIG. 3 shows the waveforms of some bird song signals and their Mel spectrograms according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a convolutional neural network in an embodiment of the present invention.

Detailed Description

The invention will now be further described with reference to the following examples, which are given solely for the purpose of illustration and are not to be construed as limitations on the scope of the invention, as will be apparent to those skilled in the art upon examination of the foregoing disclosure.

With the rapid development of deep learning, emerging speech recognition methods tend to convert sound signals into spectrograms, such as chirp spectrograms and Fourier spectrograms, as feature inputs to a deep learning model. The invention converts the bird song signal into a Mel spectrogram and then uses a convolutional neural network for classification, in order to predict and classify the bird species involved in transmission line bird-related faults.

The processing of the sound signals of typical bird species involved in transmission line bird-related faults, the calculation of the Mel spectrogram and the training of the convolutional neural network are described in detail below. As shown in FIG. 1, the method comprises the following steps:

S1: A song database of the relevant bird species is established, based on statistics of the main bird species involved in power grid bird-related faults and on the actual condition of the power grid.

In this embodiment, based on the main bird species involved in transmission line bird-related faults, as recorded by operation and maintenance staff in a certain province, and on the actual condition of the power grid, 40 typical bird species responsible for four fault types (bird nest, bird droppings, bird-body short circuit and bird pecking) are selected as research objects, including black Dong, fenugreek, magpie, cuckoo, night heron, large mouth crow, swan, cuckoo, home swallow, small mouth crow, cliff swallow, davit, spot fish dog, common swallow, pine crow, pond heron, ash green peck bird, ash starling, ash goose, grey crane, circular neck vernonia, white head Bei, white part, balt, balk crow, red mouth gull, red tail, red horn crow, red falcon, green web crow, perk, green duck, green bird, device, goshawk, red crow, silver gull, carving, bone, white bird and white bird. Their song data are obtained from a sound library.

S2: The samples in the sound database are preprocessed by denoising, framing and windowing. Noise in the bird song signal is removed by improved spectral subtraction based on multi-window (multitaper) spectrum estimation, the bird song signal is divided into frames using a set frame length and frame shift, and each frame is multiplied by a window function to improve continuity at both ends of the frame.

In this embodiment, all bird song audio signals undergo preprocessing operations including format unification, denoising, framing and windowing. Using GoldWave and SoX, the sampling frequency of all audio is set to 16000 Hz, the channel layout is set to mono, the audio is uniformly cut to a length of 1 second, and the audio is stored in WAV format. The audio is then framed with a frame length of 0.025 seconds and a frame shift of 0.01 seconds, which divides each audio sample into 98 frames. A Hamming window is applied to each frame to improve continuity at both ends of the frame. The audio is denoised by improved spectral subtraction based on multi-window spectrum estimation; the denoising effect is shown in FIG. 2, where (a) is the noisy cuckoo song and (b) is the denoised cuckoo song.
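The framing and windowing step can be sketched as follows, using only the parameters given in the embodiment (16000 Hz, 1 second, 0.025 s frame length, 0.01 s frame shift, Hamming window). With 400-sample frames and a 160-sample shift, a 16000-sample clip yields 1 + floor((16000 - 400) / 160) = 98 frames. The function name `frame_signal` and the synthetic test tone are assumptions introduced for illustration, and incomplete trailing frames are assumed to be dropped.

```python
import math

def frame_signal(samples, sample_rate=16000, frame_len_s=0.025, frame_shift_s=0.010):
    """Split a 1-D signal into overlapping frames and apply a Hamming window."""
    frame_len = int(round(frame_len_s * sample_rate))      # 400 samples at 16 kHz
    frame_shift = int(round(frame_shift_s * sample_rate))  # 160 samples at 16 kHz
    if len(samples) < frame_len:
        return []
    # Assumption: a trailing partial frame is discarded rather than zero-padded.
    n_frames = 1 + (len(samples) - frame_len) // frame_shift
    # Hamming window: w[n] = 0.54 - 0.46 * cos(2*pi*n / (L - 1))
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for i in range(n_frames):
        start = i * frame_shift
        frames.append([samples[start + n] * window[n] for n in range(frame_len)])
    return frames

# One second of a 1 kHz tone standing in for a bird song clip
signal = [math.sin(2 * math.pi * 1000 * t / 16000) for t in range(16000)]
frames = frame_signal(signal)
```

The window tapers each frame toward its edges, which reduces the spectral leakage that abrupt frame boundaries would otherwise introduce.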

S3: The energy of each frame of the bird song signal in each Mel filter is calculated to obtain the Mel energies of a song sample as an M×N matrix of signal energies. Each energy value is mapped to a color depth to obtain the Mel spectrogram of the bird song signal, and the Mel spectrograms are divided into a training set, a validation set and a test set.

The Mel spectrogram is an image representation of the bird song signal, and the Mel spectrograms of different bird species differ. In this embodiment, the bird song signal is divided into 98 frames and 40 Mel filters are set, so the Mel energy calculation yields a 98×40 data matrix, which is colored according to the energy values to obtain the Mel spectrogram of the bird song signal. The abscissa and ordinate of the Mel spectrogram are the frame number and the filter number, respectively. Only 98×40 values need to be calculated, which simplifies the output and reduces the calculation time. FIG. 3 shows the waveforms of some bird song signals and their corresponding Mel spectrograms, where (a), (b) and (c) are the waveforms of the cuckoo, red-horn owl and red-mouth gull, and (d), (e) and (f) are the corresponding Mel spectrograms. Describing a bird song signal by frame number and Mel filter number in this way makes it possible to distinguish the sounds of different bird species.
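A minimal sketch of the Mel energy calculation is given below. The embodiment fixes only the matrix size (98 frames by 40 filters); the Mel-scale formula mel = 2595·log10(1 + f/700), the triangular filter shape, the FFT size and all function names are common conventions assumed here, not details taken from the text. A naive DFT is used so that the example needs only the standard library; a practical implementation would use an FFT.

```python
import cmath
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)  # assumed Mel-scale convention

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters equally spaced on the Mel scale: n_filters x (n_fft//2 + 1)."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, converted to FFT bin indices
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges]
    bank = [[0.0] * n_bins for _ in range(n_filters)]
    for j in range(n_filters):
        left, centre, right = bins[j], bins[j + 1], bins[j + 2]
        for b in range(left, centre):       # rising edge of the triangle
            bank[j][b] = (b - left) / (centre - left)
        for b in range(centre, right):      # falling edge of the triangle
            bank[j][b] = (right - b) / (right - centre)
    return bank

def power_spectrum(frame, n_fft):
    """Squared DFT magnitude for bins 0..n_fft/2 (naive O(n^2) DFT)."""
    padded = list(frame[:n_fft]) + [0.0] * max(0, n_fft - len(frame))
    spec = []
    for k in range(n_fft // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / n_fft)
                  for n, x in enumerate(padded))
        spec.append(abs(acc) ** 2)
    return spec

def mel_energy_matrix(frames, n_filters, n_fft, sample_rate):
    """Return an M x N matrix: energy of each of M frames in each of N Mel filters."""
    bank = mel_filterbank(n_filters, n_fft, sample_rate)
    matrix = []
    for frame in frames:
        spec = power_spectrum(frame, n_fft)
        matrix.append([sum(w * p for w, p in zip(filt, spec)) for filt in bank])
    return matrix

# Small demo: 3 frames of a 2 kHz tone, 8 filters (kept small because the DFT is naive)
demo_frames = [[math.sin(2 * math.pi * 2000 * n / 16000) for n in range(64)]
               for _ in range(3)]
mel = mel_energy_matrix(demo_frames, n_filters=8, n_fft=64, sample_rate=16000)
```

For the embodiment's settings, the same function would be called with the 98 windowed frames and `n_filters=40`; an FFT size of 512 (the next power of two above the 400-sample frame) would be a typical but assumed choice.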

In this embodiment, the resulting Mel spectrograms are divided into a training set, a validation set and a test set in a ratio of 8:1:1.
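A random 8:1:1 split can be sketched as below. The text says only that samples are selected randomly in this ratio; the fixed seed, the function name `split_dataset` and the choice to split the whole pool at once (rather than per species) are assumptions made for illustration.

```python
import random

def split_dataset(items, ratios=(8, 1, 1), seed=0):
    """Shuffle items and split them into training, validation and test subsets."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded for reproducibility (assumption)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]      # remainder goes to the test set
    return train, val, test

# 1000 hypothetical spectrogram file names
names = ["spec_%04d.png" % i for i in range(1000)]
train, val, test = split_dataset(names)
```

Changing `seed` produces a different random partition, which is one way to repeat the experiment on several training sample sets.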

S4: A convolutional neural network classification model is constructed and trained over repeated iterations with the Mel spectrograms of the training set as input. The validation set is evaluated during training to adjust the parameters of the model, and training ends when the loss between the predicted output and the actual value of the network reaches its minimum.

In this embodiment, a 24-layer convolutional neural network model is built, as shown in FIG. 4, and trained with the training set as input. The convolutional neural network comprises a plurality of convolution-pooling stages for capturing Mel spectrogram features. The initial learning rate is set to 0.01 and is reduced to 1/10 of its original value after 30 rounds of training. Each time the model has been trained for a certain number of rounds, it makes one prediction pass over the validation set, and the parameters of the network are adjusted according to the validation results so that the model is corrected toward higher prediction accuracy. Training a convolutional neural network is essentially a process of minimizing a loss function: by iteratively optimizing toward the minimum loss between the predicted output and the actual value of the network, the network learns the best match between image features and categories. The loss function used by the convolutional neural network in this embodiment is the cross-entropy function:

$$L = -\frac{1}{m}\sum_{i=1}^{m}\sum_{j=1}^{k} 1\{y_i = j\}\,\log p_{ij}$$

where $m$ is the total number of samples, $k$ is the number of sample categories, $1\{y_i = j\}$ is the indicator function, which equals 1 when the expression in braces is true and 0 otherwise, and $p_{ij}$ is the probability that the $i$-th sample is predicted as the $j$-th class. Training ends when the loss function value of the network reaches its minimum.
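The cross-entropy loss and the learning-rate schedule described in this embodiment can be illustrated numerically as follows. Because the indicator function selects only the true class of each sample, the loss reduces to the mean negative log-probability assigned to the true class. The function names are hypothetical, class labels are 0-based here for convenience, and the schedule is read as a single reduction at round 30, which is an interpretation of the text rather than a stated detail.

```python
import math

def cross_entropy(probs, labels):
    """Mean cross-entropy: -(1/m) * sum over samples of log p[i][y_i].

    probs[i][j] is the predicted probability of class j for sample i;
    labels[i] is the true class index of sample i (0-based).
    """
    m = len(labels)
    return -sum(math.log(probs[i][labels[i]]) for i in range(m)) / m

def learning_rate(epoch, initial=0.01, drop_epoch=30, factor=0.1):
    """Initial rate for the first 30 rounds, then 1/10 of it (assumed single drop)."""
    return initial if epoch < drop_epoch else initial * factor

# Two samples, four classes, uniform predictions: loss equals log(4)
uniform_loss = cross_entropy([[0.25, 0.25, 0.25, 0.25]] * 2, [0, 3])
# Confident, correct predictions give a loss close to zero
confident_loss = cross_entropy([[0.99, 0.01], [0.02, 0.98]], [0, 1])
```

A uniform prediction over the 40 species of this embodiment would give a loss of log(40), about 3.69, which is a useful reference point when reading a training curve.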

In this embodiment, because the training samples and test samples are selected randomly, classification tests are carried out on 3 different training sample sets to rule out chance results, and the average prediction accuracy is 96.7%. Using the Mel spectrogram of the sound signal as the feature, removing noise by improved spectral subtraction based on multi-window spectrum estimation, and building the deep learning model with a convolutional neural network therefore allows the bird species that threaten the safe operation of transmission lines to be identified accurately, providing guidance for differentiated bird prevention.

It should be understood that parts of the specification not specifically set forth herein are all prior art.

While particular embodiments of the present invention have been described above with reference to the accompanying drawings, it will be understood by those skilled in the art that these are by way of example only, and that various changes and modifications may be made to these embodiments without departing from the principles and spirit of the invention. The scope of the invention is limited only by the appended claims.

Claims

1. A method for identifying bird species involved in power grid bird-related faults based on Mel spectrograms and deep learning, characterized in that the method comprises the following steps:

S5: predicting and identifying the species of the test samples using the trained network, and outputting the corresponding bird species;

The Mel spectrogram in S3 is calculated as follows: for a bird song signal of M frames, N Mel filters are set, an M×N matrix is obtained by calculating the Mel energies, and the matrix is colored according to the energy values to obtain the Mel spectrogram, wherein the abscissa and ordinate of the Mel spectrogram are the frame number and the filter number, respectively, and only M×N values need to be calculated, which simplifies the output and reduces the calculation time;

The convolutional neural network in S4 comprises a plurality of convolution-pooling stages for capturing Mel spectrogram features; the network is trained on the training set while the network parameters and the number of training iterations are adjusted; each time the model has been trained for a certain number of rounds, it makes one prediction pass over the validation set, and the parameters are adjusted according to the validation results so that the model is corrected toward higher prediction accuracy, until the loss function value of the network has been reduced to its minimum and the network training ends.