CN113707159B - Power grid bird-involved fault bird species identification method based on Mel language graph and deep learning - Google Patents
- Publication number
- CN113707159B (application CN202110878327.9A)
- Authority
- CN
- China
- Prior art keywords
- bird
- mel
- training
- network
- language graph
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/20—Pattern transformations or operations aimed at increasing system robustness, e.g. against channel noise or different working conditions
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/21—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/45—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of analysis window
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/02—Preprocessing
- G06F2218/04—Denoising
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/08—Feature extraction
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
The invention discloses a method for identifying the bird species involved in power grid bird-related faults, based on Mel spectrograms and deep learning. First, a database of sound samples of the bird species associated with grid bird-related faults is established. After the bird song signals are preprocessed, the energy of each frame in each Mel filter is calculated to obtain an M×N matrix of signal energies, and each energy value is mapped to a color intensity to obtain the Mel spectrogram of the bird song signal. A convolutional neural network is then trained on the Mel spectrograms: the convolution-pooling stages learn the Mel spectrogram features of the bird song signals, the network's internal parameters are adjusted through repeated iterative training, and training ends when the loss between the network's predicted output and the actual value reaches its minimum. The trained network is finally used to predict and identify the bird species of test samples. The method effectively distinguishes the song characteristics of different bird species and can serve as a reference for differentiated prevention and control of grid bird-related faults.
Description
Technical Field
The invention relates to the field of power transmission lines, and in particular to a method for identifying the bird species involved in power grid bird-related faults based on Mel spectrograms and deep learning.
Background
Bird activity is a major cause of faults on overhead transmission lines. Although various bird prevention devices are widely used, they are installed with little targeting, have not curbed the rising trend of bird-related faults, and line trips caused by failed bird prevention devices still occur from time to time. In addition, because bird-related faults are transient, operation and maintenance personnel often find it difficult to determine which bird species caused a fault after it occurs. Without an intelligent method for identifying bird species and determining fault causes, targeted prevention and control measures are difficult to adopt. Research on intelligent identification of the bird species involved in overhead transmission line faults is therefore needed to give line operation and maintenance personnel a basis for identifying birds correctly.
At present, traditional bird song identification methods extract features such as linear prediction cepstral coefficients (LPCC), Mel-frequency cepstral coefficients (MFCC), and power spectral density from the sound signal, and classify them with models such as random forests (RF), support vector machines (SVM), hidden Markov models (HMM), and Gaussian mixture models (GMM). Feature extraction in these traditional methods is difficult, and their identification accuracy is not high.
Disclosure of Invention
To address the problems in the prior art, the invention provides a method for identifying the bird species involved in power grid bird-related faults based on Mel spectrograms and deep learning. The method identifies the species to which a bird belongs from its song signal and provides a basis for targeted, differentiated bird prevention on power transmission lines.
To achieve this purpose, the invention adopts the following technical scheme, which comprises the following steps:
S1: Establish a song database of the relevant bird species according to the recorded main bird species involved in grid bird-related faults and the actual condition of the power grid;
S2: Preprocess the samples in the sound database by denoising, framing, and windowing: remove noise from the bird song signal using improved spectral subtraction based on multi-window spectrum estimation, set the frame length and frame shift, divide the bird song signal into frames, and multiply each frame by a window function to improve continuity at both ends of the frame;
S3: Calculate the energy of each frame of the bird song signal in each Mel filter to obtain the Mel energy of the song sample as an M×N matrix of signal energies, map each energy value to a color intensity to obtain the Mel spectrogram of the bird song signal, and divide the Mel spectrograms into a training set, a validation set, and a test set;
S4: Construct a convolutional neural network classification model, train it iteratively with the training-set Mel spectrograms as input, evaluate the validation set during training to adjust the model's parameters, and end training when the loss between the network's predicted output and the actual value reaches its minimum;
S5: Use the trained network to predict and identify the bird species of the test samples, and output the corresponding species.
Further, the Mel spectrogram in S3 is calculated as follows: for a bird song signal of M frames, N Mel filters are set, and the Mel energy calculation yields an M×N matrix, which is colored according to energy to obtain the Mel spectrogram. The two axes of the Mel spectrogram are the frame index and the filter index, so only M×N values need to be calculated, which simplifies the output and reduces computation time.
Further, the convolutional neural network in S4 includes several convolution-pooling stages that capture Mel spectrogram features. The training set is trained by adjusting the network parameters and the number of iterations. After every fixed number of training rounds, the model makes one prediction pass over the validation set, and the parameters are adjusted according to the validation results so that the model is corrected toward higher prediction accuracy, until the loss function value of the network falls to its minimum and training ends.
The beneficial effects of the invention are as follows:
The method overcomes the redundancy, large data volume, and insufficient discrimination of traditional sound feature extraction techniques and enables more accurate bird species identification. It thereby provides guidance for differentiated bird prevention and improves the accuracy and effectiveness of bird-related fault prevention for power transmission lines and substations.
Drawings
FIG. 1 is a flowchart of the method for identifying the bird species involved in power grid bird-related faults based on Mel spectrograms and deep learning;
FIG. 2 is a graph showing the denoising effect on a bird song signal according to an embodiment of the present invention;
FIG. 3 shows the waveforms of several bird song signals and their Mel spectrograms according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a convolutional neural network in an embodiment of the present invention.
Detailed Description
The invention will now be further described with reference to the following examples, which are given solely for the purpose of illustration and are not to be construed as limitations on the scope of the invention, as will be apparent to those skilled in the art upon examination of the foregoing disclosure.
With the rapid development of deep learning, emerging speech recognition methods tend to convert sound signals into spectrograms, such as chirp spectrograms and Fourier spectrograms, and use them as feature inputs to a deep learning model. The invention converts the bird song signal into a Mel spectrogram and then classifies it with a convolutional neural network to predict the bird species involved in bird-related faults of power transmission lines.
The sound signal processing, Mel spectrogram calculation, and convolutional neural network training for typical bird species involved in transmission line bird-related faults are described in detail below. As shown in FIG. 1, the method comprises the following steps:
S1: Establish a song database of the relevant bird species according to the recorded main bird species involved in grid bird-related faults and the actual condition of the power grid.
In this embodiment, based on the main bird species involved in transmission line bird-related faults, as recorded by operation and maintenance staff in one province, and on the actual condition of the power grid, 40 typical bird species responsible for four fault types (bird nest, bird droppings, bird-body short circuit, and bird pecking) are selected as research objects, including black Dong, fenugreek, magpie, cuckoo, night heron, large mouth crow, swan, cuckoo, home swallow, small mouth crow, cliff swallow, davit, spot fish dog, common swallow, pine crow, pond heron, ash green peck bird, ash starling, ash goose, grey crane, circular neck vernonia, white head Bei, white part, balt, balk crow, red mouth gull, red tail, red horn crow, red falcon, green web crow, perk, green duck, green bird, device, goshawk, red crow, silver gull, carving, bone, white bird and white bird. Their song recordings were obtained from a sound library.
S2: Preprocess the samples in the sound database by denoising, framing, and windowing: remove noise from the bird song signal using improved spectral subtraction based on multi-window spectrum estimation, set the frame length and frame shift to divide the bird song signal into frames, and multiply each frame by a window function to improve continuity at both ends of the frame.
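The text does not spell out the improved multi-window variant, so the following is only a minimal sketch of plain power spectral subtraction, the baseline that such variants refine. It assumes the first few frames of a recording contain noise only; the function name `spectral_subtract` and the parameters `noise_frames` and `floor` are hypothetical choices made for illustration.

```python
import numpy as np

def spectral_subtract(signal, frame_len=400, hop=160, noise_frames=5, floor=0.01):
    """Plain power spectral subtraction (NOT the multi-window variant in the text)."""
    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the analysis window.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * window
    spec = np.fft.rfft(frames, axis=1)
    power = np.abs(spec) ** 2
    # Assumption: the first `noise_frames` frames contain background noise only.
    noise_power = power[:noise_frames].mean(axis=0)
    # Subtract the noise estimate, keeping a small spectral floor per bin.
    clean_power = np.maximum(power - noise_power, floor * power)
    clean_spec = np.sqrt(clean_power) * np.exp(1j * np.angle(spec))
    clean_frames = np.fft.irfft(clean_spec, n=frame_len, axis=1)
    # Weighted overlap-add to rebuild a waveform of the original length.
    out = np.zeros(len(signal))
    norm = np.zeros(len(signal))
    for i in range(n_frames):
        start = i * hop
        out[start:start + frame_len] += clean_frames[i] * window
        norm[start:start + frame_len] += window ** 2
    return out / np.maximum(norm, 1e-8)
```

A multi-window (multitaper) estimate would replace the single Hamming-windowed periodogram with an average over several orthogonal tapers, which lowers the variance of the noise estimate.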
In this embodiment, all bird song audio signals are preprocessed by format unification, denoising, framing, and windowing. Using GoldWave and SoX, the sampling frequency of every recording is set to 16000 Hz, the channel is set to mono, the audio is cut to a uniform length of 1 second, and the result is stored in WAV format. The audio is then framed with a frame length of 0.025 seconds and a frame shift of 0.01 seconds, dividing each audio sample into 98 frames, and a Hamming window is applied to improve continuity at both ends of each frame. The recordings are denoised by improved spectral subtraction based on multi-window spectrum estimation; the denoising effect is shown in FIG. 2, where (a) is the noisy cuckoo song and (b) is the denoised cuckoo song.
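The frame count follows from the stated settings: at 16000 Hz a 0.025 s frame is 400 samples and a 0.01 s shift is 160 samples, so a 1 s clip yields 1 + (16000 − 400) // 160 = 98 full frames. A minimal NumPy sketch of the framing and Hamming windowing step is shown below; the function name `frame_signal` is a hypothetical helper, not something named in the source.

```python
import numpy as np

def frame_signal(signal, sample_rate=16000, frame_len_s=0.025, frame_shift_s=0.01):
    """Split a 1-D signal into overlapping Hamming-windowed frames."""
    frame_len = int(round(frame_len_s * sample_rate))   # 400 samples at 16 kHz
    hop = int(round(frame_shift_s * sample_rate))       # 160 samples at 16 kHz
    # Only full frames are kept; trailing samples that do not fill a frame are dropped.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(frame_len)

# A 1-second clip at 16 kHz gives the 98 frames stated in the embodiment.
frames = frame_signal(np.zeros(16000))
print(frames.shape)  # (98, 400)
```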
S3: Calculate the energy of each frame of the bird song signal in each Mel filter to obtain the Mel energy of the song sample as an M×N matrix of signal energies, map each energy value to a color intensity to obtain the Mel spectrogram of the bird song signal, and divide the Mel spectrograms into a training set, a validation set, and a test set.
The Mel spectrogram is an image representation of the bird song signal, and the Mel spectrograms of different bird species differ. In this embodiment, the bird song signal is divided into 98 frames and 40 Mel filters are set, so the Mel energy calculation yields a 98×40 data matrix; coloring it according to energy gives the Mel spectrogram of the bird song signal. The two axes of the Mel spectrogram are the frame index and the filter index, so only 98×40 values need to be calculated, which simplifies the output and reduces computation time. FIG. 3 shows the waveforms of several bird song signals and their corresponding Mel spectrograms: (a), (b), and (c) are the waveforms of the cuckoo, red-horn owl, and red-mouth gull, and (d), (e), and (f) are their respective Mel spectrograms. Describing a segment of bird song by frame index and Mel filter index makes the sounds of different bird species distinguishable.
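As an illustration of how a 98×40 Mel energy matrix can be computed, here is a minimal NumPy sketch using triangular Mel filters. The FFT size of 512, the common 2595·log10(1 + f/700) Mel formula, the log scaling, and the grayscale mapping are all assumptions made for this sketch; the source does not state these details, and all function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters=40, n_fft=512, sample_rate=16000):
    """Triangular filters spaced evenly on the Mel scale from 0 Hz to Nyquist."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for j in range(1, n_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        for k in range(left, center):        # rising edge of the triangle
            fbank[j - 1, k] = (k - left) / (center - left)
        for k in range(center, right):       # falling edge, peak of 1 at `center`
            fbank[j - 1, k] = (right - k) / (right - center)
    return fbank

def mel_energy_matrix(frames, n_filters=40, n_fft=512, sample_rate=16000):
    """Return an (n_frames, n_filters) matrix of log Mel energies."""
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return 10.0 * np.log10(energies + 1e-10)   # log scale is an assumption

def to_gray_image(matrix):
    """Map each energy value to a 0-255 intensity (a stand-in for the coloring step)."""
    scaled = (matrix - matrix.min()) / (matrix.max() - matrix.min() + 1e-12)
    return np.uint8(np.round(255 * scaled))
```

With 98 windowed frames of 400 samples as input, `mel_energy_matrix` returns a 98×40 array, one row per frame and one column per Mel filter, which can then be rendered with any colormap.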
In this embodiment, the resulting Mel spectrograms are divided into a training set, a validation set, and a test set in a ratio of 8:1:1.
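A random 8:1:1 split can be sketched as below. Whether the original split was stratified by species is not stated, so this sketch simply shuffles sample indices; the function name `split_indices` and the fixed seed are hypothetical.

```python
import numpy as np

def split_indices(n_samples, ratios=(8, 1, 1), seed=0):
    """Shuffle sample indices and split them into train/validation/test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    total = sum(ratios)
    n_train = n_samples * ratios[0] // total
    n_val = n_samples * ratios[1] // total
    # Whatever remains after the train and validation parts goes to the test set.
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 800 100 100
```

Changing the seed produces a different random partition, which is one way to obtain the several training sample sets used when averaging accuracy over repeated runs.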
S4: Construct a convolutional neural network classification model, train it iteratively with the training-set Mel spectrograms as input, evaluate the validation set during training to adjust the model's parameters, and end training when the loss between the network's predicted output and the actual value reaches its minimum.
In this embodiment, a 24-layer convolutional neural network model is built, as shown in FIG. 4, and trained with the training set as input. The network contains several convolution-pooling stages that capture Mel spectrogram features. The initial learning rate is set to 0.01 and is reduced to 1/10 of its original value after 30 training rounds. After every fixed number of training rounds, the model makes one prediction pass over the validation set, and the network parameters are adjusted according to the validation results so that the model is corrected toward higher prediction accuracy. Training a convolutional neural network is essentially a process of minimizing a loss function: by iteratively seeking the minimum loss between the network's predicted output and the actual value, the network learns the category that best matches the image features. The loss function used by the convolutional neural network in this embodiment is the cross-entropy function:
$$L = -\frac{1}{m}\sum_{i=1}^{m}\sum_{j=1}^{k} 1\{y_i = j\}\,\log p_{ij}$$
where $m$ is the total number of samples, $k$ is the number of sample categories, $1\{y_i = j\}$ is the indicator function, which outputs 1 when the expression in braces is true and 0 otherwise, and $p_{ij}$ is the probability that the $i$-th sample is predicted as the $j$-th class. When the loss function value of the network reaches its minimum, training ends.
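The cross-entropy loss and the learning-rate drop described above can be sketched in NumPy as follows. The sketch assumes the loss is averaged over the m samples and that the learning rate drops once, after round 30; the source is not explicit on either point. The function names are hypothetical, and class labels are 0-based here.

```python
import numpy as np

def cross_entropy(probs, labels):
    """Mean cross-entropy for predicted class probabilities.

    probs:  (m, k) array, each row a probability distribution over k classes
    labels: (m,) integer class indices in [0, k)
    """
    m = probs.shape[0]
    # The indicator 1{y_i = j} selects the probability assigned to the true class.
    picked = probs[np.arange(m), labels]
    return -np.mean(np.log(picked + 1e-12))

def learning_rate(epoch, base_lr=0.01, drop_epoch=30, factor=0.1):
    """Assumed schedule: base_lr until `drop_epoch`, then one tenth of it."""
    return base_lr * factor if epoch >= drop_epoch else base_lr

# With 40 classes and uniform predictions, the loss equals log(40).
uniform = np.full((5, 40), 1.0 / 40)
loss = cross_entropy(uniform, np.zeros(5, dtype=int))
```

A loss near log(40) therefore indicates a network that is no better than guessing among the 40 species, which gives a useful reference point when reading training curves.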
S5: Use the trained network to predict and identify the bird species of the test samples, and output the corresponding species.
In this embodiment, because the training and test samples are randomly selected, classification tests are run on 3 different training sample sets to avoid chance results; the average prediction accuracy is 96.7%. Using the Mel spectrogram of the sound signal as the feature, removing noise with improved spectral subtraction based on multi-window spectrum estimation, and building the deep learning model with a convolutional neural network thus makes it possible to accurately identify the bird species that threaten the safe operation of transmission lines, providing guidance for differentiated bird prevention.
It should be understood that parts of the specification not specifically set forth herein are all prior art.
While particular embodiments of the present invention have been described above with reference to the accompanying drawings, it will be understood by those skilled in the art that these are by way of example only, and that various changes and modifications may be made to these embodiments without departing from the principles and spirit of the invention. The scope of the invention is limited only by the appended claims.
Claims (1)
1. A method for identifying the bird species involved in power grid bird-related faults based on Mel spectrograms and deep learning, characterized in that the method comprises the following steps:
S1: establishing a song database of the relevant bird species according to the recorded main bird species involved in grid bird-related faults and the actual condition of the power grid;
S2: preprocessing the samples in the sound database by denoising, framing, and windowing, wherein noise in the bird song signal is removed using improved spectral subtraction based on multi-window spectrum estimation, the frame length and frame shift are set, the bird song signal is divided into frames, and each frame is multiplied by a window function to improve continuity at both ends of the frame;
S3: calculating the energy of each frame of the bird song signal in each Mel filter to obtain the Mel energy of the song sample as an M×N matrix of signal energies, mapping each energy value to a color intensity to obtain the Mel spectrogram of the bird song signal, and dividing the Mel spectrograms into a training set, a validation set, and a test set;
S4: constructing a convolutional neural network classification model, training it iteratively with the training-set Mel spectrograms as input, evaluating the validation set during training to adjust the model's parameters, and ending training when the loss between the network's predicted output and the actual value reaches its minimum;
S5: using the trained network to predict and identify the bird species of the test samples, and outputting the corresponding species;
wherein the Mel spectrogram in S3 is calculated as follows: for a bird song signal of M frames, N Mel filters are set, and the Mel energy calculation yields an M×N matrix, which is colored according to energy to obtain the Mel spectrogram; the two axes of the Mel spectrogram are the frame index and the filter index, so only M×N values need to be calculated, which simplifies the output and reduces computation time;
and wherein the convolutional neural network in S4 comprises several convolution-pooling stages that capture Mel spectrogram features; the training set is trained by adjusting the network parameters and the number of iterations; after every fixed number of training rounds, the model makes one prediction pass over the validation set, and the parameters are adjusted according to the validation results so that the model is corrected toward higher prediction accuracy, until the loss function value of the network falls to its minimum and training ends.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110878327.9A CN113707159B (en) | 2021-08-02 | 2021-08-02 | Power grid bird-involved fault bird species identification method based on Mel language graph and deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110878327.9A CN113707159B (en) | 2021-08-02 | 2021-08-02 | Power grid bird-involved fault bird species identification method based on Mel language graph and deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113707159A (en) | 2021-11-26 |
CN113707159B (en) | 2024-05-03 |
Family
ID=78651104
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202110878327.9A | Active | CN113707159B (en) | 2021-08-02 | 2021-08-02 | Power grid bird-involved fault bird species identification method based on Mel language graph and deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113707159B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106093612A (en) * | 2016-05-26 | 2016-11-09 | 国网江苏省电力公司电力科学研究院 | A kind of method for diagnosing fault of power transformer |
CN107393542A (en) * | 2017-06-28 | 2017-11-24 | 北京林业大学 | A kind of birds species identification method based on binary channels neural network |
CN108197591A (en) * | 2018-01-22 | 2018-06-22 | 北京林业大学 | A kind of birds individual discrimination method based on multiple features fusion transfer learning |
CN109409308A (en) * | 2018-11-05 | 2019-03-01 | 中国科学院声学研究所 | A method of the birds species identification based on birdvocalization |
CN109979441A (en) * | 2019-04-03 | 2019-07-05 | 中国计量大学 | A kind of birds recognition methods based on deep learning |
CN110120224A (en) * | 2019-05-10 | 2019-08-13 | 平安科技(深圳)有限公司 | Construction method, device, computer equipment and the storage medium of bird sound identification model |
CN111626093A (en) * | 2020-03-27 | 2020-09-04 | 国网江西省电力有限公司电力科学研究院 | Electric transmission line related bird species identification method based on sound power spectral density |
WO2020177371A1 (en) * | 2019-03-06 | 2020-09-10 | 哈尔滨工业大学(深圳) | Environment adaptive neural network noise reduction method and system for digital hearing aids, and storage medium |
CN112331220A (en) * | 2020-11-17 | 2021-02-05 | 中国计量大学 | Bird real-time identification method based on deep learning |
WO2021051608A1 (en) * | 2019-09-20 | 2021-03-25 | 平安科技(深圳)有限公司 | Voiceprint recognition method and device employing deep learning, and apparatus |
Non-Patent Citations (1)
Title |
---|
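Bird species identification method based on Chirplet spectrogram features and deep learning; Xie Jiangjian; Li Wenbin; Zhang Junguo; Ding Changqing; Journal of Beijing Forestry University; Vol. 40 (No. 003); 122-127 * |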
Also Published As
Publication number | Publication date |
---|---|
CN113707159A (en) | 2021-11-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Clemins et al. | Automatic classification and speaker identification of African elephant (Loxodonta africana) vocalizations | |
Agamaite et al. | A quantitative acoustic analysis of the vocal repertoire of the common marmoset (Callithrix jacchus) | |
CN105611477B (en) | The voice enhancement algorithm that depth and range neutral net are combined in digital deaf-aid | |
CN109493874A (en) | A kind of live pig cough sound recognition methods based on convolutional neural networks | |
US5864803A (en) | Signal processing and training by a neural network for phoneme recognition | |
CN110718232B (en) | Speech enhancement method for generating countermeasure network based on two-dimensional spectrogram and condition | |
CN109087648A (en) | Sales counter voice monitoring method, device, computer equipment and storage medium | |
CN106847293A (en) | Facility cultivation sheep stress behavior acoustical signal monitoring method | |
CN113707158A (en) | Power grid harmful bird seed singing recognition method based on VGGish migration learning network | |
CN113850013B (en) | Ship radiation noise classification method | |
CN111626093B (en) | Method for identifying related bird species of power transmission line based on sound power spectral density | |
Schröter et al. | Segmentation, classification, and visualization of orca calls using deep learning | |
CN114373452A (en) | Voice abnormity identification and evaluation method and system based on deep learning | |
CN113111786A (en) | Underwater target identification method based on small sample training image convolutional network | |
Xiao et al. | AMResNet: An automatic recognition model of bird sounds in real environment | |
Liu et al. | Classification of cetacean whistles based on convolutional neural network | |
Castro et al. | Automatic manatee count using passive acoustics | |
CN113707159B (en) | Power grid bird-involved fault bird species identification method based on Mel language graph and deep learning | |
Du et al. | A tristimulus-formant model for automatic recognition of call types of laying hens. | |
Zhang et al. | A novel insect sound recognition algorithm based on MFCC and CNN | |
Chaves et al. | Katydids acoustic classification on verification approach based on MFCC and HMM | |
CN111091816B (en) | Data processing system and method based on voice evaluation | |
Mercado III et al. | Classifying animal sounds with neural networks | |
Mitra et al. | Interpreting DNN output layer activations: A strategy to cope with unseen data in speech recognition | |
CN114299925A (en) | Method and system for obtaining importance measurement index of dysphagia symptom of Parkinson disease patient based on voice |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||