CN113707158A - Method for recognizing the songs of bird species harmful to power grids based on a VGGish transfer learning network - Google Patents

Method for recognizing the songs of bird species harmful to power grids based on a VGGish transfer learning network

Info

Publication number
CN113707158A
Authority
CN
China
Prior art keywords
bird
vggish
network
singing
spectrogram
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110878305.2A
Other languages
Chinese (zh)
Inventor
邱志斌
王海祥
廖才波
卢祖文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanchang University
Original Assignee
Nanchang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanchang University
Priority to CN202110878305.2A
Publication of CN113707158A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/26 Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/20 Pattern transformations or operations aimed at increasing system robustness, e.g. against channel noise or different working conditions
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/21 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/45 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of analysis window
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/02 Preprocessing
    • G06F2218/04 Denoising
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/08 Feature extraction

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

The invention discloses a method for recognizing the songs of bird species harmful to power grids, based on a VGGish transfer learning network. First, an audio library of harmful bird species is built from the species recorded in historical bird-related faults and from surveys of bird species around the power grid. The bird song signals are then preprocessed (framing, windowing, deep learning noise reduction, and cutting), their spectrograms are computed, and each spectrogram is mapped onto a 64-filter Mel filter bank to obtain a Mel spectrogram, which serves as the network input. To address the weak generalization of traditional bird song recognition models caused by insufficient samples, transfer learning is adopted: a VGGish network pre-trained on the AudioSet data set extracts 128-dimensional bird song VGGish features, principal component analysis reduces their dimensionality, and a classification network then recognizes the transferred features. The method can effectively identify different bird species and helps achieve precise prevention and control of bird-related power grid faults.

Description

Method for recognizing the songs of bird species harmful to power grids based on a VGGish transfer learning network
Technical Field
The invention relates to the field of power transmission lines, and in particular to a method for recognizing the songs of bird species harmful to power grids based on a VGGish transfer learning network.
Background
Many kinds of birds are active around the power grid. Different species have different habits and therefore cause different types of faults. Bird-related faults fall into four main types: bird droppings, bird nests, bird pecking, and bird-induced short circuits. To effectively prevent grid fault trips caused by bird activity, prevention measures must be tailored to the bird species and fault type involved. However, because the necessary means of bird identification are lacking and grid operation and maintenance personnel know very little about the birds active around the grid, targeted prevention of bird-related faults is difficult. Intelligent identification of the bird species involved in bird-related grid faults is therefore necessary.
The common approaches to bird species identification are image recognition and bird song recognition. Image recognition identifies birds by shape, color, texture, and similar features, but performs poorly on birds in motion and on birds active at night. Bird song recognition analyzes bird song signals and classifies them according to differences between the songs of different species. Because traditional feature parameters are low-dimensional, they express bird song characteristics poorly, and traditional bird song recognition algorithms can identify only a few species. With the development of computer vision, audio signals can be visualized by converting them into time-frequency spectrograms, and bird songs can be recognized by feeding the spectrograms as features to a convolutional neural network. However, training such a network requires a large number of bird sound samples, and because song recordings of the bird species that endanger the power grid are difficult to acquire, the recognition results are unsatisfactory.
Disclosure of Invention
In view of the problems in the prior art, the invention aims to provide a method for recognizing the songs of bird species harmful to power grids based on a VGGish transfer learning network, so as to improve the recognition accuracy of such song signals and provide a reference for grid operation and maintenance personnel in preventing and controlling bird-related faults.
To achieve this purpose, the invention adopts the following technical scheme:
A method for recognizing the songs of bird species harmful to power grids based on a VGGish transfer learning network comprises the following steps:
S1: Establish an audio sample library of bird species harmful to the power grid, based on the species recorded in historical bird-related faults and on surveys of bird species around the power grid;
S2: Preprocess the audio and denoise the bird song audio using deep learning: a convolutional neural network is trained on noisy bird song signals and clean bird song signals to obtain a deep learning noise reduction model, which is then used to filter the noise out of the bird song signals;
S3: Compute the spectrogram of the bird song signal and obtain its Mel spectrogram; with the Mel spectrogram as network input, retrain a VGGish model pre-trained on the AudioSet data set and fine-tune the network weights to obtain a VGGish feature extraction network for bird song; use this network to extract bird song VGGish features that compactly summarize the song information;
S4: Reduce the dimensionality of the bird song VGGish features by principal component analysis, mapping the high-dimensional features to a lower dimension and re-expressing them in terms of principal components, which reduces the correlation between features and the interference from redundant features;
S5: Divide the dimension-reduced bird song VGGish features into a training set, a validation set, and a test set in a certain proportion; train the recognition network on the training set, tune the network parameters on the validation set to obtain a VGGish feature recognition model, test the recognition network on the test set, and output the recognition result.
Further, the preprocessing in S2 includes normalization, framing, windowing, and fast Fourier transform.
Further, in S2, deep learning is used for noise reduction: the spectral features of the denoised bird song signal are obtained by time-frequency masking, spectral mapping, and signal approximation methods, and the network parameters are further adjusted to obtain the noise reduction network model.
Further, in S3, the energy spectral density of each frame is calculated and a bird song spectrogram is generated, with the horizontal axis representing time, the vertical axis representing frequency, and the color depth representing energy spectral density; the spectrogram is then mapped onto a 64-filter Mel filter bank to generate a Mel spectrogram based on the human auditory mechanism.
Further, the VGGish feature recognition network in S5 includes a support vector machine, a convolutional neural network, and a long short-term memory network.
The beneficial effects of the invention are as follows:
The method overcomes the weak generalization of traditional bird song recognition models caused by insufficient samples. Based on the idea of transfer learning, it extracts 128-dimensional VGGish features that compactly summarize the bird song information. Combined with a classification network, these features give excellent results, effectively distinguish different bird species, and help achieve precise prevention and control of bird-related grid faults.
Drawings
FIG. 1 is a flow chart of the method for recognizing the songs of bird species harmful to power grids based on a VGGish transfer learning network;
FIG. 2 compares bird song signals before and after deep learning noise reduction in an embodiment of the invention;
FIG. 3 shows bird song Mel spectrograms in an embodiment of the invention;
FIG. 4 is a schematic diagram of the VGGish transfer learning network in an embodiment of the invention;
FIG. 5 shows the VGGish features of bird song signals in an embodiment of the invention;
FIG. 6 shows the VGGish features of bird song signals after dimensionality reduction in an embodiment of the invention;
FIG. 7 shows the recognition results for the song signals of 38 bird species in an embodiment of the invention.
Detailed Description
The invention is further described through the following embodiment, which is illustrative and should not be construed as limiting the scope of the invention.
The embodiment below covers preprocessing the song signals of typical bird species involved in power grid faults, extracting VGGish features, and classifying and recognizing them; the flow chart is shown in FIG. 1. The method comprises the following steps:
S1: First, based on the species recorded in historical bird-related faults and on surveys of bird species around the power grid, 18 high-risk bird species, 18 slightly harmful bird species, and 2 harmless bird species are selected, 38 species in total. Audio recordings of these species are collected to build a bird song audio library. The names and sample counts of the high-risk and slightly harmful species are listed in Table 1.
TABLE 1
[Table 1 appears as an image in the original publication and is not reproduced in this text.]
S2: a large number of bird singing signals with noises related to power grid faults and bird singing signals without noises are used for training a convolutional neural network to obtain a bird singing noise reduction network model, and the model is used for carrying out noise reduction on 38 bird singing signals, wherein the noise reduction of typical bird audio signals is carried out before and after, for example, the noise reduction of audio signals of osprey, magpie, falcon, phoenix-headed wheat, aigren and azalea in a comparison mode before and after the noise reduction of the audio signals of fig. 2, (a) - (f) are respectively osprey, magpie, red falcon, phoenix-headed wheat, gren and azalea.
S3: the drawing process of the spectrogram comprises framing, windowing, fast Fourier transform, energy spectrum density calculation and spectrogram drawing. Framing the bird song signal in 20ms duration, performing windowing by using a continuous Hanning window, and calculating the energy spectrum density by the following steps
Ei(k)=[X′i(k)]2 (1)
Wherein, X'i(k) Representing the noise reduced chirp frequency domain signal. Calculating the energy spectral density of each frame of the bird song signal through a formula (1), coloring according to the calculated numerical value to generate a corresponding spectrogram, and then superposing the spectrogram correspondingly generated by each frame in 10ms frames to obtain a complete bird song signal spectrogram.
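The framing, windowing, FFT, and energy calculation described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the applicants' implementation: the 16 kHz sample rate is assumed (the text gives none), the 10 ms figure is read as the frame shift, and the square in formula (1) is taken as the squared magnitude of the complex FFT output. The function name `energy_spectrogram` is hypothetical.

```python
import numpy as np

def energy_spectrogram(signal, sample_rate, frame_ms=20.0, hop_ms=10.0):
    """Return an array of shape (num_frames, num_bins) holding |X_i(k)|^2."""
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    num_frames = 1 + (len(signal) - frame_len) // hop_len
    window = np.hanning(frame_len)
    # Cut overlapping frames and apply the Hanning window to each one.
    frames = np.stack([
        signal[i * hop_len: i * hop_len + frame_len] * window
        for i in range(num_frames)
    ])
    spectrum = np.fft.rfft(frames, axis=1)  # one-sided FFT of every frame
    return np.abs(spectrum) ** 2            # energy per frame and frequency bin

# One second of a 2 kHz test tone at an assumed 16 kHz sample rate.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 2000.0 * t)

spec = energy_spectrogram(tone, sample_rate)
peak_bin = int(np.argmax(spec.mean(axis=0)))
```

With 20 ms frames at 16 kHz, each frame has 320 samples and the bin spacing is 50 Hz, so the 2 kHz tone peaks in bin 40. Plotting `spec.T` with time on the horizontal axis and frequency on the vertical axis gives the spectrogram image.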
S4: the Mel spectrogram is a spectral image based on the auditory characteristics of human ears. Because the actual frequency is used in the spectrogram calculating process, the spectrogram has low discrimination in the frequency domain and is easily influenced by the masking effect. In order to reduce masking effect and improve frequency domain discrimination, a group of filter banks based on human auditory mechanism is arranged, actual frequency in a spectrogram is converted into perception frequency based on human auditory, the group of filters is called Mel filter bank, and the expression of the Mel filter bank is
Figure BDA0003191031290000042
In the formula, Hm(k) For the frequency response of the triangular filter, m denotes the mth filter, f (m) is the center frequency of the triangular filter, which is defined as:
Figure BDA0003191031290000043
in the formula (f)lIs the lowest frequency of the filter; f. ofhIs the highest frequency of the filter; n is the length of the fast Fourier transform; f. ofsIs the audio sampling frequency;
Figure BDA0003191031290000044
is FmelInverse function of (1), FmelIn relation to the actual frequency of
Figure BDA0003191031290000051
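The filter response, center frequency, and Mel-scale formulas referred to above are not legible in this text. For orientation only, the commonly used textbook form of a triangular Mel filter bank, written with the symbols defined above, is given below. It is an assumption that the publication uses this form; its normalization and constants may differ. The symbol $M$ (the number of filters) and the argument $f$ (actual frequency in Hz) are introduced here for illustration.

```latex
H_m(k) =
\begin{cases}
0, & k < f(m-1) \\[4pt]
\dfrac{k - f(m-1)}{f(m) - f(m-1)}, & f(m-1) \le k \le f(m) \\[8pt]
\dfrac{f(m+1) - k}{f(m+1) - f(m)}, & f(m) < k \le f(m+1) \\[8pt]
0, & k > f(m+1)
\end{cases}

f(m) = \frac{N}{f_s}\, F_{mel}^{-1}\!\left( F_{mel}(f_l) + m\,\frac{F_{mel}(f_h) - F_{mel}(f_l)}{M + 1} \right)

F_{mel}(f) = 2595 \log_{10}\!\left( 1 + \frac{f}{700} \right)
```

In this form each filter rises linearly from the center of its left neighbor to its own center and falls linearly to the center of its right neighbor, and the centers are spaced evenly on the Mel scale between $f_l$ and $f_h$.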
A Mel filter bank of 64 filters is set up, and the actual frequencies in the spectrogram are mapped onto it to generate the Mel spectrogram. The Mel spectrogram is then divided into segments of 0.96 s, and each segment is re-framed into 10 ms frames without overlap, giving 96 frames, so each resulting Mel spectrogram has size 96 × 64. Mel spectrograms of typical bird species are shown in FIG. 3; panels (a)-(f) show the osprey, magpie, common kestrel, northern lapwing, heron, and cuckoo, respectively.
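The mapping onto 64 Mel filters and the split into 96 × 64 segments can be sketched as follows. Several details are assumptions that the text does not state: the 16 kHz sample rate, the 125 Hz and 7500 Hz band edges, the 257-bin spectrum, and the common 2595·log10(1 + f/700) Mel formula. The triangles are drawn on the Mel axis, which is one common variant. The random array stands in for a real energy spectrogram, and the function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    """Common Mel-scale formula (assumed; the source formula is not legible)."""
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_filter_bank(num_filters, num_bins, sample_rate, f_low, f_high):
    """Triangular filters evenly spaced on the Mel axis, shape (num_filters, num_bins)."""
    bin_mel = hz_to_mel(np.linspace(0.0, sample_rate / 2.0, num_bins))
    edges = np.linspace(hz_to_mel(f_low), hz_to_mel(f_high), num_filters + 2)
    bank = np.zeros((num_filters, num_bins))
    for m in range(num_filters):
        lower, center, upper = edges[m], edges[m + 1], edges[m + 2]
        rising = (bin_mel - lower) / (center - lower)
        falling = (upper - bin_mel) / (upper - center)
        # Triangle: the smaller of the two slopes, clipped at zero outside the band.
        bank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def to_patches(mel_spec, frames_per_patch=96):
    """Cut a (frames, bands) Mel spectrogram into non-overlapping 0.96 s patches."""
    num = mel_spec.shape[0] // frames_per_patch
    trimmed = mel_spec[: num * frames_per_patch]
    return trimmed.reshape(num, frames_per_patch, mel_spec.shape[1])

rng = np.random.default_rng(0)
energy = rng.random((300, 257))   # stand-in: 300 frames of 10 ms (3 s), 257 bins
bank = mel_filter_bank(64, 257, 16000, 125.0, 7500.0)
mel_spec = energy @ bank.T        # (300, 64) Mel spectrogram
patches = to_patches(mel_spec)    # (3, 96, 64); the trailing 12 frames are dropped
```

Each entry of `patches` is one 96 × 64 input of the kind the text describes; a channel axis of size 1 would be appended before feeding a convolutional network.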
S5: the VGGish migration learning network is a VGG-like model trained on AudioSet data sets. The input size of the network is changed to 96 × 64 × 1 and the last set of convolutions and the maximum pooling layer are removed, the network is composed of 4 sets of convolutions, 4 pooling layers, 8 relus, and 3 full-link layers. The convolution kernel size is 3 multiplied by 3 in the convolution process, the step length is 1, the input and output sizes are kept unchanged after convolution, and the number of channels is increased. The pooling process has pooling kernel size of 2 × 2 and step size of 2, and the output is changed to input size 1/2 after pooling, with unchanged depth. The size of the last full connection layer of VGGish is also changed from 1000 to 128, the VGGish acts as an embedded layer, 128-dimensional VGGish features are finally output, and the network structure is shown in FIG. 4. Taking a Mel spectrogram with the size of 96 multiplied by 64 multiplied by 1 generated by a bird song signal as the input of a VGGish migration learning network, training network parameters and extracting 128-dimensional bird song VGGish characteristics, wherein the network output format is [ Num, 128], and Num is expressed as
Figure BDA0003191031290000052
Wherein 0.96 represents the time length of each Mel spectrogram, and typical bird species VGGish characteristics are shown in FIG. 5, (a) - (f) are the VGGish characteristics of osprey, magpie, falcon, phoenix-headed wheat chicken, heron and azalea respectively.
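The size bookkeeping in this step can be checked with a short calculation. The sketch below makes two assumptions that the text does not state: Num is taken as the number of whole 0.96 s segments in a clip (the duration divided by 0.96, rounded down), and the channel counts of the four convolution groups (64, 128, 256, 512) are taken from the publicly released VGGish model. The function names are hypothetical.

```python
import math

def num_patches(duration_s, patch_s=0.96):
    """Number of whole 0.96 s Mel-spectrogram segments in a clip (assumes flooring)."""
    return math.floor(duration_s / patch_s)

def vggish_feature_map_shapes(height=96, width=64, channels=(64, 128, 256, 512)):
    """Track (height, width, depth) after each convolution group plus pooling layer."""
    shapes = []
    for depth in channels:
        # 3x3 convolution, stride 1, same padding: height and width are unchanged
        # and the depth becomes `depth`.
        # 2x2 max pooling, stride 2: height and width are halved, depth unchanged.
        height, width = height // 2, width // 2
        shapes.append((height, width, depth))
    return shapes

shapes = vggish_feature_map_shapes()
# shapes == [(48, 32, 64), (24, 16, 128), (12, 8, 256), (6, 4, 512)]
flattened = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]  # 12288 values enter the dense layers
output_shape = (num_patches(10.0), 128)                    # (10, 128) for a 10 s clip
```

Under these assumptions a 10 s recording yields 10 segments, so the extracted feature matrix has the format [10, 128].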
S6: the VGGish features generated by the bird song signals have a plurality of single features with zero values and do not contain useful information, so that feature dimensionality reduction can be carried out on the generated features by utilizing a principal component analysis method, and the VGGish features after typical bird species feature dimensionality reduction are shown in fig. 6, wherein (a) - (f) are respectively the VGGish features after feature dimensionality reduction of osprey, magpie, falcon, phoenix-headed chicken, heron and azalea.
S7: after the bird audio VGGish features are extracted, the bird audio VGGish features can be used as input features of other recognition networks, recognition is carried out by utilizing a convolutional neural network, a long-term and short-term memory network and a support vector machine, and classification can also be carried out by directly connecting a softmax layer.
S8: dividing the data set according to the proportion of the training set, the verification set and the test set of 6:2:2, retraining the VGGish migration learning network by using a bird audio database, wherein the audio overall recognition accuracy of the 38 bird test sets reaches 94.43%, and the recognition results of the 38 bird singing signals are shown in figure 7.
Although specific embodiments of the present invention have been described above with reference to the accompanying drawings, it will be appreciated by those skilled in the art that these are merely illustrative and that various changes or modifications may be made to these embodiments without departing from the principles and spirit of the invention. The scope of the invention is only limited by the appended claims.

Claims (4)

1. A method for recognizing the songs of bird species harmful to power grids based on a VGGish transfer learning network, characterized by comprising the following steps:
S1: Establish an audio sample library of bird species harmful to the power grid, based on the species recorded in historical bird-related faults and on surveys of bird species around the power grid;
S2: Preprocess the audio and denoise the bird song audio using deep learning: a convolutional neural network is trained on noisy bird song signals and clean bird song signals to obtain a deep learning noise reduction model, which is then used to filter the noise out of the bird song signals;
S3: Compute the spectrogram of the bird song signal and obtain its Mel spectrogram; with the Mel spectrogram as network input, retrain a VGGish model pre-trained on the AudioSet data set and fine-tune the network weights to obtain a VGGish feature extraction network for bird song; use this network to extract bird song VGGish features that compactly summarize the song information;
S4: Reduce the dimensionality of the bird song VGGish features by principal component analysis, mapping the high-dimensional features to a lower dimension and re-expressing them in terms of principal components, which reduces the correlation between features and the interference from redundant features;
S5: Divide the dimension-reduced bird song VGGish features into a training set, a validation set, and a test set in a certain proportion; train the recognition network on the training set, tune the network parameters on the validation set to obtain a VGGish feature recognition model, test the recognition network on the test set, and output the recognition result.
2. The method for recognizing the songs of bird species harmful to power grids based on a VGGish transfer learning network of claim 1, wherein: in S2, normalization, framing, windowing, and fast Fourier transform preprocessing are performed on the bird song signal.
3. The method for recognizing the songs of bird species harmful to power grids based on a VGGish transfer learning network of claim 1, wherein: in S3, the energy spectral density of each frame is calculated and a bird song spectrogram is generated, with time on the horizontal axis, frequency on the vertical axis, and color depth representing energy spectral density; the spectrogram is mapped onto a 64-filter Mel filter bank to generate a Mel spectrogram based on the human auditory mechanism.
4. The method for recognizing the songs of bird species harmful to power grids based on a VGGish transfer learning network of claim 1, wherein: the recognition network in S5 comprises a support vector machine, a convolutional neural network, and a long short-term memory network.
CN202110878305.2A 2021-08-02 2021-08-02 Method for recognizing the songs of bird species harmful to power grids based on a VGGish transfer learning network Pending CN113707158A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110878305.2A CN113707158A (en) 2021-08-02 2021-08-02 Method for recognizing the songs of bird species harmful to power grids based on a VGGish transfer learning network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110878305.2A CN113707158A (en) 2021-08-02 2021-08-02 Method for recognizing the songs of bird species harmful to power grids based on a VGGish transfer learning network

Publications (1)

Publication Number Publication Date
CN113707158A true CN113707158A (en) 2021-11-26

Family

ID=78651107

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110878305.2A Pending CN113707158A (en) 2021-08-02 2021-08-02 Method for recognizing the songs of bird species harmful to power grids based on a VGGish transfer learning network

Country Status (1)

Country Link
CN (1) CN113707158A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114067368A (en) * 2022-01-17 2022-02-18 国网江西省电力有限公司电力科学研究院 Power grid harmful bird species classification and identification method based on deep convolution characteristics
CN114863937A (en) * 2022-05-17 2022-08-05 武汉工程大学 Hybrid birdsong identification method based on deep migration learning and XGboost
CN117238299A (en) * 2023-11-14 2023-12-15 国网山东省电力公司电力科学研究院 Method, system, medium and equipment for optimizing bird voice recognition model of power transmission line
CN117727309A (en) * 2024-02-18 2024-03-19 百鸟数据科技(北京)有限责任公司 Automatic identification method for bird song species based on TDNN structure

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107369451A (en) * 2017-07-18 2017-11-21 北京市计算中心 A kind of birds sound identification method of the phenology research of auxiliary avian reproduction phase
CN109117732A (en) * 2018-07-16 2019-01-01 国网江西省电力有限公司电力科学研究院 A kind of transmission line of electricity relates to the identification of bird failure bird kind figure sound and control method
CN110246504A (en) * 2019-05-20 2019-09-17 平安科技(深圳)有限公司 Birds sound identification method, device, computer equipment and storage medium
CN111833895A (en) * 2019-04-23 2020-10-27 北京京东尚科信息技术有限公司 Audio signal processing method, apparatus, computer device and medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107369451A (en) * 2017-07-18 2017-11-21 北京市计算中心 A kind of birds sound identification method of the phenology research of auxiliary avian reproduction phase
CN109117732A (en) * 2018-07-16 2019-01-01 国网江西省电力有限公司电力科学研究院 A kind of transmission line of electricity relates to the identification of bird failure bird kind figure sound and control method
CN111833895A (en) * 2019-04-23 2020-10-27 北京京东尚科信息技术有限公司 Audio signal processing method, apparatus, computer device and medium
CN110246504A (en) * 2019-05-20 2019-09-17 平安科技(深圳)有限公司 Birds sound identification method, device, computer equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SRUTHI KURADA ET AL.: "Poster:VGGish Embeddings Based Audio Classifiers to Improve Parkinson’s Disease Diagnosis", 2020 IEEE/ACM INTERNATIONAL CONFERENCE ON CONNECTED HEALTH:APPLICATION,SYSTEMS AND ENGINEERING TECHNOLOGY(CHASE), pages 54 - 10 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114067368A (en) * 2022-01-17 2022-02-18 国网江西省电力有限公司电力科学研究院 Power grid harmful bird species classification and identification method based on deep convolution characteristics
CN114863937A (en) * 2022-05-17 2022-08-05 武汉工程大学 Hybrid birdsong identification method based on deep migration learning and XGboost
CN117238299A (en) * 2023-11-14 2023-12-15 国网山东省电力公司电力科学研究院 Method, system, medium and equipment for optimizing bird voice recognition model of power transmission line
CN117238299B (en) * 2023-11-14 2024-01-30 国网山东省电力公司电力科学研究院 Method, system, medium and equipment for optimizing bird voice recognition model of power transmission line
CN117727309A (en) * 2024-02-18 2024-03-19 百鸟数据科技(北京)有限责任公司 Automatic identification method for bird song species based on TDNN structure
CN117727309B (en) * 2024-02-18 2024-04-26 百鸟数据科技(北京)有限责任公司 Automatic identification method for bird song species based on TDNN structure

Similar Documents

Publication Publication Date Title
CN113707158A (en) Method for recognizing the songs of bird species harmful to power grids based on a VGGish transfer learning network
CN105023573B (en) It is detected using speech syllable/vowel/phone boundary of auditory attention clue
CN110718232B (en) Speech enhancement method for generating countermeasure network based on two-dimensional spectrogram and condition
CN108630209B (en) Marine organism identification method based on feature fusion and deep confidence network
CN111724770B (en) Audio keyword identification method for generating confrontation network based on deep convolution
CN108490349A (en) Motor abnormal sound detection method based on Mel frequency cepstral coefficients
CN112465069B (en) Electroencephalogram emotion classification method based on multi-scale convolution kernel CNN
WO2022088643A1 (en) Fault diagnosis method and apparatus for buried transformer substation, and electronic device
CN102982351A (en) Porcelain insulator vibrational acoustics test data sorting technique based on back propagation (BP) neural network
CN114863937A (en) Hybrid birdsong identification method based on deep migration learning and XGboost
CN112820275A (en) Automatic monitoring method for analyzing abnormality of suckling piglets based on sound signals
CN116189681B (en) Intelligent voice interaction system and method
CN111626093B (en) Method for identifying related bird species of power transmission line based on sound power spectral density
CN116861303A (en) Digital twin multisource information fusion diagnosis method for transformer substation
CN115376526A (en) Power equipment fault detection method and system based on voiceprint recognition
CN111933186B (en) Method, device and system for fault identification of on-load tap-changer
CN112329819A (en) Underwater target identification method based on multi-network fusion
CN113850013A (en) Ship radiation noise classification method
CN111860246A (en) Deep convolutional neural network-oriented data expansion method for heart sound signal classification
CN111462770A (en) L STM-based late reverberation suppression method and system
CN114818832A (en) Multi-scale feature fusion transformer voiceprint classification method
Qiu et al. Sound Recognition of Harmful Bird Species Related to Power Grid Faults Based on VGGish Transfer Learning
CN114077851A (en) FSVC-based ball mill working condition identification method
CN113611331A (en) Transformer voiceprint anomaly detection method
Unluturk et al. Emotion recognition using neural networks

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination