CN109979441A - A bird recognition method based on deep learning - Google Patents
A bird recognition method based on deep learning
- Publication number
- CN109979441A (application CN201910264817.2A)
- Authority
- CN
- China
- Prior art keywords
- birds
- time
- deep learning
- recognition methods
- methods based
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/16—Speech classification or search using artificial neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biomedical Technology (AREA)
- Multimedia (AREA)
- Molecular Biology (AREA)
- Evolutionary Biology (AREA)
- General Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Acoustics & Sound (AREA)
- Human Computer Interaction (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Biodiversity & Conservation Biology (AREA)
- Image Analysis (AREA)
Abstract
The present invention relates to a bird recognition method based on deep learning and belongs to the technical field of bird vocalization recognition. The method mainly comprises the following steps: first, time-frequency analysis is performed on the songs of different bird species to obtain their time-frequency spectra; next, a convolutional neural network extracts image features from the time-frequency spectra; finally, a classifier performs bird classification and recognition according to those features. The method strongly suppresses cross-term interference and has high resolution. It extracts the varied and changeable syllable characteristics of birds as the basis for classification, so the feature parameters are more representative and are only weakly affected by environmental noise.
Description
Technical field
The present invention relates to a bird recognition method based on deep learning and belongs to the technical field of bird vocalization recognition.
Background art
Song is an important biological characteristic of birds. Like their morphological features, the songs of birds differ between species as a result of divergent evolution, which makes it feasible to identify birds by their song.
Although bird vocalization recognition has produced many research results in recent years, the field as a whole has developed relatively slowly and the existing methods have limitations. Research has concentrated mainly on the selection of feature parameters and on classification models. Common feature parameters include amplitude, frequency, syllable length, sonogram, spectrogram, short-time energy, linear prediction cepstral coefficients (LPCC) and Mel-frequency cepstral coefficients (MFCC). Common recognition methods and classification models include the dynamic time warping (DTW) algorithm, the error back-propagation (BP) algorithm, the hidden Markov model (HMM) and the Gaussian mixture model (GMM). These approaches suffer from problems such as insufficiently representative feature parameters and strong sensitivity to environmental noise.
Summary of the invention
To address the shortcomings of existing methods, the present invention provides a bird recognition method based on deep learning. The method strongly suppresses cross-term interference and has high resolution. It extracts the varied and changeable song characteristics of birds as the basis for classification, so the feature parameters are more representative and are little affected by environmental noise. The convolutional network is integrated into software and is relatively simple to operate, and the recognition accuracy can increase as the number of training samples for the convolutional neural network grows.
The present invention is realized by the following scheme: a bird recognition method based on deep learning, characterized by comprising the following steps:
Step 1: collect the songs of different bird species and, after pre-processing the sound signals, assemble the segments that contain effective syllables into a sample database;
Step 2: normalize and pre-emphasize the sample data, then obtain the time-frequency spectrum with a time-frequency analysis algorithm;
Step 3: extract image features from the time-frequency spectrum with a convolutional neural network;
Step 4: use a classifier to classify and recognize the birds according to the features.
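Step 2 can be illustrated with a minimal sketch. It assumes peak normalization and the first-order pre-emphasis filter y[n] = x[n] - a*x[n-1], with a = 0.9375 as given in the embodiment. The adaptive optimal kernel (AOK) analysis is not reproduced here: a plain short-time Fourier transform with a Hann window stands in for it, and the function names, frame length, hop size and synthetic test tone are all illustrative assumptions rather than part of the patent.

```python
import cmath
import math


def normalize(signal):
    """Scale the signal so its largest absolute sample is 1 (assumed peak normalization)."""
    peak = max(abs(s) for s in signal)
    if peak == 0:
        return list(signal)
    return [s / peak for s in signal]


def pre_emphasis(signal, coeff=0.9375):
    """First-order high-pass filter: y[n] = x[n] - coeff * x[n - 1]."""
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - coeff * signal[n - 1])
    return out


def stft_energy(signal, frame_len=64, hop=32):
    """Stand-in for AOK (assumption): squared-magnitude STFT.

    Returns one row per frame and one column per frequency bin.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(abs(acc) ** 2)
        rows.append(row)
    return rows


# Synthetic 1 kHz tone sampled at 8 kHz, used here in place of a real recording.
fs = 8000
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / fs) for n in range(512)]
spectrum = stft_energy(pre_emphasis(normalize(tone)))
peak_bin = max(range(len(spectrum[0])), key=lambda k: spectrum[0][k])
print(len(spectrum), len(spectrum[0]), peak_bin)
```

The resulting matrix plays the role of the two-dimensional time-frequency spectrum that the later steps treat as an image; an actual implementation of the method would replace `stft_energy` with an AOK routine.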
Compared with conventional methods, the present invention addresses the problem that the duration of song segments varies greatly. Pre-processing denoises the signal and cuts out the various segments that contain complete-period song and call syllables; the effective signal data are then normalized and pre-emphasized, which improves processing efficiency to some extent. The adaptive optimal kernel (AOK) time-frequency representation is used for time-frequency analysis. It has high time-frequency resolution and strongly suppresses cross-term interference, so it can accurately present the time-domain, frequency-domain and energy characteristics of the signal. The data-mining capability of the convolutional neural network allows the features of the time-frequency image to be extracted accurately: after the time-frequency image is converted to gray scale, the implemented convolutional neural network algorithm extracts its features, with the gray-scale image as input and the bird species as output, and the network is trained to obtain the optimal network. A Softmax regression classifier then makes the influence of the features on the recognition result multiplicative, which improves recognition accuracy.
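The multiplicative effect of the Softmax classifier can be made explicit. Assuming the standard Softmax regression form (the symbols x, w_k, b_k and K are introduced here for illustration and do not appear in the original text), the probability of species k for a feature vector x is:

```latex
P(y = k \mid x) = \frac{\exp\left(w_k^{\top} x + b_k\right)}{\sum_{j=1}^{K} \exp\left(w_j^{\top} x + b_j\right)},
\qquad
\exp\left(w_k^{\top} x + b_k\right) = e^{b_k} \prod_{i} \exp\left(w_{k,i}\, x_i\right).
```

Each feature x_i therefore contributes a factor, rather than an additive term, to the unnormalized score of each species.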
Brief description of the drawings
Fig. 1 is the overall flowchart of the method.
Fig. 2 is a schematic diagram of the convolutional neural network structure used in the method.
Detailed description of the embodiments
The bird recognition method based on deep learning of the present invention is further described below with reference to the drawings. As shown in Fig. 1, the method comprises six parts: building the bird song sample database, sample pre-processing, time-frequency analysis, gray-scale conversion of the time-frequency spectrum, feature extraction by the convolutional neural network, and the Softmax regression classifier. The specific steps are as follows:
Step 1: collect the songs of different bird species, denoise and cut the sound signals, and assemble the segments that contain complete-period syllables into a separate sample database for each species; for every species, randomly select an equal number of these segments as training samples;
Step 2: implement the adaptive optimal kernel algorithm and set the relevant parameters; normalize and pre-emphasize the training sample data, with the pre-emphasis coefficient set to 0.9375; then obtain the time-frequency spectrum with the adaptive optimal kernel time-frequency analysis algorithm and convert the image to gray scale to obtain a gray-level matrix; to reduce the computational load of the neural network, resize the image, here to 64*64;
Step 3: as shown in Fig. 2, a single-layer convolutional neural network is used here; based on experiments, the convolutional layer uses 10 convolution kernels of size 7*7, the sub-sampling layer uses a 2*2 sampling matrix, and the fully connected layer is fully connected to the feature maps; the gray-scale images obtained from time-frequency analysis of the training samples are fed into the convolutional neural network to extract image features, and, with the bird species as the target output, the network is trained to obtain the optimal network;
Step 4: use the Softmax regression classifier to classify and recognize the birds according to the features.
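The gray-scale conversion and resizing described in Step 2 might look like the sketch below. It assumes the time-frequency result is a two-dimensional matrix of energies, maps it linearly to 8-bit gray levels, and resizes it with nearest-neighbor sampling. The patent does not state the scaling rule or the interpolation method, so both are assumptions, as are the function names and the toy input matrix.

```python
def to_gray(matrix):
    """Linearly map a 2-D matrix to integer gray levels in 0..255 (assumed scaling)."""
    lo = min(min(row) for row in matrix)
    hi = max(max(row) for row in matrix)
    span = hi - lo
    if span == 0:
        return [[0 for _ in row] for row in matrix]
    return [[round(255 * (v - lo) / span) for v in row] for row in matrix]


def resize_nearest(image, out_h=64, out_w=64):
    """Resize a 2-D image by nearest-neighbor sampling (assumed interpolation)."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# Toy 128 x 96 "time-frequency" matrix whose energy grows along both axes.
tf_matrix = [[float(r + c) for c in range(96)] for r in range(128)]
gray = resize_nearest(to_gray(tf_matrix))
print(len(gray), len(gray[0]), gray[0][0], gray[-1][-1])
```

Fixing the input at 64*64 keeps the number of network inputs constant regardless of how long the original song segment was.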
The specific embodiment described above explains the purpose, technical solution and beneficial effects of the present invention in further detail. It should be understood that the foregoing is merely a specific embodiment of the invention and is not intended to limit its scope of protection. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the invention.
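The layer sizes implied by Step 3 of the embodiment can be checked with a few lines of arithmetic. The sketch assumes an unpadded convolution with stride 1 and non-overlapping 2*2 sub-sampling, neither of which the text specifies; the helper names and the example class scores are hypothetical. A numerically stable Softmax is included to show how Step 4 turns class scores into probabilities.

```python
import math


def conv_output_size(in_size, kernel, stride=1, padding=0):
    """Side length of a feature map after a square convolution."""
    return (in_size + 2 * padding - kernel) // stride + 1


def pool_output_size(in_size, window):
    """Side length after non-overlapping sub-sampling."""
    return in_size // window


def softmax(scores):
    """Numerically stable Softmax: subtract the maximum before exponentiating."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


conv_side = conv_output_size(64, 7)          # 64*64 input, 7*7 kernels
pool_side = pool_output_size(conv_side, 2)   # 2*2 sub-sampling
n_features = 10 * pool_side * pool_side      # 10 feature maps, flattened

probs = softmax([2.0, 1.0, 0.1])             # hypothetical scores for three species
predicted = max(range(len(probs)), key=lambda k: probs[k])
print(conv_side, pool_side, n_features, predicted)
```

Under these assumptions the convolution yields 58*58 feature maps, sub-sampling reduces them to 29*29, and the fully connected layer receives 10 * 29 * 29 = 8410 values.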
Claims (4)
1. A bird recognition method based on deep learning, characterized by comprising the following steps:
Step 1: collect the songs of different bird species and, after pre-processing the sound signals, assemble the segments that contain effective syllables into a sample database;
Step 2: normalize and pre-emphasize the sample data, then obtain the time-frequency spectrum with a time-frequency analysis algorithm;
Step 3: extract image features from the time-frequency spectrum with a convolutional neural network;
Step 4: use a classifier to classify and recognize the birds according to the features;
The bird recognition method based on deep learning according to claim 1, characterized in that the sound signal pre-processing in Step 1 comprises noise reduction and cutting, and the features of the effective syllables are random and diverse.
2. The bird recognition method based on deep learning according to claim 1, characterized in that the time-frequency analysis algorithm in Step 2 converts a one-dimensional time-series signal into a two-dimensional time-frequency spectrum that contains energy information, and the time-frequency analysis methods include but are not limited to the wavelet transform and the adaptive optimal kernel.
3. The bird recognition method based on deep learning according to claim 1, characterized in that the convolutional neural network in Step 3 first takes the gray-scale time-frequency spectrum as input and extracts image features, and the network is trained with the known bird species as the target output.
4. The bird recognition method based on deep learning according to claim 1, characterized in that the classifier in Step 4 is a Softmax regression classifier.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910264817.2A CN109979441A (en) | 2019-04-03 | 2019-04-03 | A bird recognition method based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910264817.2A CN109979441A (en) | 2019-04-03 | 2019-04-03 | A bird recognition method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109979441A (en) | 2019-07-05 |
Family
ID=67082665
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910264817.2A Pending CN109979441A (en) | 2019-04-03 | 2019-04-03 | A bird recognition method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109979441A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110515084A (en) * | 2019-07-29 | 2019-11-29 | 生态环境部南京环境科学研究所 | A kind of field birds tag number estimate method based on acoustic imaging technology |
CN111398965A (en) * | 2020-04-09 | 2020-07-10 | 电子科技大学 | Danger signal monitoring method and system based on intelligent wearable device and wearable device |
CN112686293A (en) * | 2020-12-25 | 2021-04-20 | 广东电网有限责任公司中山供电局 | Bird intelligent identification method and system based on GMM identification model |
CN113707159A (en) * | 2021-08-02 | 2021-11-26 | 南昌大学 | Electric network bird-involved fault bird species identification method based on Mel language graph and deep learning |
CN114863938A (en) * | 2022-05-24 | 2022-08-05 | 西南石油大学 | Bird language identification method and system based on attention residual error and feature fusion |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104658538A (en) * | 2013-11-18 | 2015-05-27 | 中国计量学院 | Mobile bird recognition method based on birdsong |
CN106653032A (en) * | 2016-11-23 | 2017-05-10 | 福州大学 | Animal sound detecting method based on multiband energy distribution in low signal-to-noise-ratio environment |
CN106821337A (en) * | 2017-04-13 | 2017-06-13 | 南京理工大学 | A kind of sound of snoring source title method for having a supervision |
CN107393542A (en) * | 2017-06-28 | 2017-11-24 | 北京林业大学 | A kind of birds species identification method based on binary channels neutral net |
CN107492383A (en) * | 2017-08-07 | 2017-12-19 | 上海六界信息技术有限公司 | Screening technique, device, equipment and the storage medium of live content |
CN108197591A (en) * | 2018-01-22 | 2018-06-22 | 北京林业大学 | A kind of birds individual discrimination method based on multiple features fusion transfer learning |
CN108509939A (en) * | 2018-04-18 | 2018-09-07 | 北京大学深圳研究生院 | A kind of birds recognition methods based on deep learning |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110515084A (en) * | 2019-07-29 | 2019-11-29 | 生态环境部南京环境科学研究所 | A kind of field birds tag number estimate method based on acoustic imaging technology |
CN111398965A (en) * | 2020-04-09 | 2020-07-10 | 电子科技大学 | Danger signal monitoring method and system based on intelligent wearable device and wearable device |
CN112686293A (en) * | 2020-12-25 | 2021-04-20 | 广东电网有限责任公司中山供电局 | Bird intelligent identification method and system based on GMM identification model |
CN113707159A (en) * | 2021-08-02 | 2021-11-26 | 南昌大学 | Electric network bird-involved fault bird species identification method based on Mel language graph and deep learning |
CN113707159B (en) * | 2021-08-02 | 2024-05-03 | 南昌大学 | Power grid bird-involved fault bird species identification method based on Mel language graph and deep learning |
CN114863938A (en) * | 2022-05-24 | 2022-08-05 | 西南石油大学 | Bird language identification method and system based on attention residual error and feature fusion |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109979441A (en) | A bird recognition method based on deep learning | |
Ma et al. | Short utterance based speech language identification in intelligent vehicles with time-scale modifications and deep bottleneck features | |
CN101136199B (en) | Voice data processing method and equipment | |
Mannepalli et al. | MFCC-GMM based accent recognition system for Telugu speech signals | |
CN118711564A (en) | Synthesizing speech from text using neural networks with the voice of a target speaker | |
CN101261832B (en) | Extraction and modeling method for Chinese speech sensibility information | |
CN107610707A (en) | A kind of method for recognizing sound-groove and device | |
CN102982803A (en) | Isolated word speech recognition method based on HRSF and improved DTW algorithm | |
CN102568476B (en) | Voice conversion method based on self-organizing feature map network cluster and radial basis network | |
CN104835498A (en) | Voiceprint identification method based on multi-type combination characteristic parameters | |
CN104900235A (en) | Voiceprint recognition method based on pitch period mixed characteristic parameters | |
CN112331220A (en) | Bird real-time identification method based on deep learning | |
CN102411932B (en) | Methods for extracting and modeling Chinese speech emotion in combination with glottis excitation and sound channel modulation information | |
CN102592607A (en) | Voice converting system and method using blind voice separation | |
CN102237083A (en) | Portable interpretation system based on WinCE platform and language recognition method thereof | |
CN111916064A (en) | End-to-end neural network speech recognition model training method | |
CN102655003A (en) | Method for recognizing emotion points of Chinese pronunciation based on sound-track modulating signals MFCC (Mel Frequency Cepstrum Coefficient) | |
Nanavare et al. | Recognition of human emotions from speech processing | |
CN114495969A (en) | Voice recognition method integrating voice enhancement | |
CN110136746B (en) | Method for identifying mobile phone source in additive noise environment based on fusion features | |
Dua et al. | Optimizing integrated features for Hindi automatic speech recognition system | |
Biagetti et al. | Speaker identification in noisy conditions using short sequences of speech frames | |
Mu et al. | Voice activity detection optimized by adaptive attention span transformer | |
CN106297769B (en) | A kind of distinctive feature extracting method applied to languages identification | |
Wu et al. | A Characteristic of Speaker's Audio in the Model Space Based on Adaptive Frequency Scaling |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20190705 |